博碩士論文 etd-0718100-142354 詳細資訊
Title page for etd-0718100-142354
論文名稱
Title
資料相依分析及其在迴圈轉換之應用
Data Dependence Analysis and Its Applications on Loop Transformation
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
109
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2000-07-15
繳交日期
Date of Submission
2000-07-18
關鍵字
Keywords
平行編譯器、資料相依分析
Data Dependence Analysis, Parallel Compiler
統計
Statistics
本論文已被瀏覽 5840 次,被下載 1339 次
This thesis/dissertation has been browsed 5840 times and downloaded 1339 times.
中文摘要
摘要
近幾十年來,平行處理已成為計算機界裡重要的研究領域。根據統計處理機執行一個數值程式,大部份時間皆花費於迴圈執行上,因此,若能將循序程式透過平行編譯器(Parallel Compiler)做迴圈重組,使其能在向量處理機或平行機器上執行時能善用機器上之平行特性,將會有效改善程式之執行效率。平行編譯器將循序程式編譯成向量碼或平行碼時,最重要的工作是分析每個敘述(Statement)之間變數使用的相依性,依據分析結果,提供迴圈重組的訊息(information)。
資料相依分析是決定一個迴圈可否向量化或平行化的必須的步驟,它判斷在迴圈內是否會存取到相同之陣列元素或變數(即在相同的記憶體位址做存取)。近年來,平行編譯器的研究相當多,但是,資料相依測試一直是個難題,國外有許多著名的資料相依測試法,如Banerjee Test, λ test, Omega Test, I Test, Power Test, ... 等,這些較精確的測試法,常被使用在平行編譯器設計上之資料相依測試,本論文中,我們將提出一種新的精確資料相依測試法稱為整數邊限縮減測試法(IR test),此法是將Constraints中之每個變數的整數邊限反覆做投影縮減,當某一變數之有效區間縮減為空集合時,則此Constraints無整數解,而此變數間記憶體之存取也就互不相依。
整數邊限縮減測試法在平行編譯時只能使用在迴圈邊限是長方形、三角形、或邊限未可知的條件限制。在本論文,我們將提出一個方法稱為擴充-整數邊限縮減測試法(Extension-IR test),使一維陣列參考的相依測試範圍擴充到線性註標的變數含有方向向量條件。以擴充整數邊限縮減測試法的使用範圍。擴充-整數邊限縮減測試法能在有限的時間下有效的測出資料相依性。
無論如何,並非所有的資料相依皆可測出,例如,當陣列是非線性數學式或複雜到無法使用目前已經發展的測試法。在本論文,我們發展一種新的平行演算法,稱為非線性陣列註標測試法(NLA test),用以測試非線性陣列註標。此方法將含有迴圈傳送(loop-carried)相依的重述(iteration)找出,並排定在不同的波前(wavefronts)。無迴圈傳送相依的重述排定在相同的波前。使用這些波前的資料將迴圈轉為平行碼以利執行。
迴圈交換是一個迴圈向量化、平行化的重要迴圈重組的技巧。在本論文,我們發展一種技巧,對於判斷兩個不相鄰迴圈是否可以交換,有較佳的結果。我們也發展一種方法,對於判斷在不完全巢式迴圈中的兩層迴圈是否可以直接交換。我們也提出一種方法,用於判斷含有IF 和 GOTO 敘述(statement)的完全巢式迴圈中的兩層迴圈是否可以直接交換。

Abstract
Over the past several decades, parallel processing has become an important research subject in computer science. Statistics show that a numerical program spends most of its execution time in loops. If a parallelizing compiler restructures the loops of a conventional sequential program so that it exploits the characteristics of a vector or parallel machine, execution efficiency can be greatly improved. In a parallelizing compiler, data dependence analysis is very important because it provides the information needed for loop restructuring.
Data dependence analysis is necessary to determine whether a loop can be vectorized or parallelized. It analyzes whether the same array element or variable is accessed more than once in a loop (i.e., whether the same memory location is accessed more than once during loop execution). In recent years there has been considerable research on parallelizing compilers, but data dependence analysis remains a bottleneck. Many data dependence tests, such as the Banerjee Test, λ Test, Omega Test, I Test, and Power Test, have been used in the design of parallelizing compilers. In this thesis, we propose a novel exact data dependence test called the Interval Reduced test (IR test). This method reduces the integer bounds of each variable in the constraints by repeated projection. When the effective interval of a variable is reduced to the empty set, the constraints have no integer solution, and the memory accesses under these constraints are therefore independent.
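The idea of narrowing integer bounds until an interval becomes empty can be illustrated with a minimal sketch. It is a generic bounds-narrowing loop for a single linear dependence equation a1*x1 + ... + an*xn = c with constant bounds, not the IR test as defined in the thesis; the function name `narrow_bounds` and its parameters are assumptions made for this illustration.

```python
def narrow_bounds(coeffs, c, bounds):
    """Narrow integer bounds for sum(coeffs[i] * x[i]) == c.

    Illustrative only (hypothetical helper, not the thesis's IR test).
    Returns the narrowed list of (lo, hi) pairs, or None when some
    interval becomes empty, which means no integer solution exists.
    """
    bounds = [tuple(b) for b in bounds]
    changed = True
    while changed:
        changed = False
        for k, a_k in enumerate(coeffs):
            if a_k == 0:
                continue
            # Range of the sum of all other terms a_i * x_i.
            rest_min = rest_max = 0
            for i, a_i in enumerate(coeffs):
                if i == k:
                    continue
                lo, hi = bounds[i]
                rest_min += min(a_i * lo, a_i * hi)
                rest_max += max(a_i * lo, a_i * hi)
            # a_k * x_k = c - rest, so a_k * x_k lies in [t_min, t_max].
            t_min, t_max = c - rest_max, c - rest_min
            if a_k > 0:
                new_lo = -((-t_min) // a_k)  # ceil(t_min / a_k)
                new_hi = t_max // a_k        # floor(t_max / a_k)
            else:
                new_lo = -((-t_max) // a_k)  # ceil(t_max / a_k)
                new_hi = t_min // a_k        # floor(t_min / a_k)
            lo, hi = bounds[k]
            new_lo, new_hi = max(lo, new_lo), min(hi, new_hi)
            if new_lo > new_hi:
                return None  # empty interval: accesses are independent
            if (new_lo, new_hi) != (lo, hi):
                bounds[k] = (new_lo, new_hi)
                changed = True
    return bounds


# A(2*i) written and A(2*j + 1) read, 1 <= i, j <= 10: 2*i - 2*j = 1.
print(narrow_bounds([2, -2], 1, [(1, 10), (1, 10)]))   # → None
# A(i + 1) written and A(j) read, 1 <= i, j <= 10: i - j = -1.
print(narrow_bounds([1, -1], -1, [(1, 10), (1, 10)]))  # → [(1, 9), (2, 10)]
```

In this simplified form, an empty interval proves independence, but a non-empty result only means a dependence cannot be ruled out.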
The IR test is only suitable for loops whose bounds are rectangular, triangular, or unknown at compile time under certain limited conditions. To enhance the data dependence analysis capability of the IR test, we propose the Extension-IR test, which extends the dependence testing range for one-dimensional array references to linear subscripts with variable bounds under any given direction vector. The Extension-IR test runs efficiently in polynomial time.
When array subscripts are non-linear expressions, or are too complex to be analyzed by existing data dependence tests, we devise a new parallelization algorithm called the non-linear array subscripts test (NLA test) to handle them. Iterations subject to loop-carried dependences are scheduled into different wavefronts, while iterations with no loop-carried dependence between them are assigned to the same wavefront. Based on the wavefront information, the original loop is transformed into parallel code for execution at run time.
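Wavefront scheduling of this kind can be sketched as a generic run-time inspector. The sketch assumes a loop of the form A[writes[i]] = f(A[reads[i]]), where the subscript values are only known at run time; it is not the NLA test itself, and the names `assign_wavefronts` and `group_by_wavefront` are hypothetical.

```python
def assign_wavefronts(writes, reads):
    """Assign each iteration of A[writes[i]] = f(A[reads[i]]) a wavefront.

    Illustrative inspector (hypothetical helper, not the thesis's NLA test).
    Iterations in the same wavefront have no dependence on each other.
    """
    last_write_wf = {}   # element -> wavefront of its latest writer
    last_access_wf = {}  # element -> highest wavefront that read or wrote it
    wavefronts = []
    for w, r in zip(writes, reads):
        # Flow dependence: must run after the latest writer of element r.
        # Anti/output dependence: must run after every earlier access to w.
        wf = 1 + max(last_write_wf.get(r, -1), last_access_wf.get(w, -1))
        wavefronts.append(wf)
        last_write_wf[w] = wf
        last_access_wf[w] = max(last_access_wf.get(w, -1), wf)
        last_access_wf[r] = max(last_access_wf.get(r, -1), wf)
    return wavefronts


def group_by_wavefront(wavefronts):
    """Return lists of iteration numbers, one list per wavefront, in order."""
    groups = {}
    for iteration, wf in enumerate(wavefronts):
        groups.setdefault(wf, []).append(iteration)
    return [groups[wf] for wf in sorted(groups)]


# Iteration 1 reads what iteration 0 wrote; iteration 2 reads what 1 wrote.
wf = assign_wavefronts(writes=[0, 1, 2, 3], reads=[4, 0, 1, 5])
print(wf)                      # → [0, 1, 2, 0]
print(group_by_wavefront(wf))  # → [[0, 3], [1], [2]]
```

Each inner list could then be executed as a parallel step, with a barrier between consecutive wavefronts.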
Loop interchange is an important restructuring technique for supporting vectorization and parallelization. In this thesis, we propose a technique that efficiently determines whether two non-adjacent loops can be interchanged in a perfectly nested loop or in some imperfectly nested loops. We also present a method for determining whether two arbitrary levels of a perfectly nested loop containing IF and GOTO statements can be interchanged.
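For background, the classical legality check for interchanging two levels of a perfect loop nest can be sketched with direction vectors: the interchange is legal when every dependence direction vector stays lexicographically non-negative after its two entries are swapped. The sketch below shows that textbook check only, not the technique proposed in the thesis; the function names are hypothetical, and only the symbols '<', '=' and '>' are handled.

```python
def is_lexicographically_nonnegative(direction):
    """True if the first non-'=' entry is '<', or all entries are '='."""
    for d in direction:
        if d == '<':
            return True
        if d == '>':
            return False
    return True


def can_interchange(direction_vectors, p, q):
    """Classical check: may loop levels p and q (0-based) be interchanged?

    Illustrative only (hypothetical helper). Each direction vector is a
    sequence of '<', '=' or '>' with one entry per loop level.
    """
    for dv in direction_vectors:
        swapped = list(dv)
        swapped[p], swapped[q] = swapped[q], swapped[p]
        if not is_lexicographically_nonnegative(swapped):
            return False
    return True


print(can_interchange([('<', '>')], 0, 1))              # → False
print(can_interchange([('<', '='), ('=', '<')], 0, 1))  # → True
# Non-adjacent levels 0 and 2 of a three-level nest.
print(can_interchange([('<', '=', '>')], 0, 2))         # → False
```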

目次 Table of Contents
ACKNOWLEDGEMENTS
摘要
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
CHAPTER 1 INTRODUCTION
1.1 DATA DEPENDENCE
1.2 RELATED WORK
1.2.1 The GCD and Banerjee Test
1.2.2 The I Test
1.2.3 The Lambda Test
1.2.4 The Power Test
1.2.5 The Omega Test
1.3 THESIS ORGANIZATION
CHAPTER 2 THE IR TEST: A NEW DATA DEPENDENCE ANALYSIS FOR ARRAY REFERENCES
2.1 PROBLEM DEFINITION
2.2 IR TEST FOR TWO-VARIABLE DEPENDENCE EQUATION
2.2.1 Rectangular Bound
2.2.2 Triangular Bound
2.3 DEPENDENCE EQUATION IN MORE THAN TWO VARIABLES
2.4 EXPERIMENTAL RESULTS
CHAPTER 3 DEPENDENCE ANALYSIS WITH DIRECTION VECTOR FOR ARRAY REFERENCES
3.1 PROBLEM DEFINITION
3.2 THE EXTENSION-IR TEST
3.2.1 Bounds in Triangles
3.2.2 Diophantine Equations in Many Variables
3.3 COMPARISON AND EXPERIMENTAL RESULTS
CHAPTER 4 A LOOP PARALLELIZATION TECHNIQUE FOR NONLINEAR EXPRESSION
4.1 PROBLEM DEFINITION
4.1.1 Data dependence
4.2 ILLUSTRATION OF NLA TEST
4.3 THE NLA TEST
4.4 EXAMPLES
4.5 EXPERIMENTAL RESULTS
4.5.1 Evaluation for artificial synthetic loops
4.5.2 Evaluation for the real programs
4.6 THE BARRIER OVERHEADS OF NLA TEST
CHAPTER 5 ADVANCED LOOP INTERCHANGE
5.1 BACKGROUND
5.2 LOOP INTERCHANGE FOR NONADJACENT LOOPS
5.3 LOOP INTERCHANGE FOR IMPERFECTLY NESTED LOOPS
5.4 EXPERIMENTAL RESULTS
5.5 LOOP INTERCHANGE WITH LOOPS CONTAINING CONTROL DEPENDENCE
5.5.1 Loop Containing Only IF-statement
5.5.2 Loop Containing Both IF and GOTO Statements
5.6 DISCUSSIONS
CHAPTER 6 CONCLUSIONS
BIBLIOGRAPHY
VITA
PUBLICATION LIST


參考文獻 References
Allen, J. R., 1983. Dependence Analysis for Subscripted Variables and Its Application to Program Transformations. Ph.D. Dissertation, Department of Mathematical Sciences, Rice University, Houston TX.
Allen, J. R., Kennedy, K., Porterfield, C., and Warren, J., 1983. Conversion of control dependence to data dependence. In Conf. Rec. 10th ACM Symp. on Principles of Programming Languages (POPL), pp. 177-189.
Allen, J. R., and Kennedy, K., 1984. Automatic Loop Interchange. Proceedings of the SIGPLAN '84 Symposium on Compiler Construction, Montreal, Canada. pp. 233-246.
Allen, J. R., and Kennedy, K., 1987. Automatic Translation of FORTRAN programs to Vector form. ACM TOPLAS, Vol.9, pp. 491-542.
Banerjee, U., 1976. Data dependence in ordinary programs. M. S. thesis, University of Illinois at Urbana-Champaign.
Banerjee, U., 1988. Dependence analysis for supercomputing. Kluwer Academic Publishers, Boston, Mass.
Banerjee, U., 1993. Loop Transformations for Restructuring Compilers: The Foundations. Kluwer Academic Publishers, Boston, Mass.
Banerjee, U., 1994. Loop Parallelization. Kluwer Academic Publishers.
Banerjee, U., 1997. Dependence analysis. Kluwer Academic Publishers.
Berry, M., Chen, D., Koss, P., Kuck, D., Pointer, L., Lo, S., Pang, Y., Roloff, R., Sameh, A., Clementi, E., Chin, S., Schneider, D., Fox, G., Messina, P., Walker, D., Hsiung, C., Schwarzmeier, J., Lue, K., Orszag, K., Seidl, F., Johnson, O., Swanson, G., Goodrum, R., and Martin, J., 1989. The Perfect Club Benchmarks: Effective Performance Evaluation of Supercomputers. Int'l. Journal of Supercomputer Applications, 3(3): 5-40.
Blume, W., and Eigenmann, R., 1992. Performance analysis of parallelizing compilers on the perfect benchmarks program. IEEE Trans. On Parallel and Distributed Systems, Vol. 3, No. 6, pp.643-656.
Blume, W., and Eigenmann, R., 1994. The Range Test: A Dependence Test for Symbolic, Non-linear Expressions. IEEE Supercomputing. (Washington D. C.), pp. 528-537.
Blume, W., and Eigenmann, R., 1998. Nonlinear and Symbolic Data Dependence Testing. IEEE Transactions on Parallel and Distributed Systems, vol. 9, no. 12, pp. 1180-1194.
Chang, W. L., 1999. Improvement of Data Dependence Testing and Data Alignment Method in Parallel Compilers. Ph.D. Dissertation, Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan.
Chang, C. Y., 1995. Design and Implementation of Compilation Techniques for Vector Computers. Ph.D. Dissertation, Department of Computer Science and Information Engineering, National Central University, Taiwan.
Chen, D. K., Oesterreich, D. A., Torrellas, J., and Yew, P. C., 1994. An Efficient Algorithm for the Run-Time Parallelization of Doacross Loops. In Proc. the 1994 Supercomputing, pp. 518-527.
Chen, D. K., 1994. Compiler Optimizations for Parallel Loops with Fine-grained Synchronization. Ph.D. Dissertation, University of Illinois at Urbana-Champaign.
Goff, G., Kennedy, K., and Tseng, C. W., 1991. Practical dependence testing. In Proc. of the ACM SIGPLAN '91 Conf. on Prog. Lang. Design and Implementation. Toronto, Ontario, Canada, pp. 15-29.
Haghighat, M. R., 1995. Symbolic Analysis for Parallelizing Compilers. Kluwer Academic Publishers.
Hsu, P. H., 1999. Run-Time Parallelization Techniques for Irregular Scientific Computations on Shared-Memory Machines. Ph.D. Dissertation, Department of Electrical Engineering, National Sun Yat-Sen University, Taiwan.
Huang, T. C., Yang, C. M., and Yang, C. S., 1995. Loop Interchange with Loops Containing Control Dependence. Proceedings of The First Workshop on Compiler Techniques for High Performance Computing, pp. 30-41.
Huang, T. C., Yang, C. M., and Chang, C. Y., 1996. A New Approach to Data Dependence Analysis and its Implementation - The Interval Reduction Test. Proceedings of The Second Workshop on Compiler Techniques for High-Performance Computing, pp.107-113.
Huang, T. C., and Yang, C. M., 1996. An Exact Data Dependence Analysis for Array Reference: The IR Test. The 10th Annual International Conference on High Performance Computers, Ottawa, Canada, pp. 1-24.
Huang, T. C., and Hsu, P. H., 1997. The SPNT Test: A New Technology for Run-Time Speculative Parallelization of Loops, The 10th International Workshop on Languages and Compilers for Parallel Computing, pp. 177-191.
Huang, T. C., and Yang, C. M., 1998. Further Results for Improving Loop Interchange in Non-adjacent and Imperfectly Nested Loops. Proceedings of Third International Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS’98), Orlando, Florida, pp.93-99.
Huang, T. C., and Yang, C. M., 1999. Data Dependence Analysis for Array References. The Journal of Systems and Software, 1999 (publication).
Kennedy, K., and McKinley, K. S., 1990. Loop Distribution with Arbitrary Control Flow. Proceedings of Supercomputing '90, pp. 407-416, New York.
Kong, X., Klappholz, D., and Psarris, K., 1991. The I test: An improved dependence test for automatic parallelization and vectorization. IEEE Trans. on Parallel and Distributed Systems, Vol. 2, No. 3, pp. 342-349.
Kong, X., Klappholz, D., and Psarris, K., 1993. The Direction Vector I Test. IEEE Trans. on Parallel and Distributed Systems, Vol. 4, No. 11, pp. 1280-1290.
Kuck, D. J., Kuhn, R. H., Padua, D. A., Leasure, B. R., and Wolfe, M. M., 1981. Dependence graphs and compiler optimizations. In Conf. Rec. 8th ACM Symp. on Principles of Programming Languages (POPL), pp. 207-218.
Li, Z., Yew, P. C., and Zhu, C. Q., 1989. Data dependence analysis on multi-dimensional array reference. Int. Conference on Supercomputing, pp. 86-95.
Li, Z., Yew, P. C., and Zhu, C. Q., 1990. An efficient data dependence analysis for parallelizing compilers. IEEE Trans. on Parallel and Distributed Systems, Vol. 1, No. 1, pp. 26-34.
Maydan, D. E., Hennessy, J. L., and Lam, M. S., 1991. Efficient and exact data dependence analysis. In Proc. of the ACM SIGPLAN '91 Conf. on Prog. Lang. Design and Implementation. Toronto, Ontario, Canada, pp. 1-14.
Maydan, D. E., 1992. Accurate Analysis of Array References. Ph.D. Dissertation, Department of Computer Science, Stanford University.
Padua, D. A., 1996. Outline of a Roadmap for Compiler Technology. CSRD Rep. 1489, Univ. Illinois, Urbana-Champaign.
Patel, D., and Rauchwerger, L., 1998. Principles of Compiler Integration of Speculative Run-Time Parallelization. The 11th International Workshop on Languages and Compilers for Parallel Computing, pp. 330-351.
Petersen, P. M., and Padua, D. A., 1996. Static and Dynamic Evaluation of Data Dependence Analysis Techniques. IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 11, pp. 1121-1132.
Pugh, W., 1992. A practical algorithm for exact array dependence analysis. Communications of the ACM, Vol. 35, No. 8, pp. 102-114.
Rauchwerger, L., 1995. Run-Time Parallelization: A Framework for Parallel Computation. Ph.D. Dissertation, Univ. Illinois, Urbana-Champaign.
Sage++, 1993. Sage++: a class library for building FORTRAN 90 and C++ restructuring tools. Indiana university.
Saltz, J. H., Mirchandaney, R., and Crowley, K., 1991. Run-Time Parallelization and Scheduling of Loops. IEEE Transactions on Computers, vol. 40, no. 5, pp. 603-612.
Shen, Z., Li, Z., and Yew, P. C., 1992. An empirical study of fortran programs for parallelizing compilers. IEEE Trans. on Parallel and Distributed Systems, Vol. 1, No. 3, pp. 356-364.
Smith, B., 1976. Matrix Eigensystem Routines-Eispack Guide. Heidelberg: Springer.
Subhlok, J., and Kennedy, K., 1995. Integer programming for array subscript analysis. IEEE Trans. on Parallel and Distributed Systems, Vol. 6, No. 6, pp. 662-668.
Towle, R. A., 1976. Control and data dependence for Program Transformation. Ph.D. thesis, University of Illinois at Urbana-Champaign.
Triolet, R., Irigoin, J. F., and Feautrier, P., 1986. Direct parallelization of call statements. In Proc. of the ACM SIGPLAN Symp. Compiler Construction, Palo Alto, pp. 176-185.
Wolfe, M. J., 1982. Optimizing Supercompilers for Supercomputers. Ph.D. Dissertation, Technical Report 82-1009, Department of Computer Science, University of Illinois at Urbana-Champaign.
Wolfe, M. J., 1986. Advanced Loop Interchanging. Proceedings of the 1986 International Conference on Parallel Processing, St. Charles, Illinois, pp. 536-543.
Wolfe, M., and Tseng, C. W., 1992. The power test for data dependence. IEEE Trans. on Parallel and Distributed Systems, Vol. 3, No. 5, pp. 591-601.
Wolfe, M., 1996. High Performance Compilers for Parallel Computing. New York MA: Addison-Wesley.
Yang, C. S., Chang, C. C., Yang, C. M., Huang, T. C., and Huang, K. C., 1993. Exact and Efficient Advanced Loop Interchange. Microprocessing and Microprogramming, vol. 38, no. 1-5, pp.421-428.
Zima, H. P., and Chapman, B., 1991. Supercompilers for parallel and Vector computers. New York MA: Addison-Wesley.

電子全文 Fulltext
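This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid violating the law.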
論文使用權限 Thesis access permission:校內校外完全公開 unrestricted
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
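Public-access information for printed theses is relatively complete from academic year 102 onward. To inquire about public-access information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.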
開放時間 available 已公開 available
