Responsive image
博碩士論文 etd-0706100-132843 詳細資訊
Title page for etd-0706100-132843
論文名稱
Title
一種在共享記憶體多處理器系統中之執行時期迴圈平行化技術
A Run-Time Loop Parallelization Technique on Shared-Memory Multiprocessor Systems
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
49
研究生
Author
指導教授
Advisor
召集委員
Convenor

口試委員
Advisory Committee
口試日期
Date of Exam
2000-06-24
繳交日期
Date of Submission
2000-07-06
關鍵字
Keywords
多處理器系統、平行編譯器、波前排程、執行時期平行法
Run-time parallelization, Parallelizing compiler, Multiprocessor system, Wavefront scheduling
統計
Statistics
本論文已被瀏覽 5720 次,被下載 1452
The thesis/dissertation has been browsed 5720 times, has been downloaded 1452 times.
中文摘要
目前的科學計算必須要仰賴高效能的計算能力,而多處理器系統(Multiprocessor System)的平行計算則為此高效能計算能力的主要來源。平行編譯器(Parallel Compiler)的功能在於將輸入的循序程式轉譯成在多處理器機器上執行的平行程式,其主要是藉著檢查程式迴圈(Loop)中的資料相依性(Data dependence)來決定可平行執行的迴圈。而在迴圈的陣列存取中若遇到非常態(Irregular)、非線性(Nonlinear)或是動態的存取樣式(Dynamic access pattern),平行編譯器則無法在編譯時期(Compiler-time)中決定迴圈的資料相依性,因此我們必須運用執行時期(Run-time)的演算法來找出這些迴圈中的資料相依,並且萃取出迴圈的平行度。在本論文中,我們提出了一個有效的執行時期平行化技術,針對上述形式的迴圈,排程出平行執行的順序。在這個新技術中,我們使用平行技術首先找出迴圈中每一個輪次(Iteration)的立即先行者輪次(Immediate predecessor iteration)資訊,然後我們參考這些資訊能夠快速而且有效的使用平行技術將輪次排入適當的波前(Wavefront),如此根據所排出之波前順序便可以將迴圈平行的執行。經過理論上的分析及實驗的結果,本論文所提出之新技術具有高增速(Speedup)及低負荷(Overhead)的優點。同時,因其具有高的可調性(Scalability)之特性,更適合使用於多處理器的系統當中。
Abstract
High performance computing power is important for the current advanced calculations of scientific applications. A multiprocessor system obtains its high performance from the fact that some computations can proceed in parallel. A parallelizing compiler can take a sequential program as input and automatically translate it into parallel form for the target multiprocessor system. But when loops with arrays of irregular, nonlinear or dynamic access patterns, no any current parallelizing compiler can determine whether data dependences exist at compile-time. Thus a run-time parallel algorithm will be utilized to determine dependences and extract the potential parallelism of loops. In this thesis, we propose an efficient run-time parallelization technique to compute a proper parallel execution schedule in those loops. This new method first detects immediate predecessor iterations of each loop iteration and constructs an immediate predecessor table, then efficiently schedules the whole loop iterations into wavefronts for parallel execution. According to either theoretical analysis or experimental results, our new run-time parallelization technique reveals high speedup and low processing overhead. Furthermore, this new technique is appropriate to implement on multiprocessor systems due to the characteristics of high scalability.
目次 Table of Contents
目錄
中文摘要 i
英文摘要 iii
目錄 v
圖目錄 vii
表目錄 ix
第1章 簡介 1
第1.1節 問題描述 1
第1.2節 執行時期迴圈排程技術探討 3
第1.3節 設計策略考量 6
第1.4節 論文架構 7
第2章 相關研究 8
第3章 一個新的執行時期迴圈平行化方法 16
第3.1節 動機 16
第3.2節 範例說明 21
第3.3節 偵測器階段(Inspector) 23
第3.4節 排程器階段(Scheduler) 31
第3.5節 執行器階段(Executor) 34
第4章 實驗結果 35
第4.1節 實驗環境 35
第4.2節 實驗數據 36
第4.3節 相關研究比較 43
第5章 結論 45
參考文獻 47

參考文獻 References
[1]Chen, D. K., Yew, P. C., and Torrellas, J., “An efficient algorithm for the run-time parallelization of doacross loops,” in Proc. 1994 Supercomputing, 518-527, 1994.
[2]Huang, T. C. and Hsu, P. H., “The SPNT test: a new technology for run-time speculative parallelization of loops,” in Lecture Notes in Computer Science No. 1366 (Z. Li, P. C. Yew, S. Chatterjee, C. H. Huang, P. Sadayappan, and D. Sehr, eds.), 177-191, Springer-Verlag, 1998.
[3]Huang, T. C., Hsu, P. H., and Sheng, T. N., “Efficient run-time scheduling for parallelizing partially parallel loops,” J. Information Science and Engineering 14(1), 255-264, 1998.
[4]Leung, S. T. and Zahorjan, J., “Improving the performance of run-time parallelization,” in Proc. 4th ACM SIGPLAN Symp. Principles and Practice of Parallel Programming, 83-91, 1993.
[5]Leung, S. T. and Zahorjan, J., “Extending the applicability and improving the performance of run-time parallelization,” Tech. Rep. 95-01-08, Dept. CSE, Univ. of Washington, 1995.
[6]Midkiff, S. and Padua, D., “Compiler algorithms for synchronization,” IEEE Trans. Comput. C-36, 12, 1485-1495, 1987.
[7]Polychronopoulos, C., “Compiler optimizations for enhancing parallelism and their impact on architecture design,” IEEE Trans. Comput. C-37, 8, 991-1004, 1988.
[8]Rauchwerger, L., Amato, N., and Padua, D., “A scalable method for run-time loop parallelization,” Int. J. Parallel Processing, 26(6), 537-576, 1995.
[9]Rauchwerger, L. and Padua, D., “The LRPD test: Speculative run-time parallelization of loops with privatization and reduction parallelization,” IEEE Trans. Parallel and Distributed Systems; 10(2), 160-180, 1999.
[10]Saltz, J., Mirchandaney, R., and Crowley, K., “Run-time parallelization and scheduling of loops,” IEEE Trans. Comput. 40(5), 603-612, 1991.
11.Xu, C. and Chaudhary, V., “Time-stamping algorithms for parallelization of loops at run-time,” in Proc. 11th Int. Parallel Processing Symp. 1997.
[12]Zhu, C. Q. and Yew, P. C., “A scheme to enforce data dependence on large multiprocessor systems,” IEEE Trans. Software Eng. 13(6), 726-739, 1987.
[13]Z. Shen, Z. Li, and P. C. Yew, “An empirical study on array subscripts and data dependencies,” in Proc. Of ICPP, pp. II-145 to II-152, 1989.
[14]Lawrence Rauchwerger, “Run-time parallelization: A framework for parallel computation,” A thesis for the degree of Doctor of Philosophy in Computer Science in the Graduate College of the Univ. of Illinois, 1995.
[15]H. Zima, Supercompilers for Parallel and Vector Computer. ACM press, New York, 1991.
[16]Michael Wolfe, High Performance Compilers for Parallel Computing. Addition-Wesley Publish, 1996.
電子全文 Fulltext
本電子全文僅授權使用者為學術研究之目的,進行個人非營利性質之檢索、閱讀、列印。請遵守中華民國著作權法之相關規定,切勿任意重製、散佈、改作、轉貼、播送,以免觸法。
論文使用權限 Thesis access permission:校內校外完全公開 unrestricted
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
紙本論文的公開資訊在102學年度以後相對較為完整。如果需要查詢101學年度以前的紙本論文公開資訊,請聯繫圖資處紙本論文服務櫃台。如有不便之處敬請見諒。
開放時間 available 已公開 available

QR Code