博碩士論文 etd-0629112-150235 詳細資訊
Title page for etd-0629112-150235
論文名稱
Title
在向量架構上使用同質暫存簇的高效能暫存分配器
A High Performance Register Allocator for Vector Architectures with a Unified Register-Set
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
43
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2012-02-14
繳交日期
Date of Submission
2012-06-29
關鍵字
Keywords
編譯器最佳化、單一暫存簇、暫存器分配、指令排程、新式GPU、向量架構
instruction scheduling, register allocator, compiler optimization, unified register set, vector architecture, novel Graphics Processing Unit
統計
Statistics
本論文已被瀏覽 5649 次,被下載 759 次
The thesis/dissertation has been browsed 5649 times, has been downloaded 759 times.
中文摘要
本論文描述的編譯器最佳化目標是一個向量基底的單一屬性暫存簇。此最佳化結合了暫存器分配與指令排程。它會在程式執行時去檢驗有純量變數出現的地方,並對其盡可能的最佳化。我們的目標,是將具有相似運算的指令給打包起來,並讓我們的最佳化分配器對它進行優化。
儘管其他研究者也有在進行相似的打包方法,但它們大部分研究都被侷限在硬體上面,硬體通常都會將大量時間耗費在純量暫存器與向量暫存器間的資料搬移。而本篇論文與他們不同的是,我們針對新式硬體架構,不需要在不同屬性的暫存器間做資料搬移,更可以利用硬體特性讓一些純量變數並行運算。因此,我們才能夠取得顯著的加速。
最後,我們所考慮的硬體架構,是正在中山大學開發的GPU嵌入式系統。而此GPU架構裡只有單一暫存簇,並可使用此單一暫存簇來對整數、浮點數、向量,來進行儲存及計算。
Abstract
This thesis describes a compiler optimization targeted at machines with unified, vector-based register sets. This optimization combines register allocation and instruction scheduling. It examines places where the code performs computations on scalar variables. The goal is to identify instances where the same operation is performed on different operands. For example, a program might calculate “base+offset” and then calculate “i+j”. Even though these computations are unrelated, they use the same operator; if “base” and “i” are packed into one vector register, while “offset” and “j” are packed into another, then the two computations can be performed simultaneously by a single parallel vector addition. This reduces the execution time of the compiled code.
Although other researchers have considered similar packing methods, their work has been limited by the hardware that they studied. Such hardware usually imposes high costs for moving data between scalar and vector register banks. The present thesis, however, considers a novel hardware architecture that imposes no such costs. As a consequence, we are able to obtain significant speedups.
The architecture that we consider is a Graphics Processing Unit (GPU) for embedded systems that is under development at National Sun Yat-sen University. This GPU has a single register set that stores and operates on integers, floating-point values, and vectors.
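The packing idea in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the thesis's allocator: the function names `pack_scalar_ops` and `run_packed`, the tuple layout, and the four-lane vector width are all hypothetical, and the scalar operations are assumed to be mutually independent. It groups scalar operations that share an operator and evaluates each group lane by lane, as a single vector instruction would.

```python
import operator
from collections import defaultdict

LANES = 4  # assumed vector width; the abstract does not state one
OPERATORS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def pack_scalar_ops(ops, lanes=LANES):
    """Group independent scalar ops that share an operator.

    ops is a list of (dest, symbol, left, right) tuples. The result is a
    list of (symbol, dests, lefts, rights) tuples, each standing for one
    vector operation over at most `lanes` lanes.
    """
    by_symbol = defaultdict(list)
    for op in ops:
        by_symbol[op[1]].append(op)

    packed = []
    for symbol, group in by_symbol.items():
        for start in range(0, len(group), lanes):
            chunk = group[start:start + lanes]
            packed.append((
                symbol,
                tuple(o[0] for o in chunk),  # destination lanes
                tuple(o[2] for o in chunk),  # left operand register
                tuple(o[3] for o in chunk),  # right operand register
            ))
    return packed


def run_packed(packed, env):
    """Evaluate each packed op lane-wise, reading all inputs before writing."""
    for symbol, dests, lefts, rights in packed:
        fn = OPERATORS[symbol]
        results = [fn(env[l], env[r]) for l, r in zip(lefts, rights)]
        for dest, value in zip(dests, results):
            env[dest] = value
    return env


# The abstract's example: base+offset and i+j share the "+" operator.
scalar_ops = [("addr", "+", "base", "offset"), ("k", "+", "i", "j")]
vector_ops = pack_scalar_ops(scalar_ops)
values = run_packed(vector_ops, {"base": 100, "offset": 8, "i": 2, "j": 3})
```

Here the two scalar additions collapse into one packed operation, with “base” and “i” in one simulated register and “offset” and “j” in the other. The sketch ignores dependences, register pressure, and scheduling, which a real allocator would have to account for.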
目次 Table of Contents
論文審定書 i
摘要 ii
Abstract iii
Index v
1. Introduction 1
2. Basic Concepts 8
2.1 Concepts of Compiler 8
2.1.1 SSA-Form 8
2.1.2 Live Variables 9
2.1.3 Trace Scheduling 9
2.2 How Novel Features in Our GPU Affect the Register Allocator 11
3. Related Work 14
4. Implementation 18
4.1 Machine Code Representation Rewriting 20
4.2 Scheduling and Register Allocation Algorithm 23
5. Experimental Results 29
7. Reference 33
8. Appendix 34

參考文獻 References
[1] K. C. Lu and S. Haga. “Compiler Development to Support OpenGL 2.0 ES on a Novel 3D Graphics Processor,” Masters Thesis: National Sun Yat-Sen University, August 2010.
[2] K. A. Huang and S. Haga. “Compiler Support for Vector Processing on OpenGL ES 2.0 Programs,” Masters Thesis: National Sun Yat-Sen University, August 2010.
[3] S. C. Tseng and S. Haga. “Compiler/Hardware Codesign and Memory Management for a Novel 3D Graphics Processor,” Masters Thesis: National Sun Yat-Sen University, August 2010.
[4] The LLVM Compiler. Website: http://llvm.org
[5] Donald E. Knuth, “The Art of Assembly Language, Volume 4,” Addison-Wesley, 2006.
[6] A. Aho, M. Lam, R. Sethi and J. Ullman, “Compilers: Principles, Techniques and Tools (2nd Ed.),” Pearson Addison Wesley, Hong Kong, 2006.
[7] C. Lattner. “LLVM for OpenGL and other stuff.” LLVM Designers Conference, May 2007
[8] N. Sreraman and R. Govindarajan. “A Vectorizing Compiler for Multimedia Extensions,” The International Journal of Parallel Programming. Vol. 28, No 4, 2000.
[9] H. Chang and W. Sung, “Efficient Vectorization of SIMD Programs with Non-aligned and Irregular Data Access Hardware,” in CASES ’08: Proceedings of the 2008 International Conference on Compilers, Architectures and Synthesis for Embedded Systems, 2008.
電子全文 Fulltext
本電子全文僅授權使用者為學術研究之目的,進行個人非營利性質之檢索、閱讀、列印。請遵守中華民國著作權法之相關規定,切勿任意重製、散佈、改作、轉貼、播送,以免觸法。
論文使用權限 Thesis access permission:自定論文開放時間 user define
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
紙本論文的公開資訊在102學年度以後相對較為完整。如果需要查詢101學年度以前的紙本論文公開資訊,請聯繫圖資處紙本論文服務櫃台。如有不便之處敬請見諒。
開放時間 available 已公開 available
