博碩士論文 etd-0625103-115444 詳細資訊
Title page for etd-0625103-115444
論文名稱
Title
VLIW DSP架構之增進指令並行度之向量化運算機制
Improving ILP with the Vectorized Computing Mechanism in VLIW DSP Architecture
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
84
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2003-06-10
繳交日期
Date of Submission
2003-06-25
關鍵字
Keywords
指令並行度、向量運算
VLIW, vector computing, instruction level parallelism
統計
Statistics
本論文已被瀏覽 5702 次,被下載 3317
The thesis/dissertation has been browsed 5702 times, has been downloaded 3317 times.
中文摘要
現今的DSP處理器設計常利用VLIW架構提高指令執行之並行度,以達到提高效能的目的。提高指令並行度的瓶頸有二,一是硬體資源是否足以同時處理所有的平行指令,二是由於指令間的相依關係所以無法平行處理;本論文針對FFT演算法設計了一個VLIW架構之運算核心DVBTDSP,並利用軟體排程(Software pipelining)的方式將指令迴圈重新排程以達到在處理FFT之蝴蝶運算時具有最佳之指令並行度,另外為了能提供順暢的資料流,本論文針對FFT向量運算之特性,改良傳統DSP的餘數定址(modulo addressing)之運算機制,使得原本離散的向量能被視為一新的連續向量,避免了因向量中斷所造成的管線延遲,根據模擬分析的結果,此架構在處理FFT運算時跟C6200相比只需要其1/2的運算時間,在做其他演算法如FIR,IIR,DCT也有不亞於C6200的效能。
Abstract
To improve performance in real-time applications, current digital signal processors use VLIW architectures to increase instruction-level parallelism (ILP). Two factors limit ILP: whether the hardware has enough resources to execute all parallel instructions at once, and the dependences between instructions. This thesis designs DVBTDSP, a VLIW processing core tailored to the FFT algorithm, and uses software pipelining to reschedule loops so that FFT butterfly operations run at the highest degree of ILP. To keep data flowing smoothly through the pipeline, we also modify the conventional DSP modulo addressing mechanism to suit FFT vector operations, so that originally discrete vectors can be treated as one continuous vector, which avoids the pipeline stalls caused by breaks between vectors. Simulation results show that DVBTDSP needs only half the execution time of the C6200 for FFT processing and performs no worse than the C6200 on other algorithms such as FIR, IIR, and DCT.
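For readers unfamiliar with the two primitives the abstract names, the following is a minimal sketch of conventional modulo (circular) addressing and a radix-2 FFT butterfly. It is not the DVBTDSP mechanism described in the thesis; the function names `modulo_next`, `circular_walk`, and `butterfly` are hypothetical and chosen only for illustration.

```python
def modulo_next(addr, step, base, length):
    """Advance addr by step, wrapping inside the buffer [base, base + length).

    This is the conventional DSP modulo addressing update (an assumption
    about the baseline, not the thesis's improved mechanism).
    """
    return base + (addr - base + step) % length


def circular_walk(start, step, base, length, count):
    """Return the first `count` addresses visited under modulo addressing."""
    addrs = []
    addr = start
    for _ in range(count):
        addrs.append(addr)
        addr = modulo_next(addr, step, base, length)
    return addrs


def butterfly(a, b, w):
    """Radix-2 butterfly: combine inputs a, b with twiddle factor w."""
    t = w * b
    return a + t, a - t


if __name__ == "__main__":
    # Walk an 8-word buffer with stride 3, starting at address 6.
    print(circular_walk(6, 3, 0, 8, 4))
    # One butterfly with a trivial twiddle factor of 1.
    print(butterfly(1 + 0j, 1 + 0j, 1))
```

In a stride-based walk like this, the address stream wraps at the buffer boundary without a branch, which is why modulo addressing is a common fit for the regular but non-contiguous access patterns of FFT stages.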
目次 Table of Contents
摘要 i
ABSTRACT ii
Contents iii
List of Figures v
List of Tables vii
Chapter 1 Introduction 1
1.1 The Development of DSP and Vector Processors 3
1.2 Standard DSP Architecture 4
1.3 Motivation and Goal 6
Chapter 2 Survey 8
2.1 VLIW 8
2.2 Basic Compiler ILP 9
2.3 Vector Processors 13
2.4 Current DSP processor with vector computing (VFP, C3x, C6x) 14
Chapter 3 Design of an Instruction Pipeline Decoder 20
3.1 The Characteristics of ARM Instruction Set 21
3.1.1 Instruction types 21
3.1.2 Multi-cycle instruction 22
3.1.3 Instruction stream 22
3.1.4 Forwarding controller 25
3.2 A Single Instruction Pipeline Decoder Design 26
3.2.1 Architecture 26
3.2.2 Resolution unit 28
3.3 Decoder design in VLIW DSP architecture 29
Chapter 4 Vectorized computing algorithm in VLIW architecture 31
4.1 FFT algorithm with DSP processing 31
4.2 Vectorized code scheduling 36
4.3 Circular Index Register setting instructions 39
4.4 Conditional load instruction 40
4.5 Modulo addressing mode 41
4.6 The Architecture of DVBTDSP 46
4.7 Super Element Architecture 48
4.7.1 ALUL 50
4.7.2 ALUR & MUL 51
4.7.3 Load 53
4.7.4 Store 55
4.7.5 Register File 56
Chapter 5 Verification and Analysis result 59
5.1 Verification environment 61
5.2 Synthesis results 62
5.3 Analysis results 65
Chapter 6 Conclusions and Future Work 72
Appendix 74
Reference 82
參考文獻 References
[1] Sunghyun Jee; Palaniappan, K, ”Dynamically scheduling VLIW instructions with dependency information” Interaction between Compilers and Computer Architectures, 2002, pp15-23
[2] J. W. Cooley and J. W. Tukey, “An Algorithm for the Machine Calculation of Complex Fourier Series”, Mathematics of Computation, 19, April 1965, pp. 297-301
[3] Lars Wanhammar, DSP Integrated Circuits, Academic Press, 1999.
[4] Glasser L.A and Dobberpuhl D.W, “The Design and Analysis of VLSI Circuits”, Addison-Wesley, Reading, MA, 1985
[5] Gene Frantz, “Digital Signal Processor Trends”, IEEE Micro, Vol. 20, No. 6, November-December 2000, pp. 52-59
[6] Wolfe, A.; Fritts, J.; Dutta, S.; Fernandes, E.S.T.,” Datapath design for a VLIW video signal processor”, High-Performance Computer Architecture, 1997., Third International Symposium on , pp24 -35, 1-5 Feb 1997
[7] Sunghyun Jee; Palaniappan, K. “Compiler processor tradeoffs for DISVLIW architecture”, International Symposium on Parallel Architectures, Algorithms and Networks, pp: 175 -180. 2002
[8] J. Fritts, Architecture and Compiler Design Issues in Programmable Media Processors, Ph.D. Thesis, 2000.
[9] J. L. Hennessy and D. A. Patterson, “Computer Architecture: A Quantitative Approach”, Third Edition, Morgan Kaufmann Publishers, 2003
[10] Calahan, D.; Ames, W., ”Vector processors: Models and applications”, Circuits and Systems, IEEE Transactions on, pp715-726, Volume: 26 Issue: 9 , Sep 1979
[11] Kai Hwang, Faye A. Briggs, “Computer Architecture and Parallel Processing”,McGraw-Hill Book Company,1984
[12] Texas Instruments, ”TMS320C3X User's Guide”, http://www.ti.com/sc/docs/psheets/rel_dsp.htm
[13] Texas Instruments, “TMS320C6000 CPU and Instruction Set Reference Guide”, http://www.ti.com/sc/docs/psheets/rel_dsp.htm
[14] J. Eyre, J. Bier, "DSP Processors Hit the Mainstream", Computer Magazine, pp. 51-59, August 1998.
[15] ARM, “VFP9-S Vector Floating-point Coprocessor Technical Reference Manual”, http://www.arm.com
[16] ARM,”Arm Architecture Reference Manual”, http://www.arm.com
[17] Simon Segars, ”The ARM9 Family – High performance Microprocessors for Embedded Applications” Computer Design: VLSI in Computers and Processors, 1998. ICCD '98. Proceedings. International Conference, pp:230-235,1998
[18] Steve Furber, “ARM System-on-Chip Architecture”, Addison Wesley Longman Inc, 1996.
[19] Findlay, P.A.; Trainis, S.A.; Steven, G.B.; Adams, R.G.,” HARP: a VLIW RISC processor”, CompEuro '91. 'Advanced Computer Technology, Reliable Systems and Applications'. 5th Annual European Computer Conference. Proceedings. , pp368 -372, 13-16 May 1991
[20] Lee, M.; Tirumalai, P.; Ngai, T.-F., “Software pipelining and superblock scheduling: compilation techniques for VLIW machines,” Proceeding of the Twenty-Sixth Hawaii International Conference on System Sciences, pp 202 -213, 5-8 Jan 1993.
[21] Bogong Su; Jian Wang; Zhizhong Tang; Wei Zhao; Yimin Wu, “A Software Pipelining Based VLIW Architecture and Optimizing Compiler”, Microprogramming and Microarchitecture. Micro 23. Proceedings of the 23rd Annual Workshop and Symposium, Workshop on, pp17-27, 27-29 Nov 1990
電子全文 Fulltext
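This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China: do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid breaking the law.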
論文使用權限 Thesis access permission:校內校外完全公開 unrestricted
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
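Public-access information for printed theses is relatively complete from academic year 102 onward. To look up public-access information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.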
開放時間 available 已公開 available
