博碩士論文 etd-0825106-113139 詳細資訊
Title page for etd-0825106-113139
論文名稱
Title
具指令壓縮機制之VLIW結構之順暢指令流緩衝器設計
Improving the Fetching Performance of Instruction Stream Buffer for VLIW Architectures with Compressed Instructions
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
73
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2006-07-17
繳交日期
Date of Submission
2006-08-25
關鍵字
Keywords
指令流緩衝器、具指令壓縮機制之VLIW 結構
instruction stream buffer, zero overhead looping
統計
Statistics
本論文已被瀏覽 5664 次,被下載 1752
This thesis has been viewed 5664 times and downloaded 1752 times.
中文摘要
在VLIW結構中,因為執行單元的硬體限制與指令間之相依性,需在指令碼中填入大量NOP指令,造成指令碼的高度膨脹,使得儲存指令之記憶體空間嚴重浪費,現今VLIW結構處理器中,皆有指令碼壓縮機制,因此指令碼解壓縮成為必然面對之課題。在DVB-T DSP(數位視訊廣播系統接收器之數位訊號處理器)所規劃的向量化指令下,即可將離散向量視為連續向量執行,此機制建立在零負擔之迴圈模式的軟體管道(soft-pipeline)程式架構,為有效率的執行,維持指令流的順暢是首要條件。除此,分支指令將造成指令流之不連續亦可能破壞指令流的順暢。然指令碼壓縮機制將造成長指令的排列不規則,加重了維持指令流順暢的緩衝機制設計的困難度。本論文主要在實現具順暢指令流緩衝器的設計,有效的保留短程式區段於緩衝器,克服了零負擔迴圈及短跳躍分支指令的影響,達到連續提供指令的抓取需求,並在FIR、FFT、DCT的模擬驗證中得到良好效能提升的結論。
Abstract
In VLIW architectures, hardware limits on the execution units and dependences between instructions force many NOP instructions into the code. This inflates code size and wastes instruction memory, so current VLIW processors all include an instruction compression mechanism, and instruction decompression becomes an unavoidable design issue. With the vectorized instructions of the DVB-T (Digital Video Broadcasting - Terrestrial) DSP, discrete vectors can be processed as one continuous vector. This mechanism is built on software-pipelined code running in zero-overhead looping mode, so keeping the instruction stream flowing smoothly is the first requirement for efficient execution. Branch instructions also make the instruction stream discontinuous and can disrupt it. Instruction compression, in turn, leaves long instructions irregularly arranged within the fetch packet, which makes a buffer that keeps the stream smooth harder to design. This thesis implements an instruction stream buffer that retains short program segments, such as repeat blocks, in the buffer. It overcomes the effects of zero-overhead loops and short-jump branch instructions and supplies instructions to the fetch stage continuously. Simulations of FIR, FFT, and DCT show improved performance.
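As a rough illustration of why retaining a short repeat block in the buffer helps, the sketch below counts how many fetch packets must come from instruction memory when a loop body is replayed. It is a toy model and not the design described in the thesis: the FIFO replacement policy, the one-address-per-packet layout, and the names `count_memory_fetches` and `capacity` are assumptions made for illustration only.

```python
from collections import OrderedDict


def count_memory_fetches(trace, capacity):
    """Count fetches that must go to instruction memory.

    trace    -- sequence of fetch-packet addresses in execution order
    capacity -- number of packets the buffer can retain (0 disables it)

    Hypothetical model: the buffer keeps recently fetched packets and
    overwrites the oldest one when full (FIFO, like a circular buffer).
    """
    buffer = OrderedDict()
    misses = 0
    for addr in trace:
        if addr in buffer:
            continue  # packet is still buffered: no memory access needed
        misses += 1
        if capacity > 0:
            if len(buffer) >= capacity:
                buffer.popitem(last=False)  # overwrite the oldest entry
            buffer[addr] = True
    return misses


# Two setup packets followed by a 4-packet loop body executed 10 times.
loop_trace = [0, 1] + [2, 3, 4, 5] * 10

no_buffer = count_memory_fetches(loop_trace, capacity=0)
with_buffer = count_memory_fetches(loop_trace, capacity=8)
too_small = count_memory_fetches(loop_trace, capacity=2)
```

In this model, memory fetches drop only when the whole loop body fits in the buffer. A buffer smaller than the loop body evicts each packet before it is reused, which matches the abstract's emphasis on short program segments.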
目次 Table of Contents
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Research Motivation and Objectives
1.1 Instruction Stream Interruption Caused by VLIW Instruction Compression
1.2 Fetch Cycle Loss Caused by Discontinuous Program Behavior
1.3 Overview of the Chapters
Chapter 2 Related Work
2.1 VLIW
2.2 DVBT-DSP Instruction Set
2.3 DVBT-DSP and Addressing Mode
2.3.1 DVBT-DSP Hardware Architecture
2.3.2 Register File
2.3.3 Addressing Mode
2.4 Software Pipeline
Chapter 3 Smooth Instruction Stream Buffer
3.1 Buffer Architecture
3.1.1 Circular Buffer and Pre-Fetch Buffer
3.1.2 Control Signal Circuit
3.1.3 Long Instruction Mask and Decompression Circuit
3.2 Mechanism for Resolving Fetch Interruptions with Compressed Instruction Code
3.3 Improving the Instruction Fetch Mechanism in Repeat Mode
3.3.1 Buffer Improvements for the Repeat Instruction Stream
3.3.2 Repeat Instruction
3.3.3 Repeat Behavior
3.3.4 Repeat Control Signal Circuit
3.4 Improving the Branch Instruction Fetch Mechanism
3.4.1 Branch Instruction
3.4.2 Branch Control Signal Circuit
3.4.3 Handling and Fetch Cycle Loss for Branches That Jump Inside the Instruction Stream Buffer
3.4.4 Handling and Fetch Cycle Loss for Branches That Jump Outside the Instruction Stream Buffer
Chapter 4 Digital Signal Processing Applications on the Smooth Instruction Stream Buffer
4.1 FIR
4.2 FFT
4.3 Fast DCT
Chapter 5 Verification and Simulation Analysis
5.1 FIR Performance Simulation Analysis
5.2 FFT Performance Simulation Analysis
5.3 DCT Performance Simulation Analysis
5.4 Hardware Synthesis Results
Chapter 6 Future Work and Conclusions
References
參考文獻 References
[1]. Nat Seshan, “High VelociTI processing [Texas Instruments VLIW DSP Architecture],” IEEE Signal Processing Magazine, Vol. 15, No. 2, March 1998, pp. 86-101
[2]. Thomas M. Conte, Sanjeev Banerjia, Sergei Y. Larin, Kishore N. Menezes, and Sumedh W. Sathaye, “Instruction fetch mechanisms for VLIW architectures with compressed encodings,” Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture, 1996, pp. 201-211
[3]. John L. Hennessy and David A. Patterson, “Computer Architecture: A Quantitative Approach,” 3rd ed., Morgan Kaufmann Publishers, 2003
[4]. David A. Patterson and John L. Hennessy, “Computer Organization & Design,” Dartmouth Publishers, 1998
[5]. Deependra Talla, Lizy K. John, Viktor Lapinskii, and Brian L. Evans, “Evaluating signal processing and multimedia applications on SIMD, VLIW and superscalar architectures,” International Conference on Computer Design, 2000, pp. 163-172
[6]. S.-M. Moon and S. Park, “Performance analysis of VLIW compilation techniques,” IEE Proceedings, Computers and Digital Techniques, Vol. 147, No. 2, March 2000, pp. 117-123
[7]. Chun-Hsien Lee, “Implementation of Vectorization-Based VLIW DSP with Compact Instruction,” Master thesis, National Sun Yat-Sen University, 2004
[8]. G. Bi and E. Jones, “A pipelined FFT processor for word-sequential data,” IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 37, December 1989, pp. 1982-1985
[9]. Yutai Ma, “An effective memory addressing scheme for FFT processors,” IEEE Transactions on Signal Processing, Vol. 47, No. 3, March 1999, pp. 907-911
[10]. B. Gold and T. Bially, “Parallelism in fast Fourier transform hardware,” IEEE Transactions on Audio Electroacoustics, Vol. 21, No. 1, Feb 1973, pp. 5-16
[11]. M. Lee, P. Tirumalai, and T.-F. Ngai, “Software pipelining and superblock scheduling: compilation techniques for VLIW machines,” Proceedings of the Twenty-Sixth Hawaii International Conference on System Sciences, January 1993, pp. 202-213
[12]. Lars Wanhammar, “DSP Integrated Circuits,” Academic Press, 1999
[13]. Nat Skodras, “Fast Discrete Cosine Transform Pruning,” IEEE Transactions on Signal Processing, Vol. 42, No. 7, July 1994
[14]. Texas Instruments, “TMS320C6000 CPU and Instruction Set Reference Guide,” http://www.ti.com/sc/docs/psheets/rel_dsp.htm
電子全文 Fulltext
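This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for academic research purposes. Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid breaking the law.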
論文使用權限 Thesis access permission:校內校外完全公開 unrestricted
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
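Public availability information for printed theses is relatively complete from academic year 102 onward. To look up availability information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.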
開放時間 Available: 已公開 available
