國立中山大學,National Sun Yat-sen University,學位論文,thesis/dissertation,具指令壓縮機制之VLIW結構之順暢指令流緩衝器設計,Improving the Fetching Performance of Instruction Stream Buffer for VLIW Architectures with Compressed Instructions

論文名稱 Title	具指令壓縮機制之VLIW結構之順暢指令流緩衝器設計 Improving the Fetching Performance of Instruction Stream Buffer for VLIW Architectures with Compressed Instructions
系所名稱 Department	電機工程學系 Department of Electrical Engineering
畢業學年期 Year, semester	94 學年度第 2 學期 The spring semester of Academic Year 94	語文別 Language	中文 Chinese
學位類別 Degree	碩士 Master	頁數 Number of pages	73
研究生 Author	楊凱名 Kai-Ming Yang
指導教授 Advisor	邱日清 Jih-Ching Chiu
召集委員 Convenor	鍾崇斌 Chung-Ping Chung
口試委員 Advisory Committee	蕭勝夫, 李錫智 Shen-Fu Hsiao; Shie-Jue Lee
口試日期 Date of Exam	2006-07-17	繳交日期 Date of Submission	2006-08-25
關鍵字 Keywords	指令流緩衝器、具指令壓縮機制之VLIW 結構 instruction stream buffer, zero overhead looping
統計 Statistics	本論文已被瀏覽 5664 次，被下載 1752 次 The thesis/dissertation has been browsed 5664 times, has been downloaded 1752 times.

中文摘要
在VLIW結構中，因為執行單元的硬體限制與指令間之相依性，需在指令碼中填入大量NOP指令，造成指令碼的高度膨脹，使得儲存指令之記憶體空間嚴重浪費，現今VLIW結構處理器中，皆有指令碼壓縮機制，因此指令碼解壓縮成為必然面對之課題。在DVB-T DSP(數位視訊廣播系統接收器之數位訊號處理器)所規劃的向量化指令下，即可將離散向量視為連續向量執行，此機制建立在零負擔之迴圈模式的軟體管道(soft-pipeline)程式架構，為有效率的執行，維持指令流的順暢是首要條件。除此，分支指令將造成指令流之不連續亦可能破壞指令流的順暢。然指令碼壓縮機制將造成長指令的排列不規則，加重了維持指令流順暢的緩衝機制設計的困難度。本論文主要在實現具順暢指令流緩衝器的設計，有效的保留短程式區段於緩衝器，克服了零負擔迴圈及短跳躍分支指令的影響，達到連續提供指令的抓取需求，並在FIR、FFT、DCT的模擬驗證中得到良好效能提升的結論。
Abstract
In VLIW architectures, hardware constraints on the execution units (structural hazards) and data dependences between instructions force programs to be padded with large numbers of NOP instructions. The resulting code bloat wastes program memory, so an instruction compression mechanism is a must for VLIW processors, and decompressing the instruction stream becomes an unavoidable problem. The vectorized instructions of the DVB-T (Digital Video Broadcasting - Terrestrial) DSP allow discrete vectors to be processed as one continuous vector. This mechanism is built on software-pipelined programs running in zero-overhead looping mode, so keeping the instruction stream flowing smoothly is the first requirement for efficient execution. Branch instructions also make the instruction stream discontinuous and can reduce the efficiency of the instruction fetcher. In addition, instruction compression makes the long instructions within a fetch packet irregular in length, which makes a buffering mechanism that keeps the stream smooth harder to design. This thesis implements an improved instruction stream buffer that keeps short program segments, such as repeat blocks, in the buffer. The mechanism overcomes the effects of zero-overhead looping and short-jump branch instructions, so that instructions can be supplied to the fetcher continuously. Simulation results for FIR, FFT, and DCT show a good performance improvement.
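
The buffering idea described in the abstract can be illustrated with a toy model. The sketch below is not the hardware design from the thesis: the names `StreamBuffer` and `run_loop`, the first-in-first-out replacement policy, and the buffer and loop sizes are all assumptions made for illustration. It only shows why a repeat block that fits entirely in the buffer needs no further program-memory fetches after its first iteration, while a block larger than the buffer keeps going back to memory.

```python
from collections import deque


class StreamBuffer:
    """Toy model (assumed, not the thesis design): a FIFO of recently fetched addresses."""

    def __init__(self, capacity):
        # deque(maxlen=...) silently drops the oldest entry when it is full.
        self.resident = deque(maxlen=capacity)
        self.memory_fetches = 0

    def fetch(self, addr):
        """Return True on a buffer hit, False when program memory must be read."""
        if addr in self.resident:
            return True
        self.memory_fetches += 1
        self.resident.append(addr)
        return False


def run_loop(buffer, start, length, iterations):
    """Fetch a loop body of `length` instructions `iterations` times."""
    for _ in range(iterations):
        for addr in range(start, start + length):
            buffer.fetch(addr)
    return buffer.memory_fetches


# Loop body of 4 instructions fits in an 8-entry buffer.
small_body = run_loop(StreamBuffer(capacity=8), start=0, length=4, iterations=10)
# Loop body of 12 instructions does not fit, so entries are evicted before reuse.
large_body = run_loop(StreamBuffer(capacity=8), start=0, length=12, iterations=10)
print(small_body, large_body)
```

In this model the 4-instruction loop reads program memory only during its first pass, whereas the 12-instruction loop misses on every fetch because each entry is evicted before it is needed again.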

目次 Table of Contents
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Research Motivation and Objectives
1.1 Instruction stream interruption caused by VLIW instruction compression
1.2 Fetch cycle losses caused by discontinuous program behavior
1.3 Overview of the chapters
Chapter 2 Related Work
2.1 VLIW
2.2 DVBT-DSP instruction set
2.3 DVBT-DSP and Addressing Mode
2.3.1 DVBT-DSP hardware architecture
2.3.2 Register File
2.3.3 Addressing Mode
2.4 Software Pipeline
Chapter 3 Smooth Instruction Stream Buffer
3.1 Buffer architecture
3.1.1 Circular Buffer and Pre-Fetch Buffer
3.1.2 Control signal circuit
3.1.3 Long Instruction mask and decompression circuit
3.2 Mechanism for resolving fetch interruptions with compressed instruction code
3.3 Improved instruction fetch mechanism for Repeat Mode
3.3.1 Buffer improvements for the repeat instruction stream
3.3.2 Repeat instruction
3.3.3 Repeat operation behavior
3.3.4 Repeat control signal circuit
3.4 Improved fetch mechanism for branch instructions
3.4.1 Branch instruction
3.4.2 Branch control signal circuit
3.4.3 Handling of branches that jump to a target inside the instruction stream buffer, and the fetch cycle loss
3.4.4 Handling of branches that jump to a target outside the instruction stream buffer, and the fetch cycle loss
Chapter 4 Digital Signal Processing Applications on the Smooth Instruction Stream Buffer
4.1 FIR
4.2 FFT
4.3 Fast DCT
Chapter 5 Verification and Simulation Analysis
5.1 Simulation analysis of FIR execution performance
5.2 Simulation analysis of FFT execution performance
5.3 Simulation analysis of DCT execution performance
5.4 Hardware synthesis results
Chapter 6 Future Work and Conclusions
References


電子全文 Fulltext
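This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid breaking the law.
Thesis access permission: unrestricted (fully open on campus and off campus)
Available: Campus: available; Off-campus: available
etd-0825106-113139.pdf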
紙本論文 Printed copies
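Public-access information for printed theses is relatively complete only from academic year 102 onward. To look up public-access information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available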
