National Sun Yat-sen University, thesis/dissertation, Low-power Multi-precision Functional Unit Generator for 3-D Graphics Application

Title	Low-power Multi-precision Functional Unit Generator for 3-D Graphics Application
Department	Department of Computer Science and Engineering
Year, semester	Spring semester of Academic Year 102	Language	Chinese
Degree	Master	Number of pages	88
Author	Bo-ting Lin (林柏廷)
Advisor	Shiann-Rong Kuang (鄺獻榮)
Convenor	Pei-Yin Chen (陳培殷)
Advisory Committee	Ren-Der Chen (陳仁德); Yu-Hung Hsiao (蕭宇宏)
Date of Exam	2014-07-24	Date of Submission	2014-08-18
Keywords	low power, multi-precision function interpolator, multi-mode floating-point multiply-add fused unit, generator
Statistics	The thesis/dissertation has been browsed 5677 times and downloaded 52 times.

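Chinese Abstract (English translation)
This thesis proposes a multi-precision function interpolator generator and a multi-precision floating-point multiply-add (MAF) generator that comply with the IEEE-754 single-precision floating-point standard. Users can choose, according to their needs, to generate hardware architectures with multiple precision modes. The function interpolator performs reciprocal, reciprocal square root, logarithm, and exponential operations, and the MAF unit performs multiplication, addition, and multiply-accumulate operations; each operation can be run in different precision modes. The hardware is pipelined to match the architectural characteristics of digital signal processors (DSPs) and graphics processing units (GPUs). The function interpolator is based on table lookup and approximates the target function by evaluating a quadratic polynomial whose coefficients are obtained with a multi-interval minimax approximation. The MAF unit combines floating-point multiplication and floating-point addition into a single unit that performs multiply-accumulate; while the multiplication is in progress, the alignment shift for the addition is performed in parallel. Besides the highest-precision mode, the multiplications in the function interpolator and the MAF unit can run in lower-precision modes, determined by which partial products are accumulated. In these modes the hardware for partial-product bits that are not accumulated is turned off, and for the MAF addition the hardware for the low-weight bits is turned off. Users select the precision modes they need, and the generator automatically produces the hardware architecture and Verilog code, ensuring that the hardware for different precisions does not conflict and that every operation meets its required precision. When hardware for modes other than the highest precision is generated, the generator inserts clock gating and latches for each precision; when operating at those precisions, these control switches turn off the components that need not operate, reducing power consumption in the non-highest precision modes. When the function interpolator executes one of its four operations, it looks up only that operation's table to obtain the quadratic polynomial coefficients. Latches can therefore be added as switches to reduce the dynamic power of the other three tables, so that power is reduced even in the highest-precision mode. When the MAF unit performs multiplication alone, latches turn off part of the addition hardware; when it performs addition alone, the multiplier is turned off, reducing the power consumed by hardware that the operation does not use. With these two generators, hardware architectures of the required precisions can be generated according to the user's choice, and lower-precision operations can be executed within a tolerable error range to reduce power consumption and extend the device's operating time.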
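The table-lookup quadratic interpolation named in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the thesis's hardware or its coefficient method: the names `build_table`, `interpolate`, and `max_abs_error` and the parameter `m` (number of index bits) are hypothetical, the input range is taken as [1, 2) as for a normalized significand, and the coefficients are fitted through three Chebyshev nodes per segment as a simple stand-in for the multi-interval minimax approximation.

```python
import math


def build_table(f, m):
    """Build 2**m entries of (c0, c1, c2) for f on [1, 2).

    Assumption: each segment's quadratic passes through three Chebyshev
    nodes, a near-minimax stand-in for a true minimax fit.
    """
    n = 1 << m
    h = 1.0 / n
    # Chebyshev node offsets inside one segment [0, h]
    offsets = [h / 2 * (1 + math.cos((2 * k + 1) * math.pi / 6)) for k in range(3)]
    d0, d1, d2 = offsets
    table = []
    for i in range(n):
        x0 = 1.0 + i * h
        y0, y1, y2 = (f(x0 + d) for d in offsets)
        # Newton divided differences, expanded to powers of dx
        f01 = (y1 - y0) / (d1 - d0)
        f12 = (y2 - y1) / (d2 - d1)
        f012 = (f12 - f01) / (d2 - d0)
        c0 = y0 - f01 * d0 + f012 * d0 * d1
        c1 = f01 - f012 * (d0 + d1)
        table.append((c0, c1, f012))
    return table


def interpolate(table, m, x):
    """Approximate f(x), 1 <= x < 2, as c0 + c1*dx + c2*dx**2."""
    n = 1 << m
    i = min(int((x - 1.0) * n), n - 1)  # upper bits of x select the entry
    dx = x - (1.0 + i / n)              # lower bits form the offset
    c0, c1, c2 = table[i]
    return c0 + dx * (c1 + dx * c2)


def max_abs_error(f, table, m, samples=10000):
    """Largest sampled absolute error of the interpolator on [1, 2)."""
    xs = (1.0 + k / samples for k in range(samples))
    return max(abs(f(x) - interpolate(table, m, x)) for x in xs)
```

For example, `build_table(lambda x: 1.0 / x, 6)` gives a 64-entry reciprocal table, and `max_abs_error` can then be used to compare table sizes; a hardware design would additionally quantize the coefficients and the squarer and multiplier outputs, which this model ignores.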
Abstract
This thesis proposes a multi-precision function interpolator generator and a multi-precision multiply-add fused (MAF) unit generator, both compliant with the IEEE-754 single-precision floating-point standard. Users can generate hardware architectures with multiple precision modes according to their requirements. The function interpolator provides logarithm, exponential, reciprocal, and reciprocal square root operations. The MAF unit provides multiplication, addition, and multiply-accumulate operations, and each operation can be calculated at different precisions. The hardware is fully pipelined to match the architectures of general digital signal processors (DSPs) and graphics processors (GPUs). The function interpolator is based on the look-up table method and approximates the target function by evaluating a quadratic polynomial. The MAF unit combines floating-point multiplication and addition into a single unit that executes the multiply-accumulate operation; while the multiplication is executing, the alignment shift for the addition is performed in parallel. Besides the highest-precision mode, the multi-precision function interpolator and MAF unit can also execute low-precision modes. In a low-precision mode, the hardware for partial-product bits that are not accumulated is shut down. Users choose the precision levels they need, and the generators automatically create the hardware architectures and the Verilog code. The hardware for different precision modes does not conflict, and all operations meet their precision requirements. When generating hardware for modes other than the highest precision, the generator adds clock gating cells and latches for each precision mode. When operating in these lower-precision modes, the added switches shut down the unnecessary components to reduce power consumption.
When the function interpolator executes one of its four functions, it looks up only that function's table to find the coefficients of the quadratic polynomial. Latches can therefore be added as switches to reduce the dynamic power consumption of the tables for the other three functions, so power consumption is reduced even in the highest-precision mode. When the MAF unit performs only multiplication, latches shut down parts of the addition hardware. Conversely, when it performs only addition, the multiplier is shut down to reduce power consumption. In summary, the multi-precision function interpolator generator and the multi-precision MAF generator can produce hardware architectures with the precision modes a user requires, allowing lower-precision operation within a tolerable error range to reduce power consumption and extend the device's battery life.
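The low-precision multiplication mode described in the abstract, in which partial-product bits that are not accumulated are switched off, can be modeled in software to see how much error a given cut introduces. The sketch below is an assumption-laden illustration rather than the thesis's design: it uses a plain AND-array of partial products instead of Booth encoding, it cuts by column only, and the names `truncated_multiply`, `max_truncation_error`, and `dropped_columns` are hypothetical.

```python
def truncated_multiply(a, b, width, dropped_columns):
    """Multiply two unsigned width-bit integers while ignoring every
    partial-product bit that falls in the lowest dropped_columns columns.

    Assumption: simple AND-array partial products (row i is a << i when
    bit i of b is set); a Booth-encoded array would differ in detail.
    """
    keep_mask = ~((1 << dropped_columns) - 1)  # clears the gated columns
    total = 0
    for i in range(width):
        if (b >> i) & 1:
            total += (a << i) & keep_mask  # accumulate only the kept bits
    return total


def max_truncation_error(dropped_columns):
    """Worst-case value of the ignored bits, valid when dropped_columns
    does not exceed the operand width: column c holds c + 1 bits of
    weight 2**c, and the sum over c < D is (D - 1) * 2**D + 1.
    """
    d = dropped_columns
    return (d - 1) * (1 << d) + 1
```

With `dropped_columns=0` the model reproduces the exact product, which corresponds to the highest-precision mode; for 24-bit significands, dropping the lowest 16 columns leaves an error of at most `max_truncation_error(16)` out of a 48-bit product. A real design would typically add a compensation constant and rounding, which are left out here.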

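Table of Contents
Thesis Approval Form i
Thesis Summary ii
Acknowledgments iii
Abstract (Chinese) iv
Abstract vi
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Thesis Outline 2
Chapter 2 Background 3
2.1 IEEE-754 Single-Precision Floating-Point Standard 3
2.2 Function Interpolator 4
2.2.1 Polynomial Approximation 4
2.2.2 Quadratic-Polynomial Function Interpolator Architecture 6
2.3 Conventional Function Interpolator 7
2.3.1 Polynomial Coefficient Generation 9
2.3.2 Squarer 10
2.3.3 Booth Encoder and Partial-Product Selector 12
2.3.4 Compression Tree 14
2.4 Floating-Point Multiply-Add Unit 17
2.4.1 Principles of Floating-Point Multiplication and Addition 17
2.4.2 Conventional Floating-Point Multiply-Add Unit 18
2.4.3 Multiplier 20
2.4.4 Shifter 22
2.4.5 Rounding 23
Chapter 3 Function Interpolator Generator 24
3.1 Basic Function Interpolator 24
3.1.1 Partial-Product Arrangement of the Basic Function Interpolator 24
3.1.2 Architecture of the Basic Function Interpolator 25
3.2 Implementation of the Multi-precision Function Interpolator Generator 26
3.2.1 Row-Based Partitioning 26
3.2.2 Column-Based Partitioning 33
3.2.3 Realization of the Multi-precision Function Interpolator Generator 35
Chapter 4 Floating-Point Multiply-Add Unit Generator 48
4.1 Basic Floating-Point Multiply-Add Unit 48
4.2 Implementation of the Multi-precision Floating-Point Multiply-Add Unit Generator 49
4.2.1 Multi-precision Multiplier 49
4.2.2 Multi-precision Adder 51
4.2.3 Realization of the Multi-precision Floating-Point Multiply-Add Unit Generator 52
4.2.4 Computation Error of the Multi-precision Floating-Point Multiply-Add Unit 53
Chapter 5 Experimental Results 55
5.1 Experimental Procedure and Method 55
5.2 Function Interpolator Verification and Data Comparison 56
5.3 Floating-Point Multiplier Verification and Data Comparison 69
Chapter 6 Conclusions and Future Work 72
6.1 Conclusions 72
6.2 Future Work 72
References 73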

References
[1] "IEEE Standard for Floating-Point Arithmetic," 2008.
[2] J.A. Pineiro, S.F. Oberman, J.-M. Muller and J.D. Bruguera, "High-speed function approximation using a minimax quadratic interpolator," IEEE Transactions on Computers, vol. 54, no. 3, pp. 304-318, 2005.
[3] 程建綱, "Multi-precision function interpolator for multimedia applications," Master's thesis (in Chinese), Department of Computer Science and Engineering, National Sun Yat-sen University, 2012.
[4] M.J. Schulte and K.E. Wires, "High-speed inverse square roots," IEEE 14th Symp. Computer Arithmetic, pp. 124-131, 1999.
[5] R.H. Strandberg, L.G. Bustamante, V.G. Oklobdzija, M.A. Soderstrand and Jean-Claude Duc, "Efficient realizations of squaring circuit and reciprocal used in adaptive sample rate notch filters," Journal of VLSI Signal Processing, vol. 14, no. 3, pp. 303-309, 1996.
[6] P. Bonatto and V.G. Oklobdzija, "Evaluation of Booth's algorithm for implementation in parallel multipliers," IEEE Conference on Signals, Systems and Computers (ASILOMAR-29), vol. 1, pp. 608-610, 1996.
[7] Zhijun Huang, "High-level optimization techniques for low-power multiplier design," PhD dissertation, Univ. of California, Los Angeles, 2003.
[8] 余其坤, "Multi-mode floating-point multiply-add unit for low-power applications," Master's thesis (in Chinese), Department of Computer Science and Engineering, National Sun Yat-sen University, 2011.
[9] Kun-Yi Wu, Chih-Yuan Liang, Kee-Khuan Yu, and Shiann-Rong Kuang, "Multiple-mode floating-point multiply-add fused unit for trading accuracy with power consumption," IEEE International Conference on Computer and Information Science, pp. 429-435, 2013.
[10] 姬瑋忠, "Low-power precision allocation system for special function instructions in 3-D graphics processors," Master's thesis (in Chinese), Department of Computer Science and Engineering, National Sun Yat-sen University, 2013.
[11] Wen-Chang Yeh and Chein-Wei Jen, "A high performance carry-save to signed-digit recoder for fused addition-multiplication," IEEE ICASSP, vol. 6, pp. 3259-3262, 2000.
[12] U. Kucukkabak and A. Akkas, "Design and implementation of reciprocal unit using table look-up and Newton-Raphson iteration," Euromicro Symposium on Digital System Design, pp. 249-253, 2004.
[13] S. Erez and G. Even, "An improved micro-architecture for function approximation using piecewise quadratic interpolation," IEEE International Conference on Computer Design, pp. 422-426, 2008.
[14] B. Nam, H. Kim and H. Yoo, "A low-power unified arithmetic unit for programmable handheld 3-D graphics system," IEEE J. Solid-State Circuits, vol. 42, no. 8, pp. 1767-1778, 2007.
[15] Shen-Fu Hsiao, Chan-Feng Chiu and Chia-Sheng Wen, "Design of a low-cost floating-point programmable vertex processor for mobile graphics applications based on hybrid number system," IEEE International Conference on IC Design & Technology, pp. 1-4, 2011.
[16] Sameh Galal, Ofer Shacham, John S. Brunhaver II, Jing Pu, Artem Vassiliev, and Mark Horowitz, "FPU generator for design space exploration," IEEE Symposium on Computer Arithmetic, pp. 25-34, 2013.

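Fulltext
This electronic full text is licensed to users solely for academic research, for personal, non-profit searching, reading, and printing. Please observe the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization. Thesis access permission: user-defined availability. Available: Campus: available; Off-campus: available. etd-0716114-030526.pdf
Printed copies
Public information on printed theses is relatively complete from Academic Year 102 onward. To look up public information on printed theses from Academic Year 101 or earlier, please contact the printed thesis service counter of the Office of Library and Information Services. We apologize for any inconvenience. Availability: available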
