National Sun Yat-sen University, thesis/dissertation, Multi-Mode Floating-Point Multiply-Add Fused Unit for Low-Power Applications

Title	Multi-Mode Floating-Point Multiply-Add Fused Unit for Low-Power Applications (original Chinese title: 適用於低功率應用的多重模式浮點乘加器)
Department	Department of Computer Science and Engineering
Year, semester	Second (spring) semester of Academic Year 99 (ROC calendar)	Language	Chinese
Degree	Master	Number of pages	61
Author	Kee-khuan Yu (余其坤)
Advisor	Shiann-Rong Kuang (鄺獻榮)
Convenor	Shen-Fu Hsiao (蕭勝夫)
Advisory Committee	Yun-Nan Chang (張雲南); Ming-Chih Chen (陳銘志); Ko-Chi Kuo (郭可驥)
Date of Exam	2011-07-26	Date of Submission	2011-08-01
Keywords	multi-mode floating-point multiply-add fused, iterative multiplication, low power, truncated addition
Statistics	This thesis has been browsed 5707 times and downloaded 1016 times.

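Chinese Abstract (English translation)
In digital signal processing and computer graphics systems, floating-point (FP) multiplication and addition are the most frequently used operations, and an FP multiplication is very often followed immediately by an FP addition. To achieve high performance at low cost, the two are therefore usually merged into a single unit that performs FP multiply-accumulate, called the Floating-Point Multiply-Add Fused (MAF) unit. As mobile devices proliferate, performance and battery life have become the main design drivers, so low-power mechanisms and techniques are receiving growing attention. We therefore propose a multi-mode FP MAF that uses iterative multiplication and truncated addition to provide several operating modes with different error levels. The unit has seven error modes in total: three for FP multiply-accumulate, and two each for standalone FP multiplication and standalone FP addition. For multiply-accumulate, the three modes produce errors of 0%, 0.328%, and 1.107%, where the 0% mode is the IEEE 754 single-precision FP multiply-accumulate operation. For standalone multiplication the two modes produce errors of 0.328% and 0%, and for standalone addition 0.781% and 0%; in each case the 0% mode is the IEEE 754 single-precision operation. Compared with an IEEE 754 single-precision FP MAF, the proposed multi-mode architecture reduces circuit area by 5% and increases circuit delay by 23%, which is the cost of providing multiple modes. In terms of power, running an RGB-to-YUV image format conversion application on the proposed MAF reduces power consumption whenever a mode that allows error is used.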
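The error modes described above trade accuracy for power by discarding low-order bits. The sketch below is only a software illustration of that idea, not the circuit in the thesis: it zeroes the low fraction bits of a single-precision value and measures the resulting relative error. The helper names (`to_float32`, `truncate_float32`, `relative_error`) and the choice of 7 kept bits are assumptions made for illustration. For normal numbers, keeping k fraction bits bounds the relative error below 2^-k, and 2^-7 equals 0.78125%, which happens to match the 0.781% figure quoted for addition; whether the thesis derives its figure this way is not stated in the abstract.

```python
import struct

FRACTION_BITS = 23  # IEEE 754 single precision stores 23 fraction bits


def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def truncate_float32(x: float, keep_bits: int) -> float:
    """Keep only the top `keep_bits` fraction bits of x as a float32.

    Hypothetical helper: it truncates toward zero and leaves the sign and
    exponent untouched. It does not model the thesis's datapath.
    """
    if not 0 <= keep_bits <= FRACTION_BITS:
        raise ValueError("keep_bits must be between 0 and 23")
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    dropped = FRACTION_BITS - keep_bits
    mask = 0xFFFFFFFF ^ ((1 << dropped) - 1)  # clear the dropped low bits
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]


def relative_error(reference: float, approx: float) -> float:
    """Relative error of approx with respect to a nonzero reference."""
    return abs(reference - approx) / abs(reference)


if __name__ == "__main__":
    bound = 2.0 ** -7
    for value in (3.14159, 0.1, 123456.789):
        ref = to_float32(value)
        approx = truncate_float32(value, 7)
        print(value, relative_error(ref, approx), "bound:", bound)
```

The bound holds for normal numbers only; subnormal inputs have fewer significant bits and are outside the scope of this sketch.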
Abstract
In digital signal processing and multimedia applications, floating-point (FP) multiplication and addition are the most commonly used operations, and an FP multiplication is frequently followed by an FP addition. To achieve high performance at low cost, multiplication and addition are therefore usually combined into a single unit, known as the FP Multiply-Add Fused (MAF) unit. Meanwhile, mobile devices are developing rapidly, and for such devices performance and battery life have become the major design concerns, so mechanisms that reduce energy consumption are increasingly important. We therefore propose a multi-mode FP MAF based on iterative multiplication and truncated addition, which provides operating modes with different error levels. The MAF has seven modes in total: three for FP multiply-accumulate, two for standalone FP multiplication, and two for standalone FP addition. The three multiply-accumulate modes have errors of 0%, 0.328%, and 1.107%; the 0% mode is identical to the standard IEEE 754 single-precision FP multiply-add fused operation. For FP multiplication and FP addition, users can choose between two modes each: 0% or 0.328% error for multiplication, and 0% or 0.781% error for addition. Again, the 0% modes are identical to the standard IEEE 754 single-precision floating-point operations. Compared with a standard IEEE 754 single-precision FP MAF, the proposed multi-mode architecture has 4.5% less area and about 22% more delay, which is the cost of supporting multiple modes. To demonstrate its power efficiency, the proposed FP MAF is used to perform the FP MAF, FP multiplication, and FP addition operations in an RGB-to-YUV format conversion application. Experimental results show that the proposed multi-mode FP MAF significantly reduces power consumption when the modes with error are adopted.
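To make the RGB-to-YUV workload concrete, the following sketch writes each output channel as one multiplication followed by two multiply-adds. It is a minimal illustration under stated assumptions: the weights are the commonly used analog YUV (BT.601-style) coefficients, which the abstract does not list, and `maf` and `rgb_to_yuv` are hypothetical names. The Python expression `a * b + c` rounds twice, so it models only the structure of the computation, not the single rounding of a hardware fused unit, and the thesis may decompose the conversion differently.

```python
def maf(a: float, b: float, c: float) -> float:
    """Stand-in for one multiply-add, a*b + c.

    A hardware MAF rounds once; this expression rounds after the multiply
    and again after the add, so it is a structural model only.
    """
    return a * b + c


# Assumed analog YUV (BT.601-style) weights; not taken from the thesis.
YUV_WEIGHTS = (
    (0.299, 0.587, 0.114),        # Y
    (-0.14713, -0.28886, 0.436),  # U
    (0.615, -0.51499, -0.10001),  # V
)


def rgb_to_yuv(r: float, g: float, b: float, maf_fn=maf):
    """Convert one RGB pixel to YUV using 3 multiplies and 6 multiply-adds."""
    out = []
    for wr, wg, wb in YUV_WEIGHTS:
        acc = wr * r              # standalone FP multiplication
        acc = maf_fn(wg, g, acc)  # multiply-add: wg*g + acc
        acc = maf_fn(wb, b, acc)  # multiply-add: wb*b + acc
        out.append(acc)
    return tuple(out)


if __name__ == "__main__":
    print(rgb_to_yuv(1.0, 0.0, 0.0))  # a pure red pixel
```

Passing a different `maf_fn` (for example, one that truncates its result) is one way to experiment with approximate modes in software; any such replacement would itself be an assumption rather than the thesis's design.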

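Table of Contents
Chapter 1. Introduction
  1.1 Motivation
  1.2 Thesis Outline
Chapter 2. Background and Related Work
  2.1 Overview of the IEEE 754 Standard
  2.2 Principles of Floating-Point Addition
  2.3 Principles of Floating-Point Multiplication
  2.4 Overview of the Booth Multiplier
  2.5 Compression Tree
  2.6 Rounding
  2.7 Conventional Floating-Point Multiply-Add Fused Architecture
Chapter 3. Proposed Multi-Mode Floating-Point Multiply-Add Fused Unit
  3.1 Introduction
  3.2 Architecture of the Multi-Mode Floating-Point Multiply-Add Fused Unit
  3.3 Improved Iterative Floating-Point Multiplier
  3.4 Error of the Improved Iterative Floating-Point Multiplier
  3.5 Truncated Addition
  3.6 Error of the Addition Operation
  3.7 Special Floating-Point Multiplication and Floating-Point Addition Operations
  3.8 Error of the Multi-Mode Floating-Point Multiply-Add Fused Unit
  3.9 Control Circuit
Chapter 4. Experimental Results and Comparison
Chapter 5. Conclusions and Future Work
  5.1 Conclusions
  5.2 Future Work
References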

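Fulltext
This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
Thesis access permission: user-defined release time
Availability: on campus, available; off campus, available
etd-0801111-105514.pdf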
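Printed copies
Public-access information for printed theses is relatively complete from Academic Year 102 onward. For information on printed theses from Academic Year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Availability: available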

Office of Library and Information Services, National Sun Yat-sen University │ Contact Us: 2452 Thesis Format Review Team, Mail │ Development and operations: Knowledge Innovation Division, LIS