國立中山大學,National Sun Yat-sen University,學位論文,thesis/dissertation,具SIMD架構之多功能多重精確度四維內積運算單元,A Multi-functional Multi-precision 4D Dot Product Unit with SIMD Architecture

論文名稱 Title	具SIMD架構之多功能多重精確度四維內積運算單元 A Multi-functional Multi-precision 4D Dot Product Unit with SIMD Architecture
系所名稱 Department	資訊工程學系 Department of Computer Science and Engineering
畢業學年期 Year, semester	103 學年度第 2 學期 The spring semester of Academic Year 103	語文別 Language	中文 Chinese
學位類別 Degree	碩士 Master	頁數 Number of pages	70
研究生 Author	張銘峰 Ming-Fong Chang
指導教授 Advisor	鄺獻榮 Shiann-Rong Kuang
召集委員 Convenor	蕭勝夫 Shen-Fu Hsiao
口試委員 Advisory Committee	張雲南, 陳銘志 Yun-Nan Chang; Ming-Chih Chen
口試日期 Date of Exam	2015-07-27	繳交日期 Date of Submission	2015-08-04
關鍵字 Keywords	單指令流多資料流、多重精確度、低功率、四維內積、時脈閘控 Single Instruction Multiple Data, Multi-precision, 4D Dot Product, Clock Gating, Low Power

中文摘要
在圖形處理器中，頂點著色器是相當重要的元件，主要負責頂點的座標轉換與光源的幾何運算。當圖形處理器進行三維立體圖像的處理時，頂點著色器經常需要計算大量的浮點數運算，因此本論文根據IEEE二進位浮點算術標準的單精度規格，提出一個具有SIMD架構之多功能多重精確度四維內積運算單元。本論文的四維內積運算單元具有多功能運算的特性，所以可以支援五種指令的運算，分別為浮點加法、浮點乘法、浮點乘加法、浮點三維內積以及浮點四維內積。多功能運算的設計是為了可以在一個四維內積運算單元的硬體中執行五種指令，相當於將五種獨立的運算單元融合成一個硬體以達到節省面積的效果。除了多功能運算之外，使用者還可以按照需求來選擇四種精確度模式，包含最高精確度模式(23位元)、中高精確度模式(18位元)、中低精確度模式(13位元)以及最低精確度模式(7位元)，雖然在低精確度模式下會呈現誤差較高的圖像，但是由於人眼無法清楚得辨識圖像的些微失真，因此在可以接受失真的範圍內，進而關閉部分的電路，並且減少訊號的切換次數以達到低功率與省電的效果。
Abstract
In a modern graphics processing unit (GPU), the vertex shader is an important component, mainly responsible for the coordinate transformation of vertices and the geometric computations for lighting. When a GPU runs the 3D graphics pipeline to generate a 3D image, the vertex shader usually has to perform a large number of floating-point operations. This thesis therefore proposes a multi-functional, multi-precision 4D dot product unit with a single instruction, multiple data (SIMD) architecture, based on the single-precision format of the IEEE 754 standard for floating-point arithmetic. The proposed unit supports five instructions: floating-point addition, floating-point multiplication, floating-point multiply-add, floating-point 3D dot product, and floating-point 4D dot product. Multi-functional means that a single 4D dot product unit can execute any one of these five instructions, which is equivalent to fusing five independent arithmetic units into one piece of hardware in order to save area. In addition, users can select among four precision modes according to their needs: the highest (23-bit), medium-high (18-bit), medium-low (13-bit), and lowest (7-bit) precision modes. The low-precision modes produce images with larger error, but the human eye cannot clearly distinguish slight distortion in a rendered image. Within the range of acceptable distortion, part of the circuit can therefore be turned off and the signal switching activity reduced, which saves power and energy.
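
The effect of the four precision modes on a 4D dot product can be illustrated with a small software model. The following Python sketch is not the thesis's hardware design: it assumes that a precision mode keeps only the top 23, 18, 13, or 7 fraction bits of each single-precision value by plain truncation, and that a 3D dot product is obtained by zeroing the fourth lane. The names `PRECISION_BITS`, `truncate_mantissa`, `dot4`, and `dot3` are hypothetical and chosen only for this example.

```python
import struct

# Fraction (mantissa) widths of the four precision modes listed in the abstract.
# The dictionary keys are illustrative names, not identifiers from the thesis.
PRECISION_BITS = {"highest": 23, "medium_high": 18, "medium_low": 13, "lowest": 7}


def truncate_mantissa(x: float, bits: int) -> float:
    """Round x to float32, then zero all but the top `bits` fraction bits.

    Assumption: low-order bits are simply truncated; the thesis may differ.
    """
    raw = struct.unpack("<I", struct.pack("<f", x))[0]
    mask = (0xFFFFFFFF << (23 - bits)) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", raw & mask))[0]


def dot4(a, b, bits=23):
    """4D dot product a0*b0 + a1*b1 + a2*b2 + a3*b3 at the given precision."""
    a = [truncate_mantissa(v, bits) for v in a]
    b = [truncate_mantissa(v, bits) for v in b]
    total = sum(x * y for x, y in zip(a, b))  # accumulated in Python double precision
    return truncate_mantissa(total, bits)


def dot3(a, b, bits=23):
    """3D dot product obtained by feeding zeros into the fourth lane of dot4."""
    return dot4(list(a[:3]) + [0.0], list(b[:3]) + [0.0], bits)


if __name__ == "__main__":
    a = [0.1, 0.2, 0.3, 0.4]
    b = [0.5, 0.6, 0.7, 0.8]
    for name, bits in PRECISION_BITS.items():
        print(f"{name:12s} ({bits:2d} bits): {dot4(a, b, bits)!r}")
```

Running the script prints one result per mode, so the growth of the error as fraction bits are dropped can be inspected directly. The model only captures the numerical side of the trade-off; the power savings described in the abstract come from disabling circuitry and reducing switching activity, which a software model like this does not represent.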

目次 Table of Contents
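Thesis Approval Form
Abstract (Chinese)
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Motivation
  1.2 Thesis Outline
Chapter 2 Background
  2.1 IEEE Standard for Binary Floating-Point Arithmetic
  2.2 Basic Principles of Floating-Point Arithmetic
    2.2.1 Floating-Point Multiplication
    2.2.2 Floating-Point Addition
  2.3 Floating-Point Unit
    2.3.1 Overview
    2.3.2 Multi-functional Operation
    2.3.3 Booth-Encoded Multiplier
    2.3.4 Compressor
    2.3.5 Shifter
    2.3.6 Numerical Simplification
    2.3.7 4D Dot Product Unit
Chapter 3 SIMD 4D Dot Product Unit
  3.1 Conventional SIMD 4D Dot Product Unit
    3.1.1 Single Instruction Multiple Data
    3.1.2 Conventional SIMD Architecture
  3.2 Improved SIMD 4D Dot Product Unit
    3.2.1 Improved SIMD Architecture
    3.2.2 Addition Instruction
    3.2.3 Multiplication Instruction
    3.2.4 Multiply-Add Instruction
    3.2.5 3D Dot Product Instruction
    3.2.6 4D Dot Product Instruction
  3.3 Optimized SIMD 4D Dot Product Unit
    3.3.1 Optimized SIMD Architecture
    3.3.2 First Numerical Simplification
    3.3.3 Inverter
  3.4 Multi-precision
    3.4.1 Concept
    3.4.2 Precision Modes
    3.4.3 Clock Gating
    3.4.4 Multiplier
Chapter 4 Experimental Results and Data Analysis
  4.1 Experimental Procedure and Verification
  4.2 Power Analysis of the Multi-precision Modes
  4.3 Image Analysis of the Multi-precision Modes
  4.4 Delay and Area Analysis of Each Circuit Version
Chapter 5 Conclusions and Future Work
  5.1 Conclusions
  5.2 Future Research Directions
References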

電子全文 Fulltext
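This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
論文使用權限 Thesis access permission：自定論文開放時間 user define
開放時間 Available：校內 Campus：已公開 available 校外 Off-campus：已公開 available
etd-0630115-164853.pdf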
紙本論文 Printed copies
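Availability information for printed theses is relatively complete from Academic Year 102 onward. To look up availability information for printed theses from Academic Year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
開放時間 available：已公開 available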
