Thesis/dissertation etd-0726113-030940: detailed record
Title page for etd-0726113-030940
Title
設計於超多純量架構中以偵測執行緒並行度之優化群組管理核心
Design of the Optimized Group Management Unit by Detecting Thread Parallelism on the Hyperscalar Architecture
Department
Year, semester
Language
Degree
Number of pages
72
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2013-07-29
Date of Submission
2013-08-26
Keywords
superscalar, hyper-scalar, chip multiprocessors, reconfigure, ILP
Statistics
The thesis/dissertation has been browsed 5717 times and downloaded 728 times.
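Chinese Abstract (English translation)
Chip multiprocessors (CMPs) have become the mainstream of processor design. In a conventional CMP system, the processor exploits instruction-level parallelism (ILP) through the architecture of each individual core and thread-level parallelism (TLP) through parallel execution on multiple cores. However, a conventional CMP architecture must trade off high single-thread performance against high throughput at the start of hardware design and cannot dynamically adjust its ability to exploit ILP and TLP, which makes current CMPs inefficient at handling the varied application types of the future.
The architecture used as the basis of this thesis is therefore the hyperscalar architecture, a CMP microprocessor system architecture that can dynamically group multiple processor cores into a single superscalar core with higher computing capability. This reconfigurability gives the multi-core architecture a high degree of flexibility: when TLP is low, the cores operate together to raise single-thread performance; otherwise, the cores operate independently to provide high throughput.
To improve processor utilization, changes in a thread's ILP are detected dynamically so that the system can group or release processor cores according to the level of instruction parallelism. Building on the hyperscalar architecture, this thesis adds a mechanism for judging thread parallelism and uses two new instructions, CRM (Core Register Move) and RelC (Release Core), to release processor cores. The CRM instruction moves data from the core being released to the other cores in the group, ensuring that the group's data remain correct after the release. The RelC instruction marks the core for release; when it reaches the WB stage, it sends a release signal to the group management unit, indicating that the core has been completely emptied and all data have been transferred. Once the group management unit has dispatched these two instructions, the system can release or group cores according to thread parallelism. Measured results show improvements in both processor utilization and overall work efficiency.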
Abstract
Current trends in processor design have migrated toward chip multiprocessors (CMPs). CMPs are designed to exploit both instruction-level parallelism (ILP) within processors and thread-level parallelism (TLP) within and across processors. However, the conventional design of current CMPs is forced to make a choice between high single-thread performance and high peak throughput. This inability to adjust to varying levels of ILP and TLP results in processor inefficiency.
This thesis therefore builds on the hyperscalar architecture, a chip multiprocessor architecture. The hyperscalar concept enables a multi-core architecture to dynamically group several scalar in-order cores into one superscalar processor that accelerates a sequential thread. This reconfigurability gives the hyperscalar architecture high flexibility in adapting to different types of applications: it provides high single-thread performance when thread-level parallelism (TLP) is low and high throughput when TLP is high.
To increase processor efficiency, the system dynamically detects the ILP of each thread and groups or releases processors according to changes in that ILP. Building on the hyperscalar architecture, this thesis adds a mechanism that detects the ILP of a thread, together with two new instructions, CRM (Core Register Move) and RelC (Release Core), that release processors from a group. To keep the data within the group correct after a core is released, the CRM instruction moves data from the core being released to the other cores in the group. The RelC instruction marks the core for release: when it reaches the write-back (WB) stage, it sends a release signal to the Group Management Unit (GMU) to indicate that the data have been completely transferred and the core is empty. Once the GMU has dispatched these two instructions, the system can release or group cores according to the ILP. Simulation results show that the proposed architecture increases processor utilization and improves overall work efficiency.
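The group-or-release decision described in the abstract can be sketched as follows. This is only an illustrative model, not the thesis's implementation: the decision rule (grow the group when the measured ILP exceeds the number of grouped cores, shrink it when one fewer core would suffice), the choice of which core to release, and all names (Group, release_core, gmu_step) are assumptions made for the example. Only the CRM-then-RelC ordering is taken from the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """Cores currently grouped to run one thread (hypothetical model)."""
    cores: list                                   # ids of grouped cores
    issued: list = field(default_factory=list)    # log of dispatched instructions


def release_core(group):
    """Release one core from the group: CRM first, then RelC."""
    victim = group.cores[-1]   # assumption: release the most recently added core
    target = group.cores[0]    # assumption: move its data to the first core
    # CRM moves register data off the core that is about to be released.
    group.issued.append(("CRM", victim, target))
    # RelC signals the GMU (at write-back) that the core is now empty.
    group.issued.append(("RelC", victim))
    group.cores.pop()
    return victim


def gmu_step(group, free_cores, measured_ilp):
    """One management decision for a single thread's group.

    Assumed rule: each scalar in-order core contributes one instruction
    per cycle, so a group of n cores is worthwhile only if ILP exceeds n - 1.
    """
    size = len(group.cores)
    if measured_ilp > size and free_cores:
        group.cores.append(free_cores.pop())      # group one more core
        return "group"
    if measured_ilp <= size - 1 and size > 1:
        free_cores.append(release_core(group))    # give a core back
        return "release"
    return "hold"


if __name__ == "__main__":
    g = Group(cores=[0, 1])
    free = [2, 3]
    for ilp in (2.6, 1.1, 1.5):
        print(ilp, gmu_step(g, free, ilp), g.cores, free)
```

In this sketch a released core returns to the free pool, where it could run another thread independently, which is the throughput side of the trade-off the abstract describes.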
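Table of Contents
Thesis Approval Form
Acknowledgments
Chinese Abstract
Abstract
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation
1.2 Research Objectives
1.3 Thesis Organization
Chapter 2 Related Work
2.1 Introduction to Single-Core Architectures
2.2 Current Multi-Core Processor Architectures
2.2.1 Multi-Core Architectures That Improve Multi-Thread Performance
2.2.2 Multi-Core Architectures That Improve Single-Thread Performance
2.3 Introduction to the Hyperscalar Architecture
2.3.1 Instruction Analyzer
2.3.2 Virtual Shared Register File
2.3.3 Handling of Register Data Flow
2.3.4 Handling of Memory Data Flow
2.3.5 Handling of Instruction Flow
Chapter 3 Design of the Optimized Group Management Unit by Detecting Thread Parallelism on the Hyperscalar Architecture
3.1 Hyperscalar System Architecture with the Optimized Group Management Unit
3.1.1 Design Concept of the System Architecture
3.1.2 System Architecture
3.2 System Operating Modes
3.2.1 Processor Grouping Method
3.2.2 New Instructions
3.3 Group Management Unit
3.4 Information Processing Unit
3.5 Instruction Operation Examples
Chapter 4 Simulation and Analysis
4.1 Architecture Simulation
4.1.1 Code Flow of the Simulator Architecture
4.1.2 Code Flow for Determining ILP
4.1.3 Performance Evaluation Simulation Method
4.1.4 Performance Evaluation Programs
4.2 Simulation Results
4.2.1 ILP of Different Performance Evaluation Programs with Different Numbers of Cores
4.2.2 Performance Improvement of the Architecture with the Optimized Group Management Unit Relative to the Hyperscalar Architecture
Chapter 5 Conclusion
References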
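Fulltext
This electronic full text is licensed to users only for personal, non-profit retrieval, reading, and printing for academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
Thesis access permission: unrestricted (fully open on and off campus)
Available:
Campus: available
Off-campus: available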


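Printed copies
Public information on printed copies is relatively complete from academic year 102 (ROC calendar) onward. To look up public information on printed theses from academic year 101 or earlier, please contact the printed-thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available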
