博碩士論文 etd-0514118-221744 詳細資訊
Title page for etd-0514118-221744
論文名稱
Title
於多核心處理器平台平行模擬多核目標環境且具效能分析之QEMU-SystemC模擬器
Efficient Timed QEMU-SystemC Parallel Emulator for Multiprocessor Target on Multicore Host Platform
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
67
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2018-01-19
繳交日期
Date of Submission
2018-06-15
關鍵字
Keywords
平行化仿真模擬器、效能分析、多核心處理器、硬體模擬器、軟硬體協同驗證
Multiprocessor, SystemC, Performance Analysis, SW/HW Co-verification, QEMU
統計
Statistics
本論文已被瀏覽 5729 次,被下載 64
The thesis/dissertation has been browsed 5729 times, has been downloaded 64 times.
中文摘要
隨著多核心處理器嵌入式系統的快速發展,軟硬體的設計與驗證變的越來越複雜,於是軟硬體整合人員將軟硬體整合與驗證的階段從Register Transfer Level (RTL)拉高抽象層及至Electronic System Level (ESL),在ESL對軟硬體進行整合與驗證可以大幅加速產品上市的時間。目前在ESL中廣泛被使用的QEMU-SystemC模擬平台可以模擬出市面上許多不同的多核心處理器平台架構,目前最新的QEMU所模擬之目標平台多核心處理器模擬已經能依據模擬的核心數平行的對應到主機端執行模擬工作的主機端上的執行緒,因此模擬多核心環境能平行執行於主機端。另外由於原始QEMU之模擬僅在抽象之行為層次,因此很難探討整個模擬過程中,軟硬體之間的執行狀況及時間效能,無法讓使用者了解軟體與開發中硬體之間的效能瓶頸。因此,本論文希望利用QEMU所模擬的目標平台的多核心處理器以多執行緒(Multi-thread) 的方式在主機端的多核心處理器上平行執行,以充分利用主機端的運算資源來加快模擬的速度。此外在此多核心處理器QEMU-SystemC模擬平台中加入目標平台之時間模型,使得在模擬過程中能夠估算目標平台的軟/硬體之執行時間來進行系統之效能分析。本論文提出一個能多核心處理器及硬體加速器的目標平台並有效地評估軟硬體行為及時間效能的平行化QEMU-SystemC模擬平台。
Abstract
With the rapid development of multiprocessor embedded systems, the design and verification of hardware and software have become increasingly complex. Hardware/software integrators have therefore raised the abstraction level of the integration and validation phase from the Register Transfer Level (RTL) to the Electronic System Level (ESL), where integration and validation can proceed significantly faster. The QEMU-SystemC emulation platform is widely used at the ESL and can emulate many different multiprocessor architectures for performance analysis or behavior verification. The latest version of QEMU maps each emulated target processor to a thread on the host machine, so the emulated CPUs of the target platform can execute in parallel on the host. This thesis presents a full-system emulator that targets multiprocessor architectures with QEMU and SystemC. It runs the emulated virtual CPUs on separate threads to take full advantage of the host machine's computing resources and thereby increase emulation speed, and it includes a timing engine that estimates the performance of the target system.
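The idea in the abstract, one host thread per emulated CPU plus a timing model that accumulates an estimated execution time, can be illustrated with a minimal sketch. This is not the thesis's implementation: the block kinds, the cycle costs in `TB_CYCLES`, and the functions `run_vcpu` and `emulate` are all hypothetical names chosen for illustration, and real QEMU translation blocks (TBs) are far richer than the string labels used here.

```python
import threading

# Hypothetical cycle cost per kind of translation block (TB).
# These numbers are illustrative assumptions, not values from the thesis.
TB_CYCLES = {"load": 4, "alu": 1, "branch": 2}


def run_vcpu(cpu_id, trace, clocks):
    """Emulate one virtual CPU: walk its TB trace and sum the estimated cycles."""
    cycles = 0
    for tb in trace:
        cycles += TB_CYCLES[tb]
    # Each thread writes only its own slot, so no lock is needed here.
    clocks[cpu_id] = cycles


def emulate(traces):
    """Run one host thread per virtual CPU and return per-CPU cycle estimates."""
    clocks = [0] * len(traces)
    threads = [
        threading.Thread(target=run_vcpu, args=(cpu_id, trace, clocks))
        for cpu_id, trace in enumerate(traces)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return clocks


if __name__ == "__main__":
    traces = [["load", "alu", "alu"], ["branch", "load", "load"]]
    clocks = emulate(traces)
    # The slowest virtual CPU bounds the estimated run time of the target.
    print(clocks, max(clocks))
```

In this sketch the per-CPU totals stand in for the target-side clocks that a timing engine would maintain, and taking the maximum is one simple way to estimate when all cores have finished.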
目次 Table of Contents
Thesis Approval Form
Thesis Declaration
Acknowledgements
Chinese Abstract
Abstract
Contents
List of Figures
List of Tables
Chapter 1. Introduction
1.1 Motivation
1.2 Background
1.3 Organization of the Thesis
Chapter 2. Related Work
2.1 Full-System Emulator
2.1.1 Sequential Emulator
2.1.2 Parallel emulator
2.2 Hardware-Software Co-simulator
2.3 QEMU-SystemC Timing Engine
Chapter 3. Performance analysis in Multi-threaded QEMU-SystemC
3.1 Overview of the Timing Engine with Multi-threaded QEMU-SystemC Platform
3.2 Executed TB Tracing in Runtime
3.3 Multiprocessors Timing Engine in Multi-threaded QEMU-SystemC
3.4 Analysis Event Trace Monitor
Chapter 4. Experiment Environment and Result
4.1 Based Emulator Selection
4.2 Experiment Setup
4.3 Evaluation Accuracy of Timing Engine
4.4 Overhead of Timing Engine
4.4.1 QEMU VS QEMU-SystemC with Timing Engine
4.4.2 Timing Engine with disabling the Memory simulator
4.4.3 Timing Engine with disabling Analysis Event Trace Monitor
4.5 Discussion of the Experiment Result
Chapter 5. Conclusion
Chapter 6. Future Work
References
Appendix A. Overhead of Timing engine in Hard Disk
電子全文 Fulltext
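This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the relevant provisions of the Copyright Act of the Republic of China, and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid breaking the law.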
論文使用權限 Thesis access permission: 自定論文開放時間 user-defined availability date
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
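Availability information for printed theses is relatively complete from academic year 102 onward. To look up availability information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
開放時間 Available: 已公開 available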
