博碩士論文 etd-0910107-093301 詳細資訊
Title page for etd-0910107-093301
論文名稱
Title
以匯流排為基礎之系統單晶片嵌入式電路模擬及追蹤之整合
Embedded In-Circuit Emulation and Tracing for Bus-based System-on-Chip Integration
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
77
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2007-07-24
繳交日期
Date of Submission
2007-09-10
關鍵字
Keywords
系統單晶片、匯流排、訊號追蹤、電路模擬器、除錯
In-Circuit Emulator, Debugging, On-Chip Bus, Signal Tracing, System-on-a-Chip
統計
Statistics
本論文已被瀏覽 5829 次,被下載 0
The thesis/dissertation has been browsed 5829 times and downloaded 0 times.
中文摘要 Chinese Abstract
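In the System-on-Chip (SoC) era, studies from academia and industry indicate that about 70% of today's chip design schedule is spent on functional verification and testing. In other words, for a complex SoC, the time spent on design is far less than the time spent on functional verification and debugging. To bring products to market on time, efficiently reducing the time needed for system functional verification and debugging is a major challenge in SoC design today.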
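The microprocessor or microcontroller is the most basic component of an SoC, so we first focus on functional verification and debugging of the microprocessor. An embedded in-circuit emulator (ICE) module can be integrated into the microprocessor. The ICE architecture is based on the IEEE Std. 1149.1 JTAG standard and supports conventional debugging and testing functions such as boundary scan cells, single-stepping, internal resource monitoring, and breakpoint detection. Because different microprocessors call for different ICE architectures, we propose an ICE silicon intellectual property (Silicon IP, SIP) block applicable to most microprocessors. The ICE exposes user-settable parameters so that it can be integrated quickly into different microprocessors. We have successfully integrated it into two microprocessors/microcontrollers with completely different architectures: the HT48x00 8-bit microcontroller and an ARM7-like 32-bit microprocessor. FPGA prototype verification and chip implementation have been completed, demonstrating that the proposed ICE not only lets users debug a microprocessor quickly but can also be integrated into different microprocessors.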
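Tracing executed instructions in real time for subsequent program debugging and analysis is important. However, the volume of trace data that a microprocessor generates in real time while running a program is very large, so recording it efficiently requires a good hardware trace compressor. We propose a hardware trace compressor that traces program execution addresses in real time and compresses the trace information immediately to reduce the amount of recorded data. The method comprises three hardware modules: (1) filtering out sequential program addresses so that only branch/target points remain, (2) encoding the branch/target data, and (3) Lempel-Ziv (LZ)-based data compression. Experimental results show that the proposed hardware program address trace compressor achieves a real-time trace compression ratio of 454:1, compared with 5:1 for typical approaches. We also provide parameterized configuration so that users can trade off hardware cost against compression ratio: a better compression ratio requires more hardware, and vice versa.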
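An SoC typically consists of one or more microprocessors/digital signal processors, memories, user-defined function modules, and system I/O devices. For system debugging and verification, besides verifying each module on its own, the integrated system needs an effective debugging and verification method even more. The most common method is to simulate the whole system and check whether its inputs and outputs match expectations. Its drawbacks are that, when a functional error occurs, (1) it is hard to pinpoint which module failed and at what time, (2) it is hard to tell whether the software is wrong or the hardware design does not fully comply with the bus protocol, and (3) at the chip level, users can observe only the I/O pin signals and have no visibility into the chip's internal signals. Bus signal tracing is therefore an effective and common solution. The trace faithfully reconstructs when, and to what values, the observed signals changed on the bus during system execution. Trace results not only help engineers follow the program's execution flow for debugging, but also reveal from the signal changes whether the designed function modules fully comply with the bus protocol, and they support analysis of overall power consumption and of the program's memory access behavior.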
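However, the system bus generates a very large amount of data every cycle, and moving these signal data off chip in real time is very difficult. The amount of data to be recorded should therefore be reduced as much as possible without loss (lossless). To solve this problem, we propose a Configurable AMBA On-Chip Signal/Event Tracer (hereafter, the bus tracer).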
The bus tracer reduces the size of the recorded data in two stages. The first stage is signal monitoring (monitor), and the second is trace reduction.
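In the first stage, the user decides the level of detail at which signals are observed. We abstract the transfer process along two dimensions, timing and signal (timing/signal abstraction). During system execution, timing is divided into two levels: cycle level and transaction level. Signals are divided into four levels: full signals, partial signals, master signals, and customized signals. Different levels produce different amounts of trace data, and the selected level can be changed at any time during tracing. In the second stage, the IP uses (1) branch/target filtering, (2) a differential approach, and (3) signal encoding to reduce the trace size. Experimental results show a preliminary trace data compression rate of 96%.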
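As an application, the bus tracer has been integrated into an SoC for 3D graphics acceleration in digital television, so that after tape-out users can use the bus tracer to observe the behavior of on-chip bus signals for debugging and performance analysis.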
Abstract
In the System-on-Chip (SoC) era, common industry estimates are that functional verification takes approximately 70% of the total effort on a project. Under time-to-market constraints, efficiently reducing SoC verification and debugging time is a challenge. A microprocessor is an essential part of an SoC, so we first focus on the problem of debugging microprocessors. An in-circuit emulation (ICE) module can be embedded with a microprocessor core. The ICE module, based on the IEEE 1149.1 JTAG architecture, supports typical debugging and testing mechanisms, including boundary scan paths, partial scan paths, single stepping, internal resource monitoring and modification, breakpoint detection, and mode switching between debugging and normal modes. The architecture of the ICE module is parameterized and retargetable to different microprocessors. It has been successfully integrated with two microprocessors with significantly different architectures: the 8-bit industrial embedded microcontroller HT48x00 and a 32-bit ARM7-like embedded microprocessor. FPGA prototypes and a chip implementation have been completed. Experiments show that real-time (on-line) debugging at full speed is possible with the embedded ICE at a minor gate count overhead.
Collecting program execution traces at full speed is essential to analyzing and debugging the real-time software behavior of a complex system. However, the generation rate and size of real-time program traces are so large that real-time program tracing is often infeasible without proper hardware support. This paper presents a hardware approach that compresses program execution traces in real time in order to reduce the trace size. The approach consists of three modularized phases: (1) branch/target filtering, (2) branch/target address encoding, and (3) Lempel-Ziv-based data compression. Synthesizable RTL code for the proposed hardware is constructed to analyze hardware cost and speed, and typical multimedia benchmarks are used to measure the compression results. The results show that our hardware is capable of real-time compression and achieves a compression ratio of 454:1, far better than the 5:1 achieved by typical existing hardware approaches. Furthermore, our modularized approach makes it possible to trade off hardware cost (typically from 1K to 50K gates) against the achievable compression ratio (typically from 5:1 to 454:1).
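The first phase above, branch/target filtering, can be illustrated with a minimal Python sketch. It assumes a fixed instruction size and represents the trace as a list of program-counter values. The function names `filter_branch_targets` and `reconstruct_trace`, the `instr_size` parameter, and the sample addresses are hypothetical and are not taken from the thesis. Phases 2 and 3 (address encoding and Lempel-Ziv-based compression) are not shown.

```python
from typing import List, Tuple

Pair = Tuple[int, int]  # (branch address, target address); hypothetical record format


def filter_branch_targets(pcs: List[int], instr_size: int = 4) -> List[Pair]:
    """Keep only the non-sequential steps of an address trace.

    Assumption: every instruction is instr_size bytes long, so a step is
    sequential exactly when the next address equals the current one plus
    instr_size.
    """
    pairs: List[Pair] = []
    for prev, cur in zip(pcs, pcs[1:]):
        if cur != prev + instr_size:
            pairs.append((prev, cur))
    return pairs


def reconstruct_trace(start: int, count: int, pairs: List[Pair],
                      instr_size: int = 4) -> List[int]:
    """Rebuild the full address trace from the start address and the pairs."""
    trace = [start]
    pending = iter(pairs)
    nxt = next(pending, None)
    while len(trace) < count:
        pc = trace[-1]
        if nxt is not None and pc == nxt[0]:
            # The next recorded branch is taken here: jump to its target.
            trace.append(nxt[1])
            nxt = next(pending, None)
        else:
            # No recorded branch at this address: execution is sequential.
            trace.append(pc + instr_size)
    return trace


if __name__ == "__main__":
    # Illustrative trace: a short loop followed by a jump to 0x200.
    pcs = [0x100, 0x104, 0x108, 0x104, 0x108, 0x10C, 0x200, 0x204]
    pairs = filter_branch_targets(pcs)
    print(len(pcs), "addresses reduced to", len(pairs), "branch/target pairs")
    print(reconstruct_trace(pcs[0], len(pcs), pairs) == pcs)
```

Because sequential addresses can be regenerated from the start address, only the non-sequential pairs need to be stored, so the filtering step loses no information.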
For SoC debugging, bus signal tracing collects the information generated by the system for later observation, debugging, and analysis. However, the generation rate and size of real-time system traces are so large that a tracing mechanism that efficiently reduces the trace size is needed. In this paper, we propose a multi-resolution bus trace approach. The hardware bus tracer consists of two major stages: (1) a signal monitoring and tracing stage, and (2) a trace compression stage. In the first stage, the designer can trace signals in detail or coarsely, depending on the debugging purpose. In other words, the multi-resolution trace approach provides a trade-off between trace accuracy and trace depth. In the second stage, the bus tracer compresses the trace efficiently, thereby increasing the effective capacity of the on-chip storage. On the host, an analyzer tool decompresses the trace data for later observation and debugging.
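As a rough illustration of how a lossless tracer can shrink a cycle-level bus trace and how a host-side tool can restore it, the following Python sketch records only the signals that changed since the previous cycle. This change-only scheme is a generic stand-in chosen for illustration, not the compression hardware described in the thesis. The signal names (`addr`, `write`, `data`) and the functions `compress_trace` and `decompress_trace` are hypothetical.

```python
from typing import Dict, List, Tuple

Sample = Dict[str, int]            # signal name -> value in one bus cycle
Record = Tuple[int, Sample]        # (cycle number, signals that changed)


def compress_trace(samples: List[Sample]) -> List[Record]:
    """Store a record only for cycles in which at least one signal changed.

    Assumption: every sample carries the same set of signal names.
    """
    records: List[Record] = []
    prev: Sample = {}
    for cycle, sample in enumerate(samples):
        delta = {name: value for name, value in sample.items()
                 if prev.get(name) != value}
        if delta:
            records.append((cycle, delta))
        prev = sample
    return records


def decompress_trace(records: List[Record], n_cycles: int) -> List[Sample]:
    """Host-side step: replay the change records to rebuild every cycle."""
    samples: List[Sample] = []
    state: Sample = {}
    pending = iter(records)
    nxt = next(pending, None)
    for cycle in range(n_cycles):
        if nxt is not None and nxt[0] == cycle:
            state.update(nxt[1])
            nxt = next(pending, None)
        samples.append(dict(state))
    return samples


if __name__ == "__main__":
    # Illustrative five-cycle trace with two idle cycles.
    trace = [
        {"addr": 0x10, "write": 0, "data": 0},
        {"addr": 0x10, "write": 0, "data": 0},
        {"addr": 0x14, "write": 0, "data": 0},
        {"addr": 0x14, "write": 1, "data": 7},
        {"addr": 0x14, "write": 1, "data": 7},
    ]
    records = compress_trace(trace)
    print(len(trace), "cycles stored as", len(records), "records")
    print(decompress_trace(records, len(trace)) == trace)
```

Tracing fewer signals per sample (a coarser resolution) shrinks each record further, which is one way to picture the trade-off between trace accuracy and trace depth.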
目次 Table of Contents
Chapter 1. Introduction 1
1.1 Background 1
1.2 Motivations 3
1.3 Contributions 4
Chapter 2. Related Works 5
2.1 In-Circuit Emulation (ICE) 5
2.2 Program Address Trace Compression 7
Chapter 3. In-Circuit Emulation/Emulator 10
3.1 Fundamental IEEE 1149.1 JTAG Components 10
3.2 Extension to the JTAG components 11
3.3 Breakpoint Detection Unit (BDU) 13
3.4 ICE Interface to the Microprocessor Core 14
3.5 ICE Operation Modes 17
3.6 ICE Integration and Prototyping 18
3.7 Case Studies 22
3.8 Summary 25
3.9 Acknowledgement
Chapter 4. Program Address Trace Compression 26
4.1 Phase P1: Branch/Target Filtering 27
4.2 Phase P2: Branch/Target Address Encoding 29
4.3 Phase P3: Data Compression 36
4.4 Trace Decompression Method 38
4.5 Design Parameters and Configurations 40
4.6 Experiments and Analysis 41
4.7 Summary 48
Chapter 5. Bus Signal Tracing and Analyzing 49
5.1 Bus Tracer Architecture 49
5.2 Hardware Implementation and Verification 55
5.3 Experimental Results and Analysis 56
5.4 Summary 61
Chapter 6. Conclusion and Future Works 62
6.1 Future works 63
References 65
電子全文 Fulltext
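The electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China, and do not reproduce, distribute, adapt, repost, or broadcast it without authorization.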
論文使用權限 Thesis access permission:校內校外均不公開 not available
開放時間 Available:
校內 Campus:永不公開 not available
校外 Off-campus:永不公開 not available

紙本論文 Printed copies
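Availability information for printed copies is relatively complete from academic year 102 (ROC calendar) onward. To check the availability of printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.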
開放時間 Available:已公開 available
