博碩士論文 etd-0402118-084552 詳細資訊
Title page for etd-0402118-084552
論文名稱
Title
即時偵測及修復微處理器之控制流錯誤的機制
Critical Signature Assertion and On-the-Fly Recovery for Control Flow Errors in Processors
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
89
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2018-01-19
繳交日期
Date of Submission
2018-05-02
關鍵字
Keywords
控制流錯誤、微處理器、容錯、簽章、檢查點和恢復、資料錯誤
Microprocessor, Fault-Tolerant, Signature, Checkpoint and Recovery, Data Error, Control Flow Error
統計
Statistics
本論文已被瀏覽 5708 次,被下載 16
This thesis/dissertation has been viewed 5708 times and downloaded 16 times.
中文摘要
對於即時系統來說,為了增加其系統的可靠度,短暫性錯誤的偵測及回復則成為一個重要的議題。在過去的所提出的方法中,為了偵測到控制程序錯誤,且回復控制程序錯誤所導致的資料錯誤,會在每個basic block檢查簽章,並設立還原點將資料做一次完整的備份。然而,頻繁的檢查並將資料完整地備份,這會迫使效能下降。另一方面,在發生錯誤後,也需花額外的時間,將系統返回到上一個還原點,並將備份的資料從記憶體中讀出,取代資料錯誤的暫存器。
因此,我們提出了一個新的方法,主要可分為三大方向。第一,透過事前分析,我們只需要針對有比較高機率會發生控制程序錯誤的basic block做關鍵簽章插入法,並設立還原點;第二,在設立還原點並備份暫存器資料時,只備份可能會因為控制程序錯誤而導致資料錯誤的暫存器;第三,當錯誤發生後,且在下次寫入新的值取代資料錯誤的暫存器之前,直接從checkpoint storage中讀取資料來運算。從實驗結果得知,此方法需額外加入的指令數較少而減少記憶體使用率最多可少2.4倍,且因為設立的還原點較少可大幅的提升執行效能最快可達11.9倍,而修正錯誤的所需的時間較短最多可節省20 cycle times,但偵測到的錯誤覆蓋率最差約少1.5%。
Abstract
For real-time systems, detecting and recovering from transient faults has become an important issue for improving reliability. To detect a control flow error (CFE) and recover from the data errors it causes, previous works check a signature in every basic block and set a checkpoint there that makes a complete backup of the data. However, such frequent checks and complete backups degrade performance. In addition, after an error occurs, extra time is needed to roll the system back to the last checkpoint and to read the backed-up data from memory to replace the corrupted registers.
The proposed technique is built on three main ideas. First, using the results of a prior sensitivity analysis, it inserts critical signature assertions and sets checkpoints only in the basic blocks that have a higher probability of a CFE. Second, in the checkpoint phase, it backs up only the registers whose data might be corrupted by a CFE. Third, in the recovery phase, after an error occurs and until a new value is written to a corrupted register, it reads the checkpointed value directly from checkpoint storage for execution. Experimental results show that the proposed technique adds fewer instructions, reducing memory overhead by up to 2.4 times, and, because it sets fewer checkpoints, reduces performance overhead by up to 11.9 times. It also shortens error correction latency by up to 20 cycles, at the cost of fault coverage that is about 1.5% lower in the worst case.
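The three ideas in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware architecture described in the thesis: the names `select_critical_blocks`, `RegisterFile`, `take_checkpoint`, `on_error_detected`, the threshold rule, and the sample sensitivity values are all hypothetical and chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def select_critical_blocks(sensitivity: dict[str, float], threshold: float) -> list[str]:
    """Idea 1 (assumed rule): keep only blocks whose estimated CFE probability
    meets a threshold; only these get a signature assertion and a checkpoint."""
    return sorted(block for block, p in sensitivity.items() if p >= threshold)


@dataclass
class RegisterFile:
    """Toy register file with a partial checkpoint and on-the-fly recovery."""

    regs: dict[str, int]
    checkpoint: dict[str, int] = field(default_factory=dict)
    redirected: set[str] = field(default_factory=set)

    def take_checkpoint(self, at_risk: set[str]) -> None:
        # Modeling assumption: settle registers that are still redirected
        # before the old checkpoint storage is overwritten.
        for reg in self.redirected:
            self.regs[reg] = self.checkpoint[reg]
        self.redirected.clear()
        # Idea 2: back up only the registers a CFE might corrupt.
        self.checkpoint = {reg: self.regs[reg] for reg in at_risk}

    def on_error_detected(self) -> None:
        # Idea 3: no copy-back loop; just mark the backed-up registers so
        # that later reads are served from checkpoint storage.
        self.redirected = set(self.checkpoint)

    def read(self, reg: str) -> int:
        if reg in self.redirected:
            return self.checkpoint[reg]
        return self.regs[reg]

    def write(self, reg: str, value: int) -> None:
        # A fresh write makes the register valid again, ending redirection.
        self.regs[reg] = value
        self.redirected.discard(reg)


# Hypothetical per-block sensitivity values and register contents.
critical = select_critical_blocks({"B0": 0.02, "B1": 0.31, "B2": 0.27}, threshold=0.25)
rf = RegisterFile(regs={"r0": 1, "r1": 2, "r2": 3})
rf.take_checkpoint({"r1", "r2"})
rf.regs["r1"] = 99                 # simulate a CFE corrupting r1
rf.on_error_detected()
value_after_error = rf.read("r1")  # served from checkpoint storage
rf.write("r1", 7)
value_after_write = rf.read("r1")  # served from the register again
```

In this model, recovery costs no extra copy from checkpoint storage back into registers: a corrupted register is simply bypassed until the program overwrites it, which is the intuition behind the shorter error correction latency reported above.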
目次 Table of Contents
Thesis Approval Certificate .... i
Thesis Public Authorization Form .... ii
Thesis Declaration .... iii
Acknowledgements .... iv
Chinese Abstract .... vi
Abstract .... vii
Table of Contents .... viii
List of Figures .... x
List of Tables .... xiii
Chapter 1. Introduction .... 1
1.1 Background .... 1
1.2 Motivation .... 3
1.3 Organization of the Thesis .... 5
Chapter 2. Related Works .... 6
2.1 Fault Model of Control Flow Errors .... 6
2.2 Fault-Tolerant Approaches Overview .... 7
2.3 Drawback of Previous Works .... 11
Chapter 3. Overview of Control Flow Error Recovery Technique .... 20
3.1 Critical Signature/Checkpoint/Recovery Assertion Technique .... 22
3.1.1. Sensitivity Analysis .... 22
3.1.2. Critical Signature/Checkpoint/Recovery Assigner .... 25
3.1.3. Convergence Testing .... 28
3.2 On-the-fly Checkpoint and Recovery Technique .... 30
3.2.1. Performance Issue of Reducing Register Checkpoint and Recovery Policy .... 30
3.2.2. On-the-fly Hardware Architecture .... 32
Chapter 4. Experiment and Results .... 39
4.1 Experiment Set Up .... 39
4.2 Experiment Results .... 43
Chapter 5. Conclusion and Future Work .... 57
References .... 60
Appendix A. Critical Signature Assertion with Pure Software-based Checkpoint and Recovery Technique .... 63
Appendix B. Introduction of an IDE Environment .... 66
Appendix C. Implementation of CEDA .... 70
電子全文 Fulltext
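This electronic full text is licensed to users solely for academic research, for personal, non-profit searching, reading, and printing. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.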
論文使用權限 Thesis access permission: 自定論文開放時間 user-defined release date
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
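Availability information for printed theses is relatively complete from academic year 102 onward. To look up availability information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.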
開放時間 available 已公開 available
