Thesis/Dissertation etd-1116112-142818: Detailed Information
Title page for etd-1116112-142818
論文名稱
Title
基於 ECL 之技術改善雲端抹除碼儲存系統效能
Performance Enhancement of the Erasure-Coded Storage Systems in Cloud Using the ECL-based Technique
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
71
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2012-11-01
繳交日期
Date of Submission
2012-11-16
關鍵字
Keywords
網絡頻寬、小塊寫問題、E-MBR編碼、奇偶更新延遲、雲存儲、數據恢復
network bandwidth, small-write problem, E-MBR codes, delayed parity update, data recovery, cloud storage
統計
Statistics
本論文已被瀏覽 5729 次,被下載 321
This thesis/dissertation has been viewed 5729 times and downloaded 321 times.
中文摘要
雖然 erasure codes 被廣泛的使用在儲存系統來提高容錯性,但是它卻存在著小塊寫問題。許多演算法被提出來改善RAID系統中的小塊寫效能,但是它們並不需要考慮網路頻寬使用量。然而,在雲端分散式儲存系統中,網路頻寬卻是相當珍貴的資源。在本篇論文中,我們提出一個基於E-MBR codes、Caching以及Logging方法 (ECL-based technique) 來改善小塊寫效能並且不使用額外的網路頻寬。除了改善小塊寫效能之外,相對於現存的演算法,ECL-based 技術也減少了奇偶更新和資料復原的延遲。
Abstract
Although erasure codes are widely adopted in storage systems to improve fault tolerance, they suffer from a serious small-write problem. Many algorithms have been proposed to improve small-write performance in RAID systems, but they do not need to consider network bandwidth usage. In distributed cloud storage systems, however, network bandwidth is a precious resource. In this thesis, we propose an ECL-based technique (based on E-MBR codes, caching, and logging) that improves small-write performance without using extra network bandwidth. In addition, the ECL-based technique reduces the latency of delayed parity updates and data recovery compared with existing algorithms.
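For readers unfamiliar with the small-write problem mentioned in the abstract, the following is a minimal sketch of the classic read-modify-write parity update in a single-parity (RAID-5-style) XOR stripe. It is not the ECL-based technique of the thesis; the function names, the stripe layout, and the I/O count are illustrative assumptions. It shows only why updating one small block costs two reads and two writes.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(blocks: list[bytes]) -> bytes:
    """Parity of a whole stripe: XOR of all data blocks."""
    return reduce(xor_blocks, blocks)


def small_write(stripe: list[bytes], parity: bytes, index: int,
                new_block: bytes) -> tuple[bytes, int]:
    """Hypothetical read-modify-write update of one data block.

    Returns the new parity and the number of block I/Os incurred
    (illustrative count: 2 reads + 2 writes).
    """
    old_block = stripe[index]                 # read 1: old data block
    old_parity = parity                       # read 2: old parity block
    delta = xor_blocks(old_block, new_block)  # what changed in the data
    new_parity = xor_blocks(old_parity, delta)
    stripe[index] = new_block                 # write 1: new data block
    return new_parity, 4                      # write 2: new parity block


stripe = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = full_parity(stripe)
parity, ios = small_write(stripe, parity, 1, b"\xff\x00")
# The incrementally updated parity matches a full recomputation.
assert parity == full_parity(stripe)
```

In a networked store, each of those four block accesses may also cross the network, which is why the abstract treats bandwidth as the limiting resource.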
目次 Table of Contents
1. INTRODUCTION 1
1.1 Background 1
1.2 Motivation 4
1.3 Contributions 5
1.4 Organization of the Thesis 5
2. RELATED WORK 7
2.1 Erasure-Coded Storage Systems 7
2.2 Regenerating Codes 7
2.3 Technologies for Solving Small-Write Problem 9
3. THE PROPOSED METHOD 14
3.1 Problem Definition 14
3.2 System Architecture 15
3.3 Process Flow of Write Request 17
3.3.1 Write Request without Prior XOR 19
3.3.2 Write Request with Prior XOR 21
3.4 Cache Management 23
3.4.1 Cache Operation in WRwithoutPXOR 24
3.4.2 Cache Operation in WRwithPXOR 27
3.5 Log Management 30
3.6 Delayed Parity Update 31
3.7 Data Recovery 34
3.8 Data Integrity 35
4. ANALYSIS 37
4.1 Average Response Time 37
4.2 Delayed Parity Update Latency 40
4.3 Data Recovery Latency 42
4.4 System Reliability 45
5. SIMULATION 48
5.1 Simulation Environment 48
5.2 Simulation Results 48
5.2.1 Average Response Time 49
5.2.2 Delayed Parity Update Latency 51
5.2.3 Data Recovery Latency 52
5.2.4 Network Bandwidth Usage 54
6. CONCLUSIONS 57
REFERENCES 58
電子全文 Fulltext
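This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.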
論文使用權限 Thesis access permission: 自定論文開放時間 user-defined release time
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
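Public availability information for printed theses is relatively complete from academic year 102 (ROC calendar) onward. To inquire about the availability of printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.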
開放時間 Available: 已公開 available
