Title page for etd-0708111-170155
Title
執行於分散式檔案系統上之關聯式資料庫的效能分析
Performance Analysis of Relational Database over Distributed File Systems
Department
Year, semester
Language
Degree
Number of pages
73
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2011-05-27
Date of Submission
2011-07-08
Keywords
distributed file system, cloud computing, relational database, interface, performance analysis
Statistics
This thesis/dissertation has been browsed 5625 times and downloaded 0 times.
Abstract
As the Internet has grown, people have come to depend on it more and more, and many desktop applications have moved to network-based environments: word processing, calendars, photo management, and even application development can now be done online. Google is one company that provides such web services. Its search engine and, later, its e-mail service attract users with short response times and large storage capacity, and it earns revenue by charging other businesses to place advertisements. Facebook, another well-known large site, processes an enormous volume of social messages in real time and recommends connections between users. Data collection and computation on this scale rely on cloud computing.
Cloud computing uses distributed storage and distributed processing to provide large storage capacity and fast data analysis and response. Because the technology is new, the applications seen so far are limited. Cloud distributed file systems suit batch data processing and write-once-read-many applications such as YouTube, string search and analysis, and log file analysis. The Hadoop Distributed File System (HDFS) is a well-known example. Most existing applications, however, are still built on traditional relational databases, so porting them to a cloud platform requires running a relational database over HDFS. This thesis therefore examines database services on a cloud distributed file system and analyzes the performance of a relational database that uses FUSE-DFS, an interface that mounts HDFS so that it can be used like a local file system. If FUSE-DFS can be tuned or modified so that the relational database performs well enough for applications, many traditional database applications could be migrated to cloud platforms with little overhead, which would help promote the adoption of cloud platforms.
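The contrast the abstract draws between write-once-read-many workloads and relational database workloads can be made concrete with a small probe. The following is a minimal sketch, not the thesis's benchmark: the function name `time_block_writes`, the file name `probe.dat`, and the block counts are all illustrative assumptions. It times synced block writes in sequential order and in shuffled order (roughly the in-place update pattern a database produces). It runs against a temporary local directory; pointing it at a mount point is left to the reader, and on a file system designed for write-once-read-many access the shuffled variant may be slow or rejected outright.

```python
import os
import random
import tempfile
import time


def time_block_writes(directory, n_blocks=64, block_size=4096,
                      random_order=False, seed=0):
    """Write n_blocks blocks into one file under `directory`, calling fsync
    after each write. Returns (elapsed_seconds, final_file_size)."""
    offsets = [i * block_size for i in range(n_blocks)]
    if random_order:
        # Shuffled offsets approximate database-style in-place updates.
        random.Random(seed).shuffle(offsets)
    payload = b"x" * block_size
    path = os.path.join(directory, "probe.dat")  # hypothetical file name

    # Preallocate so that every offset lands inside an existing file.
    with open(path, "wb") as f:
        f.truncate(n_blocks * block_size)

    start = time.perf_counter()
    with open(path, "r+b") as f:
        for off in offsets:
            f.seek(off)
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # force each write through to storage
    elapsed = time.perf_counter() - start

    size = os.path.getsize(path)
    os.remove(path)
    return elapsed, size


if __name__ == "__main__":
    # A temporary local directory stands in for the file system under test.
    with tempfile.TemporaryDirectory() as d:
        seq, _ = time_block_writes(d)
        rnd, _ = time_block_writes(d, random_order=True)
        print(f"sequential: {seq:.4f}s  random: {rnd:.4f}s")
```

Syncing after every write is a deliberately pessimistic choice: it mimics a database that must make each update durable, which is where a layered interface is most likely to add visible latency.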
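Table of Contents
Acknowledgements
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
1. Introduction
1.1. Research Background
1.2. Research Motivation and Objectives
1.3. Research Methods and Procedure
1.3.1. Research Methods
1.3.2. Research Procedure
1.4. Chapter Overview
2. Introduction to File Systems
2.1. Overview of Cloud Computing
2.2. Overview of Hadoop
2.3. Introduction to HDFS
2.3.1. Design of HDFS
2.3.2. HDFS Concepts
2.3.3. HDFS File Types
2.4. Introduction to the Lustre File System
2.5. Overview of the Network File System (NFS)
2.6. Overview of the Global File System (GFS)
3. FUSE-DFS
3.1. Overview of FUSE
3.2. libhdfs library
3.3. Introduction to FUSE-DFS
4. System Design and Implementation
4.1. Software Overview
4.2. System Architecture Design
4.3. Implementation Details
4.3.1. Hadoop Cluster Environment Parameter Settings
4.3.2. Communication between FUSE-DFS and HDFS
4.3.3. Technical Bottlenecks of Running MySQL on FUSE-DFS
4.3.4. Solutions for Running MySQL on FUSE-DFS
4.3.5. Implementation of MySQL on FUSE-DFS
4.3.6. Performance Analysis and Tuning of MySQL on FUSE-DFS
5. Performance Analysis of MySQL on Distributed File Systems
6. Conclusion and Future Work
References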
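Fulltext
The electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for academic research. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
Thesis access permission: not available on campus or off campus
Available:
Campus: never available
Off-campus: never available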


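Printed copies
Public-access information for printed theses is relatively complete from academic year 102 (ROC calendar) onward. For printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: publicly available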
