Thesis/dissertation etd-0131111-110417: detailed record
Title page for etd-0131111-110417
Title
設計與實作一個具安全性的Hadoop雲端計算結構
Design and implementation of a Hadoop-based secure cloud computing architecture
Department
Year, semester
Language
Degree
Number of pages
68
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2010-09-30
Date of Submission
2011-01-31
Keywords
IPSec, SOCKS Authorization, Architecture Design, Hadoop, cloud computing
Statistics
This thesis/dissertation has been browsed 5651 times and downloaded 63 times.
Abstract
The goal of this research is to design and implement a secure Hadoop cluster. Cloud computing is a form of network computing in which most data is transmitted over the network. To build a secure cloud architecture, we must first authenticate users and then guarantee that data in transit cannot be stolen or tampered with, or that, even if it is stolen, its actual content remains hard to recover. We therefore focus on the following points:
I. Authorization: We first investigate the user authorization problem in Hadoop and then propose two solutions: SOCKS Authorization and Service Level Authorization. SOCKS Authorization manages authorization between the Hadoop cluster and external services and uses a username/password to identify users. Service Level Authorization is an authorization mechanism introduced in Hadoop 0.20. It ensures that clients connecting to a particular Hadoop service have the necessary, pre-configured permissions and are authorized to access that service.
II. Transmission Encryption: We examine actual Hadoop transmissions and identify possible security problems, so that important data such as the Block ID, Job ID, and username is not exposed on untrusted networks. We then use IPSec to implement transmission encryption and packet verification for Hadoop.
III. Architecture Design: Based on the two points above and on Hadoop's design philosophy, we propose a secure Hadoop cluster architecture that addresses the security problems currently present in Hadoop. Finally, we evaluate the performance of HDFS and MapReduce in this architecture.
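The abstract does not say which SOCKS version or wire format the thesis uses. As a minimal sketch, assuming SOCKS5 (RFC 1928) with the username/password method of RFC 1929, the credential exchange that identifies a user could be framed as follows. All function names and the in-memory `users` table are hypothetical and are not taken from the thesis.

```python
import hmac
from typing import Dict, Tuple

SOCKS_VERSION = 0x05     # SOCKS5 protocol version (RFC 1928)
METHOD_USERPASS = 0x02   # "username/password" method identifier (RFC 1929)
AUTH_VERSION = 0x01      # version of the username/password subnegotiation


def build_greeting() -> bytes:
    """Client greeting that offers only username/password authentication."""
    return bytes([SOCKS_VERSION, 1, METHOD_USERPASS])


def build_auth_request(username: str, password: str) -> bytes:
    """Client request: VER | ULEN | UNAME | PLEN | PASSWD."""
    user = username.encode("utf-8")
    pwd = password.encode("utf-8")
    if not (1 <= len(user) <= 255 and 1 <= len(pwd) <= 255):
        raise ValueError("username and password must each be 1..255 bytes")
    return bytes([AUTH_VERSION, len(user)]) + user + bytes([len(pwd)]) + pwd


def parse_auth_request(data: bytes) -> Tuple[str, str]:
    """Server side: decode a username/password request."""
    if len(data) < 3 or data[0] != AUTH_VERSION:
        raise ValueError("malformed or unsupported auth request")
    ulen = data[1]
    if len(data) < 3 + ulen:
        raise ValueError("truncated username")
    user = data[2:2 + ulen]
    plen = data[2 + ulen]
    pwd = data[3 + ulen:3 + ulen + plen]
    if len(pwd) != plen:
        raise ValueError("truncated password")
    return user.decode("utf-8"), pwd.decode("utf-8")


def check_credentials(users: Dict[str, str], username: str, password: str) -> bool:
    """Hypothetical lookup against an in-memory table (illustration only)."""
    expected = users.get(username)
    if expected is None:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), password.encode("utf-8"))


def build_auth_reply(ok: bool) -> bytes:
    """Server reply: VER | STATUS, where STATUS 0x00 means success."""
    return bytes([AUTH_VERSION, 0x00 if ok else 0x01])
```

In this method the username and password cross the wire in cleartext, which is one reason to pair such a proxy with transport protection such as the IPSec encryption described in point II.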
Table of Contents
Acknowledgements
Chinese Abstract
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Thesis Organization
Chapter 2 Related Technologies
2.1 Introduction to Cloud Computing
2.2 Hadoop
2.2.1 Features
2.2.2 Hadoop common
2.2.3 HDFS
2.2.3.1 Overview
2.2.3.2 Design Goals
2.2.3.3 Operating Principles
2.2.4 MapReduce
2.2.4.1 Overview
2.2.4.2 Job Execution Flow
2.3 SOCKS Protocol
2.4 IPSec
2.4.1 Operation Flow
2.4.2 Modes of Operation
2.4.3 Key Management
Chapter 3 Problem Analysis and Related Work
3.1 Security Problem Analysis
3.1.1 Analysis of Hadoop User Authentication
3.1.2 Analysis of Hadoop Connections and Packets
3.1.3 Summary of Security Problems
3.2 Analysis and Discussion of Related Work
3.2.1 Through a Gateway
3.2.2 HDFS Proxy
3.2.3 Summary of Related Work
Chapter 4 Research Methods and Results
4.1 Solutions
4.1.1 SOCKS Authorization
4.1.2 Service Level Authorization
4.1.3 IPSec Transmission Encryption
4.2 System Architecture Design and Implementation
4.2.1 Experimental Environment
4.2.2 Requirements and Constraints
4.2.3 System Architecture
4.2.3.1 Hadoop Security Architecture
4.2.3.2 Overall Cloud Architecture
4.2.4 Experimental Results and Performance Analysis
4.2.4.1 Experimental Results
4.2.4.2 Performance Analysis
4.2.4.3 Other Related Tests
Chapter 5 Conclusion
Chapter 6 References
References
[1] Peter Mell, Tim Grance, 2009, The NIST Definition of Cloud Computing , National
Institute of Standards and Technology, Information Technology Laboratory.
[2] Yao-Tsung Wang, 2009, The Trend of Cloud Computing, CloudIntro.pdf, NCHC
Cloud Computing Research Group.
[3] Yao-Tsung Wang, 2009, The Trend of Cloud Computing, HadoopIntro.pdf, NCHC
Cloud Computing Research Group.
[4] Tom White, 2009, Hadoop: The Definitive Guide, p.4, O'Reilly.
[5] Aaron Kimball, 2009, The Project Split, projsplit2.pdf, Cloudera.
[6] Yao-Tsung Wang, 2009, The Trend of Cloud Computing, HadoopIntro.pdf, NCHC
Cloud Computing Research Group.
[7] Welcome to Hadoop Common!, 2010, Welcome to Hadoop Common!, The Apache
Software Foundation, Available at: http://hadoop.apache.org/common/, Accessed 7
September 2010.
[8] HDFS Architecture Guide, 2010, HDFS Architecture Guide, The Apache Software
Foundation, Available at: http://hadoop.apache.org/hdfs/docs/r0.21.0/hdfs_design.html,
Accessed 8 September 2010.
[9] Tom White, 2009, Hadoop: The Definitive Guide, p.42-43, O'Reilly.
[10] Tom White, 2009, Hadoop: The Definitive Guide, p.63-64, O'Reilly.
[11] Tom White, 2009, Hadoop: The Definitive Guide, p.66-67, O'Reilly.
[12] MapReduce Tutorial, 2010, MapReduce Tutorial, The Apache Software Foundation,
Available at: http://hadoop.apache.org/mapreduce/docs/r0.21.0/mapred_tutorial.html,
Accessed 9 September 2010.
[13] Tom White, 2009, Hadoop: The Definitive Guide, p.153-155, O'Reilly.
[14] Wikipedia, 2010, SOCKS, Wikipedia the free encyclopedia, Available at:
http://hadoop.apache.org/mapreduce/docs/r0.21.0/mapred_tutorial.html, Accessed 13 September 2010.
[15] B. Scott Wilson, What is SOCKS?, what_is_socks.pdf, IBM Global Services, Network
Services.
[16] William Stallings, 2005, Cryptography and Network Security: Principles and Practice
4/e, chap16, Prentice Hall.
[17] 陳勇勳, 2009, 更安全的Linux 網絡, p.353, 電子工業出版社。
[18] Owen O’Malley, Kan Zhang, Sanjay Radia, Ram Marti, and Christopher Harrell, 2009,
Hadoop Security Design, security-design.pdf, Yahoo!
[19] Aaron Kimball, 2008, Securing a Hadoop Cluster Through a Gateway, Cloudera,
Available at: http://www.cloudera.com/blog/2008/12/securing-a-hadoop-cluster-through-a-gateway/,
Accessed 16 September 2010.
[20] The Apache Software Foundation, 2009, HDFS Proxy Guide, hdfsproxy.pdf, The
Apache Software Foundation.
[21] Service Level Authorization Guide, 2010, Service Level Authorization Guide, The
Apache Software Foundation, Available at:
http://hadoop.apache.org/common/docs/r0.21.0/service_level_auth.html, Accessed 18 September 2010.
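Fulltext
This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid breaking the law.
Thesis access permission: available on campus, never available off campus (restricted)
Available:
Campus: available
Off-campus: not available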

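Printed copies
Public-access information for printed theses is relatively complete from academic year 102 (ROC calendar) onward. To look up public-access information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available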
