Title page for etd-0217117-155910 (Master's/Doctoral Thesis Details)
Title
Knowledge Sharing Approaches Based on Reinforcement Learning for Distributed Agents System
Department
Year, semester
Language
Degree
Number of pages
62
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2017-03-10
Date of Submission
2017-03-17
Keywords
Ant colony algorithm, Distributed computing, Reinforcement learning, Knowledge sharing, Knowledge merging
Statistics
The thesis/dissertation has been browsed 5657 times and downloaded 716 times.
Abstract (Chinese, translated)
To eliminate the complicated and chaotic knowledge-exchange behavior that arises when a large group of learning agents shares experiences, and to let experience sharing quickly supply useful environmental information that supplements each agent's insufficient learning experience, this thesis proposes a cloud-based information-integration mechanism. Each learning agent communicates only with a cloud server, which removes the complex agent-to-agent knowledge exchange; the server collects the learning experiences of all agents, merges them, and shares the result with agents whose experience is lacking. When uploading its own experience, an agent applies the pheromone concept from the ant colony algorithm to assess the importance of that experience; the assessment becomes a weight that the server uses when merging the learning experiences of multiple agents. To accommodate the large volume of experience data, the cloud server adopts a distributed storage architecture, and the massive data are processed with the Apache Hadoop software framework, whose MapReduce processing model is a distributed computing architecture that handles large amounts of data quickly and effectively. Each learning agent then requests the merged experience from the cloud server and integrates it with its own, achieving the goal of experience sharing. Finally, the proposed method was implemented on a self-built small server, with a total of 360 learning agents simulated on multiple PCs, randomly distributed in and learning simultaneously from the same environment; the results show that the method effectively improves learning performance.
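The pheromone-based weighting and server-side merging described above can be sketched in a few lines. This is a minimal illustration, not the thesis's actual implementation: the evaporation/deposit parameters and the weighted-average merge rule are assumptions chosen to mirror the description (pheromone remaining along a visited-state trace becomes the weight used when combining agents' Q-tables).

```python
def pheromone_weight(trace, evaporation=0.1, deposit=1.0):
    """Accumulate pheromone along a visited-state trace with evaporation.

    The pheromone remaining at the end of the episode serves as the
    importance weight of the uploaded experience. Parameter values are
    hypothetical, not taken from the thesis.
    """
    pheromone = 0.0
    for _ in trace:
        # each visited state deposits pheromone; earlier deposits evaporate
        pheromone = (1.0 - evaporation) * pheromone + deposit
    return pheromone

def merge_q_tables(experiences):
    """Merge per-agent Q-tables as a pheromone-weighted average.

    experiences: list of (weight, q_table) pairs, where each q_table
    maps (state, action) -> Q value.
    """
    weighted_sum, total_weight = {}, {}
    for weight, q_table in experiences:
        for sa, q in q_table.items():
            weighted_sum[sa] = weighted_sum.get(sa, 0.0) + weight * q
            total_weight[sa] = total_weight.get(sa, 0.0) + weight
    return {sa: weighted_sum[sa] / total_weight[sa] for sa in weighted_sum}
```

An agent with a higher-pheromone trace thus contributes proportionally more to the merged Q value for each state-action pair.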
Abstract
In a multi-agent system with a tremendous number of agents sharing knowledge with one another, the exchange activity becomes too complicated to manage. This thesis proposes a method in which every agent communicates only with a server, alleviating the complexity of the experience-exchange activity. The server collects the learning knowledge uploaded by all agents, merges it, and shares the result with agents that lack similar experience. Each agent uses the pheromone mechanism of the ant colony algorithm to evaluate whether an experience is worth uploading to the server; the pheromone remaining along the trace of visited states becomes the weight with which the server combines the collected experiences. To deal with massive data processing, the thesis employs the open-source Apache Hadoop framework together with the MapReduce programming model. Agents then integrate the shared experience with their own knowledge, achieving knowledge sharing and significantly increasing learning efficiency. The proposed approach was implemented on a self-built server and personal computers; simulation results with 360 learning agents demonstrate its performance.
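The server-side merge follows the MapReduce pattern the abstract mentions. The sketch below imitates Hadoop's map/shuffle/reduce phases in plain Python for illustration only; the function names and the weighted-average reducer are assumptions, not code from the thesis or from Hadoop itself.

```python
from collections import defaultdict

def map_experience(weight, q_table):
    """Map phase: emit one ((state, action), (weight*Q, weight)) pair
    per entry of an agent's uploaded Q-table."""
    for (state, action), q in q_table.items():
        yield (state, action), (weight * q, weight)

def shuffle(pairs):
    """Shuffle phase: group intermediate values by key, as the
    Hadoop framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_q(values):
    """Reduce phase: weighted average of the Q values for one key."""
    weighted_sum = sum(wq for wq, _ in values)
    total_weight = sum(w for _, w in values)
    return weighted_sum / total_weight

def merge(agent_records):
    """Run the full pipeline over (weight, q_table) records."""
    pairs = [p for w, q in agent_records for p in map_experience(w, q)]
    return {key: reduce_q(vals) for key, vals in shuffle(pairs).items()}
```

Because the reduce step is per-key and associative, the merge parallelizes naturally across a Hadoop cluster when the number of agents and states grows large.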
Table of Contents
Thesis Certification i
Abstract (Chinese) iii
Abstract iv
List of Figures ix
List of Tables xi
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Thesis Organization 2
Chapter 2 Literature Review 3
2.1 Markov Decision Processes 3
2.1.1 Reinforcement Learning 4
2.1.2 Q-Learning 5
2.2 Ant Colony Algorithm 7
2.2.1 Ant Colony Algorithm 7
2.2.2 Pheromone Update in the Ant Colony Algorithm 8
2.3 Distributed Systems 10
2.3.1 Distributed Storage Systems 11
2.3.2 Distributed Computing 12
Chapter 3 Methodology 14
3.1 Building an Experience-Sharing Mechanism for Multiple Learning Agents 14
3.2 Designing a Weighting Function with Ant Colony Theory 16
3.2.1 Pheromone Mechanism 17
3.2.2 Weighting Function 18
3.3 Distributed System Application 20
3.3.1 Data Storage Structure 20
3.3.2 MapReduce Data Processing and Experience Merging 22
3.4 Overall Workflow and Algorithms 26
3.4.1 Individual Learning: Upload Mode 27
3.4.2 Experience Merging Mode 29
3.4.3 Individual Learning: Download Mode 30
Chapter 4 Simulation Experiments and Implementation Results 33
4.1 Maze Simulation Experiment 33
4.2 Implementation Results 41
Chapter 5 Conclusions and Future Work 47
5.1 Conclusions 47
5.2 Future Work 47
References 48
References
[1] A. V. Ivanov, and A. A. Petrovsky, “First-order Markov Property of The Auditory Spiking Neuron Model Response,” Signal Processing Conference, Florence, Italy, 4-8 Sept. 2006.
[2] K. Ito, Y. Imoto, H. Taguchi, and A. Gofuku, “A Study of Reinforcement Learning with Knowledge Sharing,” in Proc. of IEEE Int. Conf. on Robotics and Biomimetics, pp. 175-179, Hong Kong, China, 22-26 Aug. 2004.
[3] Z. Jin, W. Y. Liu, and J. Jin, “State-Clusters Shared Cooperative Multi-Agent Reinforcement Learning,” Asian Control Conference ASCC, pp. 129-135, 27-29 Aug. 2009.
[4] M. N. Ahmadabadi, and M. Asadpour, “Expertness Based Cooperative Q-Learning,” IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 32, no. 1, Feb. 2002.
[5] B. N. Araabi, S. Mastoureshgh, and M. N. Ahmadabadi, “A Study on Expertise of Agents and Its Effects on Cooperative Q-Learning,” IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 37, no. 2, pp. 1083-1094, Apr. 2007.
[6] A. Anuntapat, A. Thammano, and O. Wongwirat, “Searching Optimization Route by Using Pareto Solution with Ant Algorithm for Mobile Robot in Rough Terrain Environment,” Control, Automation, Robotics and Vision (ICARCV), International Conference, Phuket, Thailand, 13-15 Nov. 2016.
[7] J. Li, J. Cheng, Y. Zhao, F. Yang, Y. Huang, H. Chen, and R. Zhao, “A Comparison of General-Purpose Distributed Systems for Data Processing,” Big Data IEEE International Conference, pp. 378-383, Washington D.C., USA, 5-8 Dec. 2016.
[8] K. Ito, A. Gofuku, Y. Imoto, and M. Takeshita, “A study of reinforcement learning with knowledge sharing for distributed autonomous system,” Computational Intelligence in Robotics and Automation, Proceedings IEEE, pp. 1120-1125, Kobe, Japan, 16-20 July. 2003.
[9] J. Pinto, P. Jain, and T. Kumar, “Hadoop distributed computing clusters for fault prediction,” Computer Science and Engineering Conference ICSEC, Chiang Mai, Thailand, 14-17 Dec. 2016.
[10] T. Tateyama, S. Kawata, and Y. Shimomura, “Parallel Reinforcement Learning Systems using Exploration Agents and Dyna-Q Algorithm,” in Proc. SICE Annu. Conf., pp. 2774-2778, Takamatsu, Japan, 17-20 Sept. 2007.
[11] M. Hussin, Y. C. Lee, and A. Y. Zomaya, “Efficient Energy Management using Adaptive Reinforcement Learning-based Scheduling in Large-Scale Distributed Systems,” in International Conf. on Parallel Proc., pp. 385-393, Taipei City, Taiwan, 13-16 Sept. 2011.
[12] H. Karaoğuz, and H. Bozma, “Merging Appearance-Based Spatial Knowledge in Multirobot Systems,” Intelligent Robots and Systems (IROS), IEEE/RSJ International Conference, pp. 5107-5112, Daejeon, Korea, 9-14 Oct. 2016.
[13] K.S. Hwang, W. C. Jiang, and Y. J. Chen, “Model Learning and Knowledge Sharing for a Multiagent System with Dyna-Q Learning,” IEEE Transactions on Cybernetics, vol. 45, no. 5, pp. 964-976, May. 2015.
[14] K.S. Hwang, W. C. Jiang, Y. J. Chen, and W. H. Wang, “Reinforcement Learning with Model Sharing for Multi-Agent Systems,” System Science and Engineering ICSSE, pp. 293-296, Budapest, Hungary, 4-6 July. 2013.
[15] A. Lazarowska, “Parameters Influence on the Performance of an Ant Algorithm for Safe Ship Trajectory Planning,” Cybernetics (CYBCONF), IEEE International Conference, Gdynia, Poland, 24-26 June. 2015.
[16] X. Huang, H. Zhou, and W. Wu, “Hadoop Job Scheduling Based on Mixed Ant-Genetic Algorithm,” Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), International Conference, Xi'an, China, 17-19 Sept. 2015.
Fulltext
This electronic full text is licensed for personal, non-profit retrieval, reading, and printing for academic research purposes only. Please comply with the relevant provisions of the Copyright Act of the Republic of China (Taiwan); do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
Thesis access permission: unrestricted (fully open on and off campus)
Available:
Campus: available
Off-campus: available


Printed copies
Availability information for printed copies is relatively complete from academic year 102 onward. To check the availability of printed theses from academic year 101 or earlier, please contact the printed-thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available
