Thesis etd-0910107-111129: Detailed Record



Name: Shan-cheng Pei (裴善成)    E-mail: not disclosed
Department: Electrical Engineering (電機工程學系研究所)
Degree: Master    Graduation term: 2nd semester of ROC academic year 95 (spring 2007)
Title (Chinese): 應用強化式學習建構模糊類神經控制系統
Title (English): Constructing Neuro-Fuzzy Control Systems Based on Reinforcement Learning Scheme
Files
  • etd-0910107-111129.pdf
  • This electronic full text is licensed only for personal, non-profit retrieval, reading, and printing for the purpose of academic research. Please observe the relevant provisions of the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization.

    Access rights

    Electronic thesis: fully open access, on and off campus

    Language / pages: Chinese / 43
    Statistics: this thesis has been viewed 5105 times and downloaded 4356 times
    Abstract (Chinese): Early fuzzy controllers relied on expert knowledge to build the rule base; they could not be trained from input-output data because the response of the controlled plant is delayed. This thesis proposes a new design method that constructs a fuzzy controller with a reinforcement-learning algorithm, so as to discover, from the delayed responses, the sequence of control signals that best achieves the control goal. The system uses a time-delay neural network to predict the likely control effect, and new fuzzy rules are added as learning proceeds. The Q-learning network and the fuzzy-controller network are both tuned by gradient descent. Experimental results show that the resulting fuzzy rules perform control effectively.
    Abstract (English): Traditionally, the fuzzy rules for a fuzzy controller are provided by experts. They cannot be trained from a set of input-output training examples because the correct response of the plant being controlled is delayed and cannot be obtained immediately. In this thesis, we propose a novel approach to construct fuzzy rules for a fuzzy controller based on reinforcement learning. Our task is to learn from the delayed reward to choose sequences of actions that result in the best control. A neural network with delays is used to model the evaluation function Q. Fuzzy rules are constructed and added as the learning proceeds. Both the weights of the Q-learning network and the parameters of the fuzzy rules are tuned by gradient descent. Experimental results have shown that the fuzzy rules obtained perform effectively for control.
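The learning scheme the abstract describes (an evaluation function Q approximated by a differentiable model, actions selected greedily, and the parameters tuned by gradient descent on the temporal-difference error) can be sketched in miniature. The toy plant, the linear feature map, and all hyperparameters below are illustrative assumptions, not the FIS/QN architecture of the thesis; in particular, this toy returns an immediate reward, whereas the thesis handles delayed plant responses with a time-delay network.

```python
import random

# Minimal TD/Q-learning sketch (hypothetical toy setup): Q(s, a) is a
# differentiable model whose parameters are tuned by gradient descent on
# the temporal-difference error.

ACTIONS = [-1.0, 0.0, 1.0]          # candidate control signals
ALPHA, GAMMA, EPS = 0.05, 0.9, 0.2  # learning rate, discount, exploration

# Q(s, a) = w0 + w1*s + w2*a + w3*s*a  (linear-in-parameters approximator)
w = [0.0, 0.0, 0.0, 0.0]

def features(s, a):
    return [1.0, s, a, s * a]

def q(s, a):
    return sum(wi * fi for wi, fi in zip(w, features(s, a)))

def step(s, a):
    """Toy plant: the control signal nudges the state; the reward
    penalises distance from the set point 0."""
    s_next = 0.5 * s + 0.5 * a
    return s_next, -abs(s_next)

random.seed(0)
for episode in range(300):
    s = random.uniform(-2.0, 2.0)
    for _ in range(20):
        # epsilon-greedy action selection
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: q(s, x))
        s_next, r = step(s, a)
        # TD error: r + gamma * max_a' Q(s', a') - Q(s, a)
        target = r + GAMMA * max(q(s_next, x) for x in ACTIONS)
        delta = target - q(s, a)
        # gradient of Q w.r.t. w_i is just feature_i for a linear model
        for i, fi in enumerate(features(s, a)):
            w[i] += ALPHA * delta * fi
        s = s_next

# Roll out the learned greedy policy from s = 2.0.
s = 2.0
for _ in range(10):
    a = max(ACTIONS, key=lambda x: q(s, x))
    s, _ = step(s, a)
print("final |s| =", round(abs(s), 3))
```

The sketch only shows the TD update itself; reproducing the thesis's setup would require a richer state, a fuzzy inference system in place of the linear feature map, and a time-delay network for the evaluation function.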
    Keywords (Chinese)
  • 模糊類神經 (neuro-fuzzy)
  • 強化式學習 (reinforcement learning)
  • 控制系統 (control system)
    Keywords (English)
  • neuro-fuzzy
  • reinforcement learning
  • control system
    Table of Contents
    Abstract (Chinese)
    Abstract (English)
    Contents
    Chapter 1  Introduction
    Chapter 2  Literature Review
      2.1  Reinforcement Learning
      2.2  Neuro-Fuzzy Systems
      2.3  Applying Reinforcement Learning to Control Systems
    Chapter 3  Methodology
      3.1  System Architecture
        3.1.1  FIS
        3.1.2  QN
        3.1.3  SM
        3.1.4  QE
      3.2  Training Methods
        3.2.1  Training the Value Network
        3.2.2  Training the Controller
        3.2.3  Adjusting the Number of Rules
      3.3  Initialization
      3.4  System Flow
    Chapter 4  Experiments and Discussion
      4.1  Linear System
      4.2  Nonlinear System
      4.3  Discussion of Experiments
    Chapter 5  Conclusions and Future Work
      5.1  Conclusions
      5.2  Future Work
    Chapter 6  References
    Oral Defense Committee
  • 謝朝和 - Convener
  • 吳志宏 - Member
  • 歐陽振森 - Member
  • 潘欣泰 - Member
  • 李錫智 - Advisor

    Defense date: 2007-07-26    Submission date: 2007-09-10



    For any questions, please contact the thesis review team.