國立中山大學 National Sun Yat-sen University, 學位論文 thesis/dissertation, 以類神經網路為架構之語音辨識系統 The Speech Recognition System using Neural Networks

論文名稱 Title	以類神經網路為架構之語音辨識系統 The Speech Recognition System using Neural Networks
系所名稱 Department	電機工程學系 Department of Electrical Engineering
畢業學年期 Year, semester	90 學年度第 2 學期 The spring semester of Academic Year 90 (ROC calendar, i.e. 2001-2002)	語文別 Language	中文 Chinese
學位類別 Degree	碩士 Master	頁數 Number of pages	78
研究生 Author	陳松琳 Sung-Lin Chen
指導教授 Advisor	陳遵立 Tzuen-Lih Chern
召集委員 Convenor	吳永春 Yung-Chun Wu
口試委員 Advisory Committee	吳英秦, 黃金請, 高一智 Yin-Chin Wu; Chin-Ching Huang; I-Chih Kao
口試日期 Date of Exam	2002-06-28	繳交日期 Date of Submission	2002-07-06
關鍵字 Keywords	倒傳遞演算法、類神經網路、語音辨識 neural network, speech recognition, backpropagation

中文摘要
本論文以倒傳遞類神經網路(BPNN : Backpropagation Neural Network)為架構設計一非特定語者之中文數字語音辨識系統，辨識率可達95%。當此系統應用於特定語者時，經適應修正後，更可使系統之辨識率高於99%。為能使系統轉移到數位處理器(DSP)平台，針對類神經網路模型，提出了神經元移除法則，利用此法則可減去約1/3數量之神經元，降低系統20%∼40%的記憶體需求，且系統之辨識率仍可達85%。在BPNN網路模型的輸出架構中，提出以二進位編碼之方式取代傳統一對一之架構，以增加系統可辨字彙之數量。對於語音訊號端點偵測，也提出另一有效之搜尋法則，不論雜訊干擾存在與否，不需複雜之運算，可有效定位出有聲段所在之處。
Abstract
This thesis describes an isolated-word, speaker-independent Mandarin digit speech recognition system based on a Backpropagation Neural Network (BPNN). The recognition rate reaches 95%. When the system is applied to a specific speaker and adapted to that speaker, the recognition rate exceeds 99%. To port the system to a Digital Signal Processor (DSP) platform, we propose a neuron-removal rule for the BPNN model. Applying the rule removes about 1/3 of the neurons and reduces the memory requirement by 20% to 40%, while the recognition rate still reaches 85%. For the output structure of the BPNN, we propose a binary-coded output in place of the conventional one-to-one structure, which increases the number of words the system can recognize. For endpoint detection of the recorded speech signal, we also propose an effective search rule that locates the voiced segment, with or without noise interference, and without complex computation.
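
The difference between the one-to-one output structure and the binary-coded output structure mentioned in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the function names, the most-significant-bit-first ordering, and the 0.5 decision threshold are assumptions made only for illustration. It shows that ten digits need ten output neurons under a one-to-one scheme but only four under a binary code.

```python
def bits_needed(num_words: int) -> int:
    """Output neurons needed to give each of num_words classes a distinct binary code."""
    return max(1, (num_words - 1).bit_length())


def one_to_one_target(index: int, num_words: int) -> list[float]:
    """One-to-one target: one output neuron per word, only the matching one is 1."""
    return [1.0 if i == index else 0.0 for i in range(num_words)]


def binary_target(index: int, num_bits: int) -> list[float]:
    """Binary-coded target, most significant bit first (ordering is an assumption)."""
    return [float((index >> (num_bits - 1 - b)) & 1) for b in range(num_bits)]


def decode_binary(outputs: list[float], threshold: float = 0.5) -> int:
    """Threshold each network output and read the bits back as a class index."""
    index = 0
    for y in outputs:
        index = (index << 1) | (1 if y >= threshold else 0)
    return index


if __name__ == "__main__":
    n = bits_needed(10)                 # ten Mandarin digits
    print(n)                            # number of binary output neurons
    print(one_to_one_target(5, 10))     # ten-element target for digit 5
    print(binary_target(5, n))          # four-element target for digit 5
    print(decode_binary([0.1, 0.9, 0.2, 0.8]))
```

With n output neurons a binary code can label up to 2^n words, whereas a one-to-one structure labels only n. One consequence of the binary scheme is that some decoded indices (10 to 15 in the four-bit case) correspond to no word and would need to be rejected by the caller.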

目次 Table of Contents
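Chapter 1 Introduction: 1.1 Preface; 1.2 Research Background; 1.3 Research Motivation and Objectives
Chapter 2 Analysis Frame Processing: 2.1 Introduction; 2.2 The Two Most Common Analysis Frames; 2.3 Fixed-Width and Variable-Width Analysis Frames
Chapter 3 Endpoint Detection Algorithms: 3.1 Introduction; 3.2 Parameters for Time-Domain Endpoint Detection; 3.3 Endpoint Detection Methods
Chapter 4 Feature Extraction: 4.1 Introduction; 4.2 Pre-emphasis Processing; 4.3 Cepstral Parameters; 4.4 Comparison of Spectral and Cepstral Parameters
Chapter 5 Dynamic Time Warping Algorithm: 5.1 Introduction; 5.2 Dynamic Time Warping; 5.3 Building the Speech Template Database
Chapter 6 Backpropagation Neural Network Model: 6.1 Introduction; 6.2 Multilayer Perceptron Architecture; 6.3 Backpropagation Algorithm; 6.4 Training Method for the Recognition System; 6.5 Neuron Removal Rule
Chapter 7 Experimental Methods: 7.1 Preface; 7.2 Speech Sample Database; 7.3 Experimental Design
Chapter 8 Experimental Results: 8.1 Comparison of Endpoint Detection Algorithms; 8.2 Selection of Feature Parameters; 8.3 Effect of the Learning Rate on Learning Performance; 8.4 Speaker-Independent Mandarin Digit Recognition System; 8.5 Design of the BPNN Output Structure; 8.6 Application of the Neuron Removal Rule; 8.7 Comparison of the DTW and BPNN Recognition Algorithms; 8.8 Effect of Applying the Speaker-Independent System to a Specific Speaker
Chapter 9 Conclusions and Future Work: 9.1 Conclusions; 9.2 Future Work
References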


電子全文 Fulltext
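This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
論文使用權限 Thesis access permission：校內校外完全公開 unrestricted
開放時間 Available：校內 Campus：已公開 available 校外 Off-campus：已公開 available
etd-0706102-135328.pdf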
紙本論文 Printed copies
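Public information on printed copies is relatively complete from Academic Year 102 onward. To look up printed-copy information for Academic Year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services.
開放時間 Available：已公開 available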
