博碩士論文 etd-0811103-125211 詳細資訊
Title page for etd-0811103-125211
論文名稱
Title
中文人名語音辨識系統之設計研究
A Design of Speech Recognition System for Chinese Names
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
67
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2003-07-25
繳交日期
Date of Submission
2003-08-11
關鍵字
Keywords
端點偵測、語詞辨識、隱藏式馬可夫模型、梅爾倒頻譜
Mel-cepstrum, hidden Markov model, endpoint detection, phrase recognition
統計
Statistics
本論文已被瀏覽 5646 次,被下載 0
The thesis/dissertation has been browsed 5646 times, has been downloaded 0 times.
中文摘要
中文人名有著在語料的第一個字必為姓氏的特性,所以語音辨識系統在辨識中文人名時,可以利用先辨識姓氏來達到系統分類的目的,如此便可增進系統的效能。

本研究以隱藏式馬可夫模型(hidden Markov model, HMM)為基礎,並一併比較其與分段式機率模型(segmental probability model, SPM)在單音辨識上的優劣。隱藏式馬可夫模型目前廣泛應用在語音辨識,其利用雙重的隨機程序,以狀態(state)的轉移來描述語音產生的方式,來對應語音模型的時變特性。而分段式機率模型是配合單純的中文單音結構,利用其強制分段,不找出系統對應最佳狀態轉移過程,進而達到簡化系統的目的。

本研究所採用的語音資料庫為麥克風中文語料,是在實驗室一般的環境下錄製而成,以用來實現不特定語者(speaker-independent)的中文人名語詞辨識系統。
Abstract
This thesis presents the design of a speech recognition system for Chinese names. Because the first character of a Chinese name is always the surname, the system recognizes the surname first and uses it to classify the candidate names, which can improve both classification accuracy and computation time.
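
The surname-first idea can be pictured as a two-stage search. The Python sketch below is only an illustration under stated assumptions, not the thesis's implementation: the function names, the pinyin placeholder syllables, the scores, and the `top_k` shortlist size are all hypothetical, and the per-syllable scores are assumed to come from some acoustic model (higher is better).

```python
from collections import defaultdict

def build_surname_index(names):
    """Group names (tuples of syllables) by their first syllable, the surname."""
    index = defaultdict(list)
    for name in names:
        index[name[0]].append(name)
    return index

def recognize_name(names, surname_scores, score_given_name, top_k=1):
    """Stage 1: shortlist the top_k surnames. Stage 2: score only names under them.

    surname_scores: dict mapping surname -> score of the first syllable (assumed input).
    score_given_name: hypothetical callback that scores the remaining syllables.
    Returns the best name and the number of candidates actually scored.
    """
    index = build_surname_index(names)

    def surname_score(surname):
        return surname_scores.get(surname, float("-inf"))

    shortlist = sorted(index, key=surname_score, reverse=True)[:top_k]
    candidates = [name for s in shortlist for name in index[s]]
    best = max(candidates, key=lambda n: surname_score(n[0]) + score_given_name(n))
    return best, len(candidates)

# Toy data: four names, made-up scores.
demo_names = [("wang", "xiao", "ming"), ("wang", "da", "wei"),
              ("li", "xiao", "ming"), ("chen", "da", "wei")]
demo_surname_scores = {"wang": -1.0, "li": -5.0, "chen": -9.0}
demo_rest = lambda n: -0.5 if n[1:] == ("xiao", "ming") else -3.0
demo_best, demo_scored = recognize_name(demo_names, demo_surname_scores, demo_rest)
```

In this toy run only the two names under the best-scoring surname are compared in the second stage, rather than all four, which is the source of the computational saving described above.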

This research is based primarily on the hidden Markov model (HMM), a technique widely used in speech recognition. An HMM is a doubly stochastic process that describes how speech is produced through state transitions, which capture the time-varying properties of the speech signal. The HMM results are compared with those of the segmental probability model (SPM) to determine which performs better in recognizing base syllables. By dividing each utterance into equal segments, the SPM suits the simple structure of Mandarin base syllables and also simplifies the system, since it does not need to search for the best state-transition sequence of the utterance.
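
The difference between searching for the best state sequence and using a fixed equal segmentation can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not the models built in the thesis: `loglik[t][s]` is a hypothetical table of per-frame log-likelihoods for frame `t` under state (or segment) `s`, the HMM is assumed to be left-to-right with only "stay" and "advance" transitions, and the same table is reused for both scores purely for comparison.

```python
def equal_segment_score(loglik):
    """SPM-style score: frame t is assigned to segment t*S//T, with no alignment search."""
    T, S = len(loglik), len(loglik[0])
    return sum(loglik[t][t * S // T] for t in range(T))

def viterbi_score(loglik, log_stay=0.0, log_next=0.0):
    """Best-path log score of a left-to-right HMM (stay or advance one state per frame)."""
    T, S = len(loglik), len(loglik[0])
    neg_inf = float("-inf")
    delta = [neg_inf] * S
    delta[0] = loglik[0][0]                  # paths must start in the first state
    for t in range(1, T):
        new_delta = [neg_inf] * S
        for s in range(S):
            stay = delta[s] + log_stay
            move = delta[s - 1] + log_next if s > 0 else neg_inf
            new_delta[s] = max(stay, move) + loglik[t][s]
        delta = new_delta
    return delta[S - 1]                      # paths must end in the last state

# Toy table: 4 frames, 2 states; the first three frames fit state 0 best.
demo_loglik = [[-1, -5], [-1, -5], [-1, -5], [-5, -1]]
demo_equal = equal_segment_score(demo_loglik)    # frames 0-1 -> segment 0, frames 2-3 -> segment 1
demo_viterbi = viterbi_score(demo_loglik)        # searches over all monotone alignments
```

The equal segmentation needs one pass and no search, while the Viterbi search costs on the order of T times S operations per model but can adapt the state boundaries to the utterance. With zero transition penalties, the Viterbi score can never be lower than the equal-segment score, because the equal segmentation is itself one of the paths it considers.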

A speaker-independent recognition system for 3,000 Chinese names has been implemented using a Mandarin microphone speech database recorded in an ordinary laboratory environment.
目次 Table of Contents
Acknowledgments .... Ⅰ
Abstract .... Ⅱ
Table of Contents .... Ⅵ
List of Figures and Tables .... Ⅶ

Chapter 1 Introduction .... 1
1-1 Research Motivation .... 1
1-2 Research Method .... 2
1-3 Thesis Organization .... 3

Chapter 2 Basic Techniques of Speech Signal Processing .... 4
2-1 Introduction to Speech Recognition Systems .... 4
2-2 Preprocessing of Speech Signals in a Recognition System .... 5
2-2-1 Endpoint Detection .... 5
2-2-2 Pre-emphasis .... 8
2-2-3 Windowing .... 8
2-3 Speech Segmentation .... 12
2-4 Feature Extraction from Speech Signals .... 16
2-4-1 Cepstral Coefficients .... 17
2-4-2 Cepstral Regression Coefficients .... 18
2-4-3 Mel-Cepstral Coefficients .... 19

Chapter 3 Hidden Markov Model .... 22
3-1 Hidden Markov Models in Speech Systems .... 22
3-2 Building the Hidden Markov Model .... 23
3-2-1 Probability Computation .... 25
3-2-2 Forward Procedure .... 26
3-2-3 Backward Procedure .... 27
3-3 Parameter Re-estimation .... 29
3-3-1 Re-estimation of the State Transition Probability Matrix .... 29
3-3-2 Re-estimation of the State Observation Probability Matrix .... 31
3-4 Viterbi Algorithm .... 33

Chapter 4 Segmental Probability Model .... 36
4-1 Characteristics of Mandarin Monosyllables .... 36
4-2 Motivation for the Segmental Probability Model .... 38
4-3 Training the Segmental Probability Model .... 40
4-4 Recognition with the Segmental Probability Model .... 42
4-5 Enhancing the Performance of the Segmental Probability Model .... 44
4-6 Improvement with Multiple Models .... 46
4-7 Two-Stage Recognition Architecture Combining Cepstral and Cepstral Regression Coefficients .... 48

Chapter 5 System Design and Experimental Results .... 50
5-1 Database Construction and Planning .... 50
5-1-1 Selection of Chinese Names .... 50
5-1-2 Database Recording Method .... 51
5-2 System Design .... 53
5-3 Experimental Results and Comparison .... 54
5-3-1 Recognition Results and Comparison for Mandarin Monosyllables .... 54
5-3-2 Surname Selection Experiment for Mandarin Monosyllables .... 56
5-3-3 Recognition Results and Comparison for Chinese Names .... 58

Chapter 6 Conclusions and Suggestions .... 61
6-1 Conclusions .... 61
6-2 Future Work .... 62
References .... 63
Appendix .... 66
電子全文 Fulltext
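This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the relevant provisions of the Copyright Act of the Republic of China, and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.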
論文使用權限 Thesis access permission:校內校外均不公開 not available
開放時間 Available:
校內 Campus:永不公開 not available
校外 Off-campus:永不公開 not available

紙本論文 Printed copies
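Public information on printed copies is relatively complete from academic year 102 onward. To look up public information on printed theses from academic year 101 or earlier, please contact the printed-thesis service counter of the Office of Library and Information Services. We apologize for any inconvenience.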
開放時間 available 已公開 available