National Sun Yat-sen University, thesis/dissertation, A Design of Speech Recognition System for Three-word and Four-word Mandarin Phrases

Title	中文三、四字詞語詞辨識系統之設計研究 (A Design of Speech Recognition System for Three-word and Four-word Mandarin Phrases)
Department	Department of Electrical Engineering
Year, semester	Spring semester of Academic Year 94	Language	Chinese
Degree	Master	Number of pages	40
Author	Ji-sin Sue (蘇吉信)
Advisor	Chih-Chien Chen (陳志堅)
Convenor	Chii-Maw Uang (汪啟茂)
Advisory Committee	Tsung Lee (李聰)
Date of Exam	2006-07-26	Date of Submission	2006-09-10
Keywords	MFCC, wavelet transform, mono-syllable recognition, pitch identification

Abstract (Chinese)
This thesis addresses speech recognition for three-word and four-word Mandarin phrases. We…
Abstract
This thesis develops a speech recognition system for three-word and four-word Mandarin phrases. Its database contains two recordings of twenty-four thousand three-word phrases and twenty-two thousand four-word phrases. The system applies MFCC features, mono-syllable HMMs, and a speech-text alignment scheme to select the initial phrase candidates. A wavelet-transform-based vowel segmentation technique and a Mandarin pitch identification method are then applied to raise the phrase identification rate and obtain the final answer. Experimental results indicate that correct rates of 92% and 96% can be achieved for three-word and four-word phrase recognition, respectively, when the first recording of the database is used for training and the second for testing. In the speaker-dependent case, the correct phrase can be found within 1 second on a PC with an Intel Celeron 2.4 GHz CPU running the Red Hat Linux 9.0 operating system.
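The abstract names a Mandarin pitch identification step without detailing it, and the table of contents lists an autocorrelation function (ACF) under tone classification. As a rough illustration only, and not the thesis's actual implementation, the following Python sketch estimates the pitch of one voiced frame by picking the lag that maximizes the autocorrelation. The function names, the 60 to 400 Hz search range, and the 8 kHz sample rate are assumptions chosen for this example.

```python
import math


def autocorrelation(frame, lag):
    """Unnormalized autocorrelation of `frame` at the given lag."""
    return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))


def estimate_pitch(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) of one voiced frame.

    The search range [f_min, f_max] is an assumed range for speech,
    not a value taken from the thesis.
    """
    min_lag = int(sample_rate / f_max)                     # shortest period considered
    max_lag = min(int(sample_rate / f_min), len(frame) - 1)  # longest period considered
    # The lag with the largest autocorrelation is taken as the pitch period.
    best_lag = max(range(min_lag, max_lag + 1),
                   key=lambda lag: autocorrelation(frame, lag))
    return sample_rate / best_lag


if __name__ == "__main__":
    sample_rate = 8000  # assumed sample rate in Hz
    # Synthetic 200 Hz tone standing in for a 50 ms voiced frame.
    frame = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(400)]
    print(estimate_pitch(frame, sample_rate))
```

Repeating the estimate over successive frames yields a pitch contour, which is the kind of input a four-tone classifier could use. How the thesis maps contours to tones is not stated in this record.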

Table of Contents
Chapter 1 Introduction
  1-1 Research Motivation and Objectives
  1-2 Research Methods
  1-3 Chapter Overview
Chapter 2 System Architecture and Speech Signal Processing Techniques
  2-1 System Architecture
  2-2 Mono-syllable Segmentation
    2-2-1 Linear Predictive Coding
    2-2-2 Signal Energy
    2-2-3 Zero Crossing Rate
  2-3 Feature Extraction
    2-3-1 Hamming Window
    2-3-2 Mel-Frequency Cepstrum (MFCC)
    2-3-3 Cosine Transform (DCT)
    2-3-4 Low-time Lifter
    2-3-5 Delta Cepstral Coefficients (Delta-MFCC)
  2-4 Four-Tone Classification
    2-4-1 Characteristics of Mandarin Speech
    2-4-2 Autocorrelation Function (ACF)
    2-4-3 Mallat Algorithm
    2-4-4 Energy Distribution of Wavelet Coefficients
  2-5 Hidden Markov Model (HMM)
  2-6 Mono-syllable Model Text Matching Method
Chapter 3 Experimental Results
  3-1 Tone Experiment Results
  3-2 Mono-syllable Model Simulation Results
  3-3 Error Analysis and Improvement
  3-4 Improvement Strategies
Chapter 4 Conclusions and Suggestions
  4-1 Conclusions
  4-2 Suggestions
Chapter 5 References


Fulltext
Thesis access permission: not available on campus or off campus
Available: Campus: never available; Off-campus: never available
Printed copies
Available: publicly available
