論文使用權限 Thesis access permission:校內校外均不公開 not available
開放時間 Available:
校內 Campus:永不公開 not available
校外 Off-campus:永不公開 not available
論文名稱 Title |
中文三、四字詞語詞辨識系統之設計研究 A Design of Speech Recognition System for Three-word and Four-word Mandarin Phrases |
系所名稱 Department |
畢業學年期 Year, semester |
語文別 Language |
學位類別 Degree |
頁數 Number of pages |
40 |
研究生 Author |
指導教授 Advisor |
召集委員 Convenor |
口試委員 Advisory Committee |
口試日期 Date of Exam |
2006-07-26 |
繳交日期 Date of Submission |
2006-09-10 |
關鍵字 Keywords |
梅爾倒頻譜係數、小波轉換、單音辨識、聲調辨識 mono-syllable recognition, pitch identification, wavelet transform, MFCC |
統計 Statistics |
The thesis/dissertation has been browsed 5659 times and downloaded 0 times. |
Chinese Abstract |
This thesis investigates the speech recognition problem for three-word and four-word Mandarin phrases. We… |
Abstract |
In this thesis, a speech recognition system for three-word and four-word Mandarin phrases is developed. The database holds two recordings of twenty-four thousand three-word phrases and twenty-two thousand four-word phrases. The system applies MFCC features, mono-syllable HMMs, and a speech-text alignment scheme to select the initial phrase candidates. A wavelet-transform-based vowel segmentation technique and a Mandarin pitch identification method are then applied to raise the phrase identification rate and obtain the final answer. Experimental results indicate correct rates of 92% for three-word phrases and 96% for four-word phrases when the first recording of the database is used for training and the second for testing. In the speaker-dependent case, the correct phrase is found within 1 second on a PC with an Intel Celeron 2.4 GHz CPU running RedHat Linux 9.0. |
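The abstract does not say how pitch is measured, but the table of contents lists the autocorrelation function (ACF) under tone classification. The following is a minimal sketch of ACF-based pitch estimation on a single frame, not the thesis's implementation. The function names, the 60 to 400 Hz search range, and the synthetic test tone are all assumptions made for illustration.

```python
import math

def autocorrelation(frame, lag):
    """Unnormalized autocorrelation of a frame at the given lag."""
    return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))

def estimate_pitch_hz(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Return the pitch whose lag maximizes the autocorrelation.

    f_min and f_max bound the search range; both are illustrative
    values, not taken from the thesis.
    """
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag = max(range(lag_min, lag_max + 1),
                   key=lambda lag: autocorrelation(frame, lag))
    return sample_rate / best_lag

# Synthetic 200 Hz tone sampled at 8 kHz (one 40 ms frame).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(320)]
pitch = estimate_pitch_hz(frame, sample_rate)
```

A tone classifier would track this estimate across the frames of a vowel segment and look at the shape of the resulting contour. That step is omitted here because the record does not describe it.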
目次 Table of Contents |
Chapter 1 Introduction
  1-1 Motivation and objectives
  1-2 Research methods
  1-3 Chapter overview
Chapter 2 System architecture and speech signal processing techniques
  2-1 System architecture
  2-2 Mono-syllable segmentation
    2-2-1 Linear predictive coding
    2-2-2 Signal energy
    2-2-3 Zero crossing rate
  2-3 Feature extraction
    2-3-1 Hamming window
    2-3-2 Mel-frequency cepstrum (MFCC)
    2-3-3 Discrete cosine transform (DCT)
    2-3-4 Low-time lifter
    2-3-5 Delta cepstral coefficients (Delta-MFCC)
  2-4 Four-tone classification
    2-4-1 Characteristics of Mandarin speech
    2-4-2 Autocorrelation function (ACF)
    2-4-3 Mallat algorithm
    2-4-4 Energy distribution of wavelet coefficients
  2-5 Hidden Markov model (HMM)
  2-6 Text matching with mono-syllable models
Chapter 3 Experimental results
  3-1 Tone experiment results
  3-2 Mono-syllable model simulation results
  3-3 Error analysis and improvement
  3-4 Improvement strategies
Chapter 4 Conclusions and suggestions
  4-1 Conclusions
  4-2 Suggestions
Chapter 5 References |
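The contents list signal energy and zero crossing rate under syllable segmentation (sections 2-2-2 and 2-2-3). As a rough illustration of those two measures, the sketch below computes them per frame on a toy signal. The helper names, frame length, hop size, and signal values are hypothetical and not drawn from the thesis.

```python
def frame_energy(frame):
    """Short-time energy: sum of squared samples in the frame."""
    return sum(x * x for x in frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def split_frames(signal, frame_len, hop):
    """Cut the signal into full-length frames, hop samples apart."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

# Toy signal: 8 samples of silence, then an alternating-sign burst.
signal = [0.0] * 8 + [0.5, -0.5] * 4
frames = split_frames(signal, frame_len=8, hop=8)
energies = [frame_energy(f) for f in frames]
zcrs = [zero_crossing_rate(f) for f in frames]
```

In a segmenter of this kind, frames with low energy are typically treated as silence, and frames with a high zero crossing rate as candidate unvoiced consonants. Any thresholds would have to be tuned on real recordings.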
電子全文 Fulltext |
This electronic full text is licensed to users solely for academic research, for personal, non-profit searching, reading, and printing. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid breaking the law. |
紙本論文 Printed copies |
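Public information on printed copies is relatively complete from academic year 102 onward. To look up public information on printed copies from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience. Available: publicly available |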