博碩士論文 etd-0910106-211507 詳細資訊
Title page for etd-0910106-211507
論文名稱
Title
中文三、四字詞語詞辨識系統之設計研究
A Design of Speech Recognition System for Three-word and Four-word Mandarin Phrases
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
40
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2006-07-26
繳交日期
Date of Submission
2006-09-10
關鍵字
Keywords
梅爾倒頻譜係數、小波轉換、單音辨識、聲調辨識
mono-syllable recognition, pitch identification, wavelet transform, MFCC
統計
Statistics
本論文已被瀏覽 5659 次,被下載 0
The thesis/dissertation has been browsed 5659 times, has been downloaded 0 times.
中文摘要
本論文探討中文三字詞與四字詞之語音辨識問題。我
Abstract
This thesis develops a speech recognition system for three-word and four-word Mandarin phrases. The database contains two recordings of twenty-four thousand three-word phrases and twenty-two thousand four-word phrases. The system applies MFCC features, mono-syllable HMMs, and a speech-text alignment scheme to select the initial phrase candidates. A wavelet-transform-based vowel segmentation technique and a Mandarin pitch identification method are then applied to raise the phrase identification rate and obtain the final answer. Experimental results indicate that correct rates of 92% and 96% can be achieved for three-word and four-word phrase recognition, respectively, when the first recording of the database is used for training and the second for testing. In the speaker-dependent case, the correct phrase can be found within 1 second on a PC with an Intel Celeron 2.4 GHz CPU running the RedHat Linux 9.0 operating system.
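As an illustration of the pitch identification step named in the abstract, the following is a minimal sketch of autocorrelation-based fundamental frequency estimation, a technique that the thesis's table of contents lists under tone classification (section 2-4-2). It is not the author's implementation: the function names `hamming` and `autocorr_pitch`, the 60 to 400 Hz search range, and the 8 kHz sample rate used in the demo are all assumptions chosen for illustration.

```python
import math


def hamming(n):
    """Return a Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def autocorr_pitch(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) of one voiced frame.

    The frame is Hamming-windowed, then the lag with the largest
    autocorrelation inside [sample_rate / f_max, sample_rate / f_min]
    is taken as the pitch period. The search range is an assumption.
    """
    n = len(frame)
    windowed = [x * w for x, w in zip(frame, hamming(n))]
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), n - 1)
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation at this lag.
        val = sum(windowed[i] * windowed[i + lag] for i in range(n - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag


if __name__ == "__main__":
    # Hypothetical test signal: a 50 ms frame of a 200 Hz sine at 8 kHz.
    sr = 8000
    frame = [math.sin(2 * math.pi * 200 * i / sr) for i in range(400)]
    print(autocorr_pitch(frame, sr))
```

On the synthetic sine the estimate should land close to 200 Hz. Tracking how such an estimate rises or falls across the frames of a syllable is one plausible basis for separating the four Mandarin tones; the abstract does not say how the thesis combines this with the wavelet-based vowel segmentation.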
目次 Table of Contents
Chapter 1 Introduction..................................1
1-1 Research motivation and objectives...............1
1-2 Research methods.................................1
1-3 Chapter overview.................................3

Chapter 2 System architecture and related speech signal processing techniques...4
2-1 System architecture..............................4
2-2 Mono-syllable segmentation.......................5
2-2-1 Linear predictive coding.....................5
2-2-2 Signal energy................................6
2-2-3 Zero crossing rate...........................6
2-3 Feature extraction...............................8
2-3-1 Hamming window...............................8
2-3-2 Mel-frequency cepstrum (MFCC)................8
2-3-3 Cosine transform (DCT).......................9
2-3-4 Low-time lifter..............................9
2-3-5 Cepstral coefficients (Delta-MFCC)..........10
2-4 Four-tone classification........................13
2-4-1 Characteristics of Mandarin speech..........13
2-4-2 Autocorrelation function (ACF)..............14
2-4-3 Mallat algorithm............................16
2-4-4 Energy distribution of wavelet coefficients...18
2-5 Hidden Markov model (HMM).......................19
2-6 Mono-syllable model text matching method........21

Chapter 3 Experimental results.........................22
3-1 Tone experiment results.........................23
3-2 Mono-syllable model simulation results..........24
3-3 Error analysis and improvement..................27
3-4 Improvement strategies..........................28

Chapter 4 Conclusions and suggestions..................30
4-1 Conclusions.....................................30
4-2 Suggestions.....................................31

Chapter 5 References...................................32
參考文獻 References
[1] 賴昭華, “不特定語者中量語詞辨識系統之設計研究”, 國立中山大學電機工程研究所碩士論文, 民國91年7月.
[2] 許博閔, “混合式中文人名語音辨識系統之設計研究”, 國立中山大學電機工程研究所碩士論文, 民國93年7月.
[3] 張慶勇, “中文地址語音辨識系統之設計研究”, 國立中山大學電機工程研究所碩士論文, 民國93年7月.
[4] 潘睿慈, “特定語者中文語詞辨識系統之設計研究”, 國立中山大學電機工程研究所碩士論文, 民國94年7月.
[5] L. Rabiner and B. H. Juang, Fundamentals of Speech Recognition, Prentice Hall Signal Processing Series, 1993.
[6] A. M. Kondoz, Digital Speech Coding, New York: John Wiley & Sons Inc., 1994.
[7] S. S. Stevens and J. Volkmann, “The relation of pitch to frequency: A revised scale”, Am. J. Psychol., 1940.
[8] John R. Deller, J. G. Proakis, and John H. L. Hansen, Discrete-Time Processing of Speech Signals, New York: Macmillan Pub. Co., 1993.
[9] L. R. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition”, Proc. IEEE, vol. 77, pp. 257-286, Feb. 1989.
[10] V. R. Algazi, K. L. Brown, M. J. Ready, D. H. Irvine, C. L. Cadwell, and Sang Chung, “Transform Representation of the Spectra of Acoustic Speech Segment with Applications-I: General Approach and Application to Speech Recognition”, IEEE Trans. Speech and Audio Processing, vol. 1, no. 2, April 1993.
[11] S. V. Vaseghi and P. N. Conner, “Speech modeling using cepstral-time feature matrices in hidden Markov models”, Proc. IEEE, vol. 140, no. 5, Oct. 1993.
[12] Jiqing Han and Wen Gao, “Robust telephone speech recognition based on channel compensation”, Pattern Recognition, 32 (1999), 106-1067.
[13] Mallat, “A theory of multiresolution signal decomposition: The wavelet transform”, IEEE Trans. PAMI, no. 7, 1989.
電子全文 Fulltext
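This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid violating the law.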
論文使用權限 Thesis access permission:校內校外均不公開 not available
開放時間 Available:
校內 Campus:永不公開 not available
校外 Off-campus:永不公開 not available

紙本論文 Printed copies
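Public availability information for printed theses is relatively complete from academic year 102 onward. To inquire about printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.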
開放時間 available 已公開 available