Title page for etd-0910112-171741
Title
國語、土耳其語及塔米爾語三語言語音辨識系統之設計研究
A Design of Trilingual Speech Recognition System for Chinese, Turkish and Tamil
Department
Year, semester
Language
Degree
Number of pages
51
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2012-07-25
Date of Submission
2012-09-10
Keywords
Linear predictive cepstral coefficients, hidden Markov model, phonotactics, Mel-frequency cepstral coefficients, speech recognition system
Statistics
This thesis has been viewed 5637 times and downloaded 211 times.
Abstract
This thesis studies Mandarin Chinese together with Turkish and Tamil, a language spoken in southern India and Sri Lanka, in the hope that learning each language also brings an appreciation of the history, culture, and economy behind it. During the Han and Tang dynasties, the Silk Road was a major trade corridor linking China in the east, Turkey in the west, and India in the south. Today Turkey and India are both major cotton exporters and, like China, both show the growth potential of emerging markets. A trilingual speech recognition system for Mandarin, Turkish, and Tamil is therefore designed and implemented, both as an aid to language learning and travel and as a way to deepen our understanding of these cultures.
The system extracts two sets of feature parameters from each utterance, linear predictive cepstral coefficients (LPCC) and Mel-frequency cepstral coefficients (MFCC), and recognizes words using hidden Markov models (HMM) together with phonotactics. For Mandarin, a corpus of 2,699 two-syllable words, recorded in a single round, is used to train the syllable features. For Turkish and Tamil, training accounts for stress variation and linking between sounds: following each language's pronunciation rules, every mono-syllable is read twice per round, once with tone 1 and once with tone 4, over 5 rounds, which yields 10 utterances per mono-syllable. On databases of 82,000 Chinese, 30,795 Turkish, and 3,500 Tamil phrases, the correct recognition rates are 88.30%, 84.21%, and 88.74% respectively, with an average recognition time within 1.5 seconds for each system. A trilingual recognition system is also built under the same training framework. On 300 common words, 100 from each language, it identifies both the language and the word correctly 98% of the time, with an average recognition time of about 2 seconds.
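As a rough illustration of the LPCC feature named in the abstract, the sketch below converts linear prediction coefficients into cepstral coefficients with the standard textbook recursion. This record does not give the thesis's own formulas or parameter settings, so everything here is an assumption: the function name `lpc_to_cepstrum`, the all-pole convention H(z) = 1 / (1 - sum_k a_k z^-k), and the omission of the gain term c_0.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC predictor coefficients to cepstral coefficients.

    Assumed convention (not taken from the thesis):
        H(z) = 1 / (1 - sum_{k=1..p} a_k z^-k)
    a       : list [a_1, ..., a_p]
    n_ceps  : number of cepstral coefficients c_1..c_n to return
    The gain-dependent term c_0 is deliberately left out.
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        # Direct term a_n exists only while n <= p.
        acc = a[n - 1] if n <= p else 0.0
        # Recursive term: sum over k of (k/n) * c_k * a_{n-k},
        # restricted to indices where a_{n-k} is defined (1 <= n-k <= p).
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c


if __name__ == "__main__":
    # One-pole example: for H(z) = 1 / (1 - 0.5 z^-1), c_n = 0.5**n / n.
    print(lpc_to_cepstrum([0.5], 3))
```

In a complete front end this step would follow pre-emphasis, framing, and LPC analysis of each frame. Those stages are not shown, and the model order and number of coefficients used in the thesis are not stated in this record.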
Table of Contents
Thesis Approval Sheet i
Chinese Abstract ii
Abstract iii
Acknowledgments iv
Contents v
List of Figures vii
List of Tables viii
Chapter 1 Introduction 1
1-1 Motivation 1
1-2 Objectives 2
1-3 Chapter Overview 2
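Chapter 2 Introduction to Phonetics 3
2-1 Introduction to Mandarin 4
2-1-1 Development of the Mandarin Language Family 4
2-1-2 Mandarin Pronunciation Rules 5
2-2 Introduction to Turkish 6
2-2-1 Development of the Turkish Language Family 6
2-2-2 Turkish Script and Pronunciation Rules 7
2-3 Introduction to Tamil 9
2-3-1 Development of the Tamil Language Family 10
2-3-2 Tamil Script and Pronunciation Rules 11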
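Chapter 3 Speech Recognition System Architecture 14
3-1 Syllable Endpoint Detection 15
3-1-1 Energy 15
3-1-2 Zero-Crossing Rate 16
3-2 Preprocessing for Feature Extraction 17
3-2-1 Pre-emphasis 17
3-2-2 Framing 17
3-3 Feature Extraction 19
3-3-1 Mel-Frequency Cepstral Coefficients 20
3-3-2 Linear Predictive Cepstral Coefficients 22
3-4 Hidden Markov Model 26
3-4-1 Hidden Markov Model Parameters 27
3-4-2 Problems Encountered with Hidden Markov Models and Their Solutions 28
3-5 Phonotactics 32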
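Chapter 4 Training Strategies and Performance Evaluation of the Speech Recognition Systems 33
4-1 Mandarin Recognition System 33
4-2 Turkish Recognition System 35
4-3 Tamil Recognition System 37
4-4 Trilingual Recognition System 39
Chapter 5 Conclusions and Future Work 40
References 41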
References
[1] Wikipedia (Chinese edition), http://zh.wikipedia.org/
[2] Thamil Paadanool-Learn Tamil, http://www.unc.edu/~echeran/paadanool/
[3] 黃啟輝, Introduction to Turkish (土耳其語入門), National Institute for Compilation and Translation (國立編譯館), 1990.
[4] 孫國強, A Course in Tamil (泰米爾語教程), Communication University of China Press (中國傳媒大學出版社), 2007.
[5] Lewis, M. Paul, Ethnologue: Languages of the World, Sixteenth edition, SIL International, 2009.
[6] Robert B. Lee, The Phonology of Modern Standard Turkish, Routledge-Taylor & Francis Group, London, UK, 1997.
[7] Elinor Keane, Prominence in Tamil, Journal of the International Phonetic Association, 2006.
[8] Daniel Jurafsky, James H. Martin, Speech and Language Processing, Prentice Hall, Taiwan, 2009.
[9] Wai C. Chu, Speech Coding Algorithms, Wiley Interscience, US, 2003.
[10] Xuedong Huang, Alex Acero, and Hsiao-Wuen Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Pearson Education Taiwan Ltd, 2005.
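Fulltext
This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China, and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid breaking the law.
Thesis access permission: user-defined release date
Available:
Campus: available
Off-campus: available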


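Printed copies
Public availability information for printed theses is relatively complete from academic year 102 (ROC calendar) onward. To inquire about printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available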
