National Sun Yat-sen University, thesis/dissertation, A Design of Turkish Speech Recognition System

Title	A Design of Turkish Speech Recognition System (土耳其文語音辨識系統之設計研究)
Department	Department of Electrical Engineering
Year, semester	Second (spring) semester of Academic Year 99	Language	Chinese
Degree	Master	Number of pages	80
Author	Guan-lun Chen (陳冠綸)
Advisor	Chih-Chien Chen (陳志堅)
Convenor	Xiao-Song Bor (柏小松)
Advisory Committee	Tsung Lee (李聰); Er-Hui Lu (盧而輝); Chii-Maw Uang (汪啟茂)
Date of Exam	2011-07-18	Date of Submission	2011-08-22
Keywords	Turkish speech recognition system, hidden Markov model, linear predictive cepstral coefficients, Mel-frequency cepstral coefficients, phonotactics
Statistics	This thesis has been browsed 5666 times and downloaded 478 times.

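Abstract (translated from Chinese)
The Republic of Turkey, founded in 1923, is one of the world's ancient civilizations. Located at the junction of Asia and Europe, it has a rich cultural heritage, and its largest city, Istanbul, is a famous tourist destination. Istanbul was founded in A.D. 330 and was formerly known as Constantinople or Byzantium. In the Roman era, Constantine the Great occupied and developed the city at Byzantium; strategically it was easy to defend and hard to attack, much like Rome, the capital at the time. Turkey sits at a crossroads of cultures and has been transformed by wars. It has abundant treasures awaiting discovery and is world-renowned for its tourist attractions, historical architecture, classical music, distinctive cuisine, and art collections. We hope to build an effective speech recognition system so that, through the process of learning Turkish, one can glimpse the beauty of Turkey's past culture and its historical footprints, and broaden one's horizons in travel and daily life. This thesis investigates the design and implementation strategies for a Turkish speech recognition system. Based on the pronunciation characteristics of Turkish, we identify 395 common Turkish monosyllables as the main basis for training and recognition. The training database is recorded by reading each monosyllable consecutively in two tones, Mandarin tone 1 (yinping) and tone 4 (qusheng), to bring out the distinction between stressed and unstressed syllables in Turkish. Tone 1 keeps the pitch high, while tone 4 falls from high to low pitch. After reading one monosyllable class in tones 1 and 4, the speaker moves on to the next class; one round through all 395 classes yields two training utterances per monosyllable. The system uses a training scheme of 6 rounds, giving 12 utterances per monosyllable. It extracts features with Mel-frequency cepstral coefficients and linear predictive cepstral coefficients, recognizes monosyllables with hidden Markov models, and finally matches the candidates against phonotactic rules to obtain the best recognition result. On a personal computer with a 2.8 GHz AMD Athlon X2 2400 CPU running Ubuntu 9.04, the system achieves a correct recognition rate of 87.29% on a database of 3,644 Turkish words and phrases. The average recognition time is within about 1.5 seconds, and the total training time is about two hours.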
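The recording plan described in the abstract can be checked with a little arithmetic. The following is a minimal sketch, not the author's tooling: the function name `recording_schedule` and the tuple layout are hypothetical, and only the three counts (395 monosyllables, 2 tones, 6 rounds) come from the abstract.

```python
NUM_SYLLABLES = 395  # monosyllable classes, from the abstract
TONES = (1, 4)       # tone 1 (high level) and tone 4 (falling), from the abstract
NUM_ROUNDS = 6       # recording rounds, from the abstract


def recording_schedule(num_syllables=NUM_SYLLABLES, tones=TONES, rounds=NUM_ROUNDS):
    """Yield (round, syllable_index, tone) in the reading order the abstract describes.

    Hypothetical helper: within a round, each syllable is read in tone 1 and
    then tone 4 before the speaker moves on to the next syllable.
    """
    for r in range(1, rounds + 1):
        for s in range(num_syllables):
            for t in tones:
                yield (r, s, t)


schedule = list(recording_schedule())
# Count how many utterances the first syllable class receives over all rounds.
per_syllable = sum(1 for _, s, _ in schedule if s == 0)
print(len(schedule), per_syllable)  # 4740 12
```

Under these counts the full training corpus holds 395 × 2 × 6 = 4,740 utterances, which matches the 12 utterances per monosyllable stated in the abstract.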
Abstract
The Republic of Turkey, founded in 1923, is a well-known ancient country with an abundant cultural heritage, located at the junction of the Asian and European continents. Istanbul, formerly known as Constantinople or Byzantium, is the country's largest city. It was established by Constantinus I Magnus in A.D. 330, during the era of the Roman Empire, to serve as a well-fortified stronghold like Rome. Numerous attractions in historical architecture, ancient music, gourmet cuisine, and art collections can be explored and appreciated. Our objective is to build a language system that helps us learn Turkish, savor the beauty of Turkish culture, and widen our vision of travel and living. This thesis investigates the design and implementation strategies for a Turkish speech recognition system. It uses the speech features of 395 common Turkish monosyllables as the main basis for training and recognition. A training database of 12 utterances per monosyllable is established by applying Turkish pronunciation rules. These 12 utterances are collected by reading the same monosyllables for 6 rounds, twice per round with different tones. The first utterance has the high pitch of tone 1, while the second has the falling pitch of tone 4. Mel-frequency cepstral coefficients and linear predictive cepstral coefficients are used as the two syllable feature models, and the hidden Markov model is used as the recognition model. On a personal computer with a 2.8 GHz AMD Athlon X2 2400 CPU running Ubuntu 9.04, a correct phrase recognition rate of 87.29% is reached using phonotactic rules on a Turkish phrase database of 3,644 entries. The average recognition time is less than 1.5 seconds, and the training time is about two hours.
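To illustrate how a set of per-syllable hidden Markov models can act as a recognizer, the sketch below scores an observation sequence against each model with the forward algorithm in the log domain and picks the best-scoring model. It is a simplified illustration under stated assumptions, not the thesis implementation: observations are discrete symbol indices (the thesis uses MFCC and LPCC feature vectors, whose emission model is not described in the abstract), and the two toy models, their labels, and all probabilities are invented for the example.

```python
import math


def safe_log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, start, trans, emit):
    """Log P(obs | model) for a discrete-observation HMM (forward algorithm)."""
    n = len(start)
    alpha = [safe_log(start[i]) + safe_log(emit[i][obs[0]]) for i in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[i] + safe_log(trans[i][j]) for i in range(n)])
            + safe_log(emit[j][o])
            for j in range(n)
        ]
    return log_sum_exp(alpha)


def recognize(obs, models):
    """Return the label of the model with the highest log-likelihood."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))


# Two hypothetical left-to-right models: (start, transition, emission).
left_to_right = [[0.6, 0.4], [0.0, 1.0]]
models = {
    "model_a": ([1.0, 0.0], left_to_right, [[0.9, 0.1], [0.2, 0.8]]),
    "model_b": ([1.0, 0.0], left_to_right, [[0.1, 0.9], [0.8, 0.2]]),
}

print(recognize([0, 0, 1, 1], models))  # model_a
print(recognize([1, 1, 0, 0], models))  # model_b
```

In a system like the one described, there would be one such model per monosyllable, and the candidates would then be filtered with phonotactic rules; that last step is not shown here.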

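Table of Contents
Thesis Approval Certificate
Abstract (Chinese)
Abstract
Acknowledgments
Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1-1 Research Motivation
  1-2 Research Objectives
  1-3 Thesis Outline
Chapter 2 Fundamentals of Turkish Phonetics
  2-1 Introduction to the Turkish Language
    2-1-1 Phonetic Features and Branches
    2-1-2 Historical Evolution of the Turkic Language Family
    2-1-3 Writing Systems of the Turkic Language Family
    2-1-4 Regions Where Turkish Is Spoken
  2-2 The Turkish Alphabet
  2-3 Turkish Pronunciation Rules
    2-3-1 Vowel Pronunciation Rules
    2-3-2 Consonant Pronunciation Rules
Chapter 3 Workflow and Mathematical Principles of the Speech Recognition System
  3-1 Syllable Segmentation
    3-1-1 Energy
    3-1-2 Zero-Crossing Rate
    3-1-3 Linear Prediction Coefficient Error Energy
  3-2 Preprocessing for Speech Feature Extraction
    3-2-1 High-Frequency Pre-emphasis Filter
    3-2-2 Windowing
  3-3 Speech Feature Extraction Procedure
    3-3-1 Mel-Frequency Cepstral Coefficients
    3-3-2 Linear Predictive Cepstral Coefficients
Chapter 4 Speech Model Training and Recognition Procedure
  4-1 Introduction to the Hidden Markov Model
  4-2 Hidden Markov Model Parameter Definitions
  4-3 Problems Encountered with the Hidden Markov Model and Their Solutions
    4-3-1 Estimating State Probabilities
    4-3-2 The Optimal State Sequence Problem
    4-3-3 Model Parameter Estimation
Chapter 5 Training Strategies for the Recognition System
  5-1 Hardware and Software Specifications
  5-2 Monosyllable Model Classification
  5-3 Simulated Vocabulary Construction
  5-4 Training Methods for Monosyllable Models
    5-4-1 Recognition Rate versus Number of Training Utterances per Monosyllable
    5-4-2 Recognition Rate versus Number of Recorded Monosyllables
    5-4-3 Recognition Rate versus Tone
Chapter 6 Implementation Results and Recognition Performance of the Turkish Speech Recognition System
  6-1 Recognition System for the Turkish Language Proficiency Test
  6-2 Recognition System for Common Turkish Personal Names
Chapter 7 Conclusions and Future Work
References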

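Fulltext
This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization. Thesis access permission: user-defined availability. Available: Campus: available; Off-campus: available. etd-0822111-142833.pdf
Printed copies
Public information on printed theses is relatively complete from Academic Year 102 onward. To inquire about printed theses from Academic Year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. Available: available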
