Title page for etd-0711100-041548
Title
A Design Of Multi-Language Identification System (多國語言辨識系統之設計研究)
Department
Year, semester
Language
Degree
Number of pages
57
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2001-07-10
Date of Submission
2000-07-11
Keywords
Cepstrum, Vector Quantization, Language Identification, Formant Estimation, Microsoft Windows Programming
Statistics
This thesis has been viewed 5716 times and downloaded 2198 times.
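Abstract (translated from Chinese)
This thesis designs a multi-language identification system that runs on Microsoft Windows. The system uses the formants of the speech data as features and applies vector quantization to reduce the amount of feature data. For feature extraction, formant values are estimated by finding the roots of the LPC polynomial; for matching, n-Gram codes and HMM models are used. In addition, the thesis proposes a new distance measure for use in VQ.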

Abstract
A Microsoft Windows program is designed to implement a multi-language identification system based on formant estimation and a vector quantization classifier with n-Gram and HMM matching. LPC is used as an effective method for extracting formant features from the speakers' speech, and a new distance measure for VQ is also proposed.
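
The abstract names LPC-based formant estimation but this record gives no implementation details. The following Python sketch illustrates one common way to do it, assuming the autocorrelation method with the Levinson-Durbin recursion followed by root-finding on the prediction polynomial. The function names `lpc_coefficients` and `estimate_formants`, the default model order, and the frequency and bandwidth thresholds are illustrative assumptions, not values taken from the thesis.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC solved with the Levinson-Durbin recursion.

    Returns a[0..order] with a[0] == 1, the coefficients of the
    prediction-error filter A(z) = 1 + a1*z^-1 + ... + ap*z^-p.
    """
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def estimate_formants(frame, sample_rate, order=8, max_bandwidth=400.0,
                      window=True):
    """Estimate formant frequencies (Hz) from the roots of the LPC polynomial.

    The order, the 90 Hz floor, and max_bandwidth are assumed defaults.
    """
    frame = np.asarray(frame, dtype=float)
    if window:
        frame = frame * np.hamming(len(frame))
    a = lpc_coefficients(frame, order)
    roots = np.roots(a)
    # Keep one root from each complex-conjugate pair.
    roots = roots[np.imag(roots) > 0]
    # Root angle gives the resonance frequency, root radius the bandwidth.
    freqs = np.angle(roots) * sample_rate / (2.0 * np.pi)
    bandwidths = -np.log(np.abs(roots)) * sample_rate / np.pi
    # Discard very low or very broad resonances, which are unlikely formants.
    keep = (freqs > 90.0) & (bandwidths < max_bandwidth)
    return np.sort(freqs[keep])
```

In a pipeline like the one the abstract describes, such a function would be called once per short-time frame, and the resulting formant vectors would then be passed to the vector quantizer. A common rule of thumb sets the model order to roughly two poles per expected formant plus a few extra for spectral tilt.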

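Table of Contents
Abstract
Acknowledgments
List of Main Figures and Tables
Chapter 1 Introduction
1-1 Introduction
1-2 Overview of Language Identification Systems
1-3 Thesis Topic
1-4 Thesis Organization
Chapter 2 Speech Signal Processing
2-1 Short-Time Analysis
2-2 Pre-emphasis and Silence Segmentation
2-3 Voiceprint Features (Formants)
2-4 Formant Estimation Using the Cepstrum
2-5 Formant Estimation Using Linear Predictive Coding (LPC)
Chapter 3 Speech Coding
3-1 Coding and Clustering
3-2 Vector Quantization
3-3 k-Means
3-4 Fuzzy k-Means
3-5 Distance-Weighted Vector Quantizer
Chapter 4 Language Features and Identification
4-1 Features of Languages
4-2 n-Gram
4-3 Hidden Markov Model
4-4 Applying HMM to the Matching of n-Gram Codes
Chapter 5 System Design and Implementation
5-1 Object-Oriented System Planning
5-2 System Specifications
5-3 OGI-TS Database
5-4 Results and Future Outlook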

References
[1] Marc A. Zissman, “Comparison of Four Approaches to Automatic Language Identification of Telephone Speech”, in IEEE Transactions On Speech and Audio Processing. Vol. 4, NO. 1, January 1996

[2] Alan V. Oppenheim, Ronald W. Schafer, “Discrete-Time Signal Processing,” Prentice Hall.

[3] J. T. Foil, “Language identification using noisy speech,” in Proc. ICASSP ’86, vol. 2, Apr. 1986, pp. 861-864.

[4] M. Sugiyama, “Automatic language recognition using acoustic features,” in Proc. ICASSP ’91, vol. 2, May 1991, pp. 813-816.

[5] R. B. Ives, “A minimal rule AI expert system for real-time classification of natural spoken languages,” in Proc. Second Ann. Artificial Intell. Adv. Comput. Technol. Conf., Long Beach, CA, May 1986, pp. 337-340.

[6] Y. K. Muthusamy and R. A. Cole, “Automatic segmentation and identification of ten languages using telephone speech,” in Proc. ICSLP ’92, vol. 2, Oct. 1992, pp.1007-1010

[7] John R. Deller, John G. Proakis, John H. L. Hansen, “Discrete-Time Processing of Speech Signals”.

[8] R. M. Gray, A.Buzo, A.H. Gray, Jr., and Y. Matsuyama, “Distortion measures for speech processing,” IEEE Trans. Acoustics, Speech, Signal Proc., ASSP-28 (4): 367-376, August 1980.

[9] 劉振源, “類神經網路模型與語音辨識” (Neural Network Models and Speech Recognition), 全華出版社

[10] Y. K. Muthusamy, E. Barnard, and R. A. Cole, “Reviewing automatic language identification,” IEEE Signal Processing Mag., vol. 11, no. 4, pp. 33-41, Oct. 1994

[11] Lutz Welling, Hermann Ney, “Formant Estimation for Speech Recognition”, IEEE Transactions on Speech and Audio Processing, VOL. 6, NO. 1, January 1998, pp. 36-48.

[12] 林壽, 王理嘉, “語音學教程” (A Course in Phonetics), 五南圖書出版公司

[13] Deller, Proakis, Hansen, “Discrete-Time Processing of Speech Signals” p. 105.

[14] E. R. Ruspini, “A new approach to clustering,” Inform. Contr., vol. 19, pp. 22-32, 1969.

[15] J. C. Bezdek, “A convergence theorem for the fuzzy ISODATA clustering algorithms,” IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-2, pp. 1-8, Jan. 1980.



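Fulltext
This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the relevant provisions of the Copyright Act of the Republic of China, and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid breaking the law.
Thesis access permission: available on campus immediately; available off campus after one year
Available:
Campus: available
Off-campus: available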


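Printed copies
Public availability information for printed theses is relatively complete from academic year 102 onward. To inquire about printed theses from academic year 101 or earlier, please contact the printed thesis service counter of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available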
