博碩士論文 etd-0727104-122753 詳細資訊
Title page for etd-0727104-122753
論文名稱
Title
以DSP Base為架構的不特定語句即時語者辨識系統
DSP Base Independent Phrase Real Time Speaker Recognition System
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
114
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2004-07-10
繳交日期
Date of Submission
2004-07-27
關鍵字
Keywords
數位信號處理器、語者辨識系統、高斯混合模型
DSP, Gaussian Mixture Models, Speaker Recognition System
統計
Statistics
本論文已被瀏覽 5742 次,被下載 46
This thesis/dissertation has been browsed 5742 times and downloaded 46 times.
中文摘要
本論文是以數位信號處理器(DSP)為基礎之語者辨識系統,為了使訓練出來的語者模型的參數不超過32 bit的浮點數所能表示的範圍,所以在演算法上有一些簡化。而以DSP為架構的語者辨識系統主要包括了兩部分:硬體的設定與語者演算法的實現。DSP方面我們使用的是浮點運算的DSP(ADI SHARC系列中的ADSP-21161),而語者辨識的演算法是利用高斯混合模型。經實驗結果顯示,以DSP為架構的語者辨識在辨識率與速度上皆有不錯的表現。
Abstract
This thesis presents a speaker recognition system built on a digital signal processor (DSP). To keep the parameters of the trained speaker models within the range representable by 32-bit floating-point numbers, we simplify parts of the algorithm. The DSP-based system has two main parts: the hardware configuration and the implementation of the speaker recognition algorithm. The DSP is a floating-point device (the ADSP-21161 from the ADI SHARC family), and the recognition algorithm is based on Gaussian mixture models. Experimental results show that the DSP-based system performs well in both recognition rate and speed.
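The abstract does not spell out how the algorithm is simplified to stay within floating-point range. One common way to keep Gaussian mixture model scores representable is to work with log-likelihoods throughout, which the table of contents suggests is related to the natural-logarithm variant discussed in Section 2.4.5. The following is a minimal sketch under that assumption, not the thesis's implementation. The function names, the diagonal-covariance choice, and the toy models `speaker_a` and `speaker_b` are all hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    acc = 0.0
    for xi, mi, vi in zip(x, mean, var):
        acc += math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi
    return -0.5 * acc


def log_sum_exp(values):
    """Stable log(sum(exp(v))): only differences from the maximum are exponentiated."""
    peak = max(values)
    if peak == -math.inf:
        return -math.inf
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def frame_log_likelihood(x, gmm):
    """log p(x | model) for one frame; gmm is a list of (weight, mean, var) tuples."""
    return log_sum_exp(
        [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    )


def utterance_log_likelihood(frames, gmm):
    """Sum of per-frame log-likelihoods (frames are treated as independent)."""
    return sum(frame_log_likelihood(x, gmm) for x in frames)


def identify(frames, models):
    """Return the name of the model with the highest utterance log-likelihood."""
    return max(models, key=lambda name: utterance_log_likelihood(frames, models[name]))


# Hypothetical two-speaker example with 2-dimensional features.
models = {
    "speaker_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "speaker_b": [(1.0, [5.0, 5.0], [1.0, 1.0])],
}
frames = [[0.1, -0.2], [0.9, 1.1]] * 500  # 1000 frames

# Multiplying raw likelihoods instead of adding logs underflows to 0.0 here.
naive_product = 1.0
for x in frames:
    naive_product *= math.exp(frame_log_likelihood(x, models["speaker_a"]))

print(naive_product)
print(utterance_log_likelihood(frames, models["speaker_a"]))
print(identify(frames, models))
```

The raw product collapses to zero even in double precision, so the problem is more severe with the 32-bit floats mentioned in the abstract. The summed log-likelihood stays a moderate negative number that can still be compared across speaker models.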
目次 Table of Contents
Contents
Chinese Abstract
English Abstract
Table of Contents
List of Figures
List of Tables

Chapter 1 Introduction
1.1 Preface
1.2 Overview of Speaker Recognition
1.3 Research Motivation
1.4 Research Background and Development of Recognition Techniques
1.5 Chapter Overview

Chapter 2 Speaker Recognition System
2.1 Introduction
2.2 System Architecture
2.2.1 Feature Extraction Flow
2.2.2 Speaker Model Training Flow
2.2.3 Speaker Identification Flow
2.2.4 Speaker Verification Flow
2.3 Feature Extraction
2.3.1 DC Offset Removal
2.3.2 Speech Normalization
2.3.3 Framing
2.3.4 Endpoint Detection Algorithm
2.3.4.1 R-S Endpoint Detection
2.3.5 Pre-emphasis
2.3.6 Hamming Window
2.3.7 Linear Prediction Cepstral Coefficients
2.3.7.1 Linear Predictive Coding
2.3.7.2 Conversion of Linear Prediction Coefficients to Cepstrum
2.3.7.3 Parameter Weighting
2.3.8 Mel-Frequency Cepstral Coefficients
2.3.8.1 Fast Fourier Transform
2.3.8.2 Mel Spectrum
2.3.8.3 Mel Channel Energy
2.3.8.4 Log Energy Computation
2.3.8.5 Discrete Cosine Transform
2.3.8.6 Delta Coefficients
2.4 Gaussian Mixture Model
2.4.1 Model Description
2.4.2 Model Initialization
2.4.3 Maximum Likelihood Estimation
2.4.4 Expectation-Maximization Algorithm
2.4.5 Natural-Logarithm Expectation-Maximization Algorithm
2.4.6 Speaker Identification

Chapter 3 Hardware Architecture
3.1 Introduction
3.2 History of DSP Development
3.3 Overview of the ADSP-21161N System
3.4 DSP Development System
3.5 ADSP-21161N
3.6 Expert Linker
3.7 Description and Configuration of the Audio Interface
3.8 Speaker Recognition Hardware Architecture
3.8.1 Hardware Configuration
3.8.2 Implementation of the Speaker Algorithm
3.9 DSP Speaker Recognition Operating Interface

Chapter 4 Experimental Methods and Results
4.1 Introduction
4.2 Speech Corpus
4.3 Feature Parameter Specifications
4.4 Experiment Plan
4.5 Speaker Identification Experiments
4.5.1 Effect of Different Feature Parameters on the System
4.5.1.1 Experimental Method
4.5.1.2 Experimental Results
4.5.2 Comparison of the Original Gaussian Mixture Model with the Natural-Logarithm Gaussian Mixture Model
4.5.2.1 Experimental Method
4.5.2.2 Experimental Results
4.5.3 Comparison of Initial Models
4.5.3.1 Experimental Method
4.5.3.2 Experimental Results
4.5.4 Effect of the Number of Speakers and the Length of Training and Test Utterances on the System
4.5.4.1 Experimental Method
4.5.4.2 Experimental Results
4.5.5 Recognition Results on the DSP Corpus
4.5.5.1 Experimental Method
4.5.5.2 Experimental Results
4.6 Speaker Verification Experiments
4.6.1 One Threshold for All Speakers
4.6.1.1 Experimental Method
4.6.1.2 Experimental Results
4.6.2 Separate Thresholds for Male and Female Speakers
4.6.2.1 Experimental Method
4.6.2.2 Experimental Results
4.6.3 One Threshold per Speaker
4.6.3.1 Experimental Method
4.6.3.2 Experimental Results
4.7 Closing Remarks

Chapter 5 Future Prospects
5.1 Conclusion
5.2 Future Work

References
參考文獻 References
[1] Douglas A. Reynolds and Richard C. Rose, "Robust Text-Independent Speaker Identification Using Gaussian Mixture Speaker Models"
[2] F. Soong et al., "A vector quantization approach to speaker recognition," in Proc. IEEE ICASSP, 1985, pp. 379-382.
[3] L. Rudasi and S. A. Zahorian, "Text-independent talker identification with neural networks," in Proc. IEEE ICASSP, May 1991, pp. 389-392.
[4] J. D. Markel, B. Oshika, and A. H. Gray, "Long-Term Feature Averaging for Speaker Recognition," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-25, pp. 330-337, Aug. 1977.
[5] Laszlo Rudasi and Stephen A. Zahorian, "Text-Independent Talker Identification With Neural Networks," in Proc. IEEE
[6] H. Gish and M. Schmidt, "Text-independent speaker identification," IEEE Signal Processing Magazine, vol. 11, no. 4, Oct. 1994.
[7] D. A. Davis, R. C. Rose, and M. J. T. Smith, "PC-based TMS320C30 implementation of Gaussian Mixture Model text-independent speaker recognition system," in Proc. Int. Conf. Signal Processing Appl., Technol., Nov. 1992, pp. 967-973.
[8] S. Bhattacharyya, T. Srikanthan, and P. Krishnamurthy, "Ideal GMM parameters & posterior log likelihood for speaker verification," in Proc. IEEE, 10-12 Sept. 2001.
[9] B. Atal, "Automatic recognition of speakers from their voices," Proc. IEEE, vol. 64, pp. 460-475, Apr. 1976.
[10] T. Matsui and S. Furui, "A text-independent speaker recognition method robust against utterance variations," in Proc. IEEE ICASSP, 1991, pp. 388-380.
[11] A. Higgins, L. Bahler, and J. Porter, "Voice identification using nearest neighbor distance measure," in Proc. IEEE ICASSP, Apr. 1993, pp. II-375-II-378.
[12] B. Atal, "Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification," J. Acoust. Soc. Amer., vol. 55, pp. 1304-1312, June 1974.
[13] R. E. Bogner, "On talker verification via orthogonal parameters," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-29, pp. 1-12, Feb. 1981.
[14] J. P. Campbell, "Speaker Recognition: A Tutorial," Proc. IEEE, vol. 85, pp. 1437-1462, Sept. 1997.
[15] "VisualDSP++ 3.0 Getting Started Guide for Blackfin Family DSPs," 2002.4, Analog Devices Corp.
[16] "VisualDSP User's Guide for Blackfin Family DSPs," 2002.4, Analog Devices Corp.
[17] "ADSP-21161 DSP Hardware Reference," 2002.4, Analog Devices Corp.
[18] "Assembler Manual for ADSP-21xx Family DSPs," 2002.5, Analog Devices Corp.
[19] "Linker & Utilities Manual for ADSP-21xx Family DSPs," 2002.5, Analog Devices Corp.
[20] "C Compiler & Library Manual for ADSP-21xx Family DSPs," 2002.5, Analog Devices Corp.
[21] "ADSP-21161N EZ-KIT LITE Evaluation System Manual," 2002.5, Analog Devices Corp.
[22] "VisualDSP++ 3.0 Getting Started Guide for SHARC Family DSPs," 2002.5, Analog Devices Corp.
[23] "VisualDSP User's Guide for SHARC Family DSPs," 2002.5, Analog Devices Corp.
[24] Lawrence Rabiner and Biing-Hwang Juang, "Fundamentals of Speech Recognition," PTR Prentice-Hall, Inc., A Simon & Schuster Company, Englewood Cliffs, New Jersey 07632.
[25] HTK Book (for HTK Version 3.1)
[26]陳松琳,”以類神經網路為架構之語音辨識系統”,國立中山大學電機工程學系碩士論文,2001
[27]謝芳易,”結合隱藏式馬可夫模型一階動態規劃演算法之連續語音辨識系統”,國立中山大學電機工程學系碩士論文,2003
[28]鍾偉仁,”語者辨認與驗證初步之研究”,國立台灣大學,2000
[29]陳高斌,”應用SOM-PNN混合神經網路在語者識別”,義守大學,2001
[30]古詩峰,”基於小波轉換特徵參數以及使用麥克風和電話語料之大量語者識別系統”,長庚大學,2002
[31]鄭順德,”不特定語句中量語者辨識系統之設計研究”,國立中山大學,2002
[32]黃俊豪,”大量語者不特定語句環境下語者辨識系統之特徵設計”,國立中山大學,2000
[33]林宸生,”數位訊號-影像與語音處理”,台北,全華科技,1997
電子全文 Fulltext
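This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research purposes. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.
論文使用權限 Thesis access permission: available on campus, never available off campus (restricted)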
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus:永不公開 not available


紙本論文 Printed copies
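Availability information for printed theses is relatively complete from academic year 102 (ROC calendar) onward. To inquire about printed theses from academic year 101 or earlier, please contact the printed-thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
開放時間 Available: 已公開 available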
