博碩士論文 etd-0619113-095451 詳細資訊
Title page for etd-0619113-095451
論文名稱
Title
基於OpenEars的語音辨識用於失語症治療
Speech Recognition for Aphasia Treatment Based on OpenEars
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
67
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2013-07-17
繳交日期
Date of Submission
2013-08-06
關鍵字
Keywords
復健、行動裝置、OpenEars、語言治療、失語症、語音辨識
Speech Therapy, OpenEars, Aphasia, Speech Recognition, Mobile devices
統計
Statistics
本論文已被瀏覽 5998 次,被下載 665
The thesis/dissertation has been browsed 5998 times, has been downloaded 665 times.
中文摘要
在台灣,每年都有至少數千位失語症患者需要專業的治療師所提供的復健治療,以便恢復正常與人溝通的能力。但在台治療師人數的不足,導致以下兩大問題:一為醫療能量不足以為所有的病患服務,許多的病人無法預約到門診,而必須要用輪流的方式來進行復健;二是復健的時間也因為供需的失調而必須要壓縮在最低的有效時間內,而無法透過拉長復健時間來協助患者早日恢復語言能力。

Dephasia是個由中山大學資管系學生所開發的用於失語症治療的電腦輔助工具,該系統結合Web以及iPad來協助患者在家進行復健。本篇論文旨在探討以及評估應用語音辨識工具以便讓中文語音類型的輸入可以在客戶端上直接給予即時的回饋之可能性。此工具基於OpenEars製作,透過對語言模型以及字典檔調整以及修改之方式以達到將語音辨識用於失語症患者的語音輸入之目的。

實驗用數據收集自醫療現場共計156樣本,來自十位患者,並在iPad2, iPad4, MBP上評估此工具之效率以及正確率。結果顯示,在經過適當前處理的檔案,以皮爾森相關係數評估此套件產生之評分與治療師評分可得到最高0.59之相關性。結果顯示應用現行之語音辨識軟體到失語症治療是一個具有可能性以及潛力的方案。
Abstract
In Taiwan, thousands of aphasic patients need professional rehabilitation services each year. The shortage of therapists causes two problems: 1) not all patients can make a clinic appointment at the desired time, and 2) the average length of a rehabilitation session is reduced. These patients need a way to meet this demand and improve the quality of aphasia treatment.

Dephasia is a computer system for aphasia rehabilitation based on the Web and the iPad. In this thesis, we propose a way to provide real-time evaluation of Chinese speech input on the client side for two types of questions in aphasia treatment: sentence repetition and naming practice. This extension is built on OpenEars, a free shared-source SDK for speech recognition and text-to-speech on the iPhone. We propose several designs and modifications to adapt the system to aphasia treatment.

We evaluate the proposed approach using 156 samples from 10 patients. The evaluation measures accuracy and efficiency on the iPad2, iPad4, and MBP. The results show that, for ordinary recordings, the Pearson correlation between the scores given by the proposed solution and those given by the therapists reaches up to 0.59, and recognition completes within 5 seconds. This suggests that speech recognition can be applied to speech therapy to provide real-time feedback on the client side.
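The accuracy measure named in the abstract, the Pearson correlation between system scores and therapist scores, can be sketched as follows. This is a minimal illustration only: the function name `pearson` and the sample score lists are hypothetical and are not taken from the thesis or its data.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for four recordings (not data from the thesis).
system_scores = [1, 3, 2, 4]
therapist_scores = [1, 2, 3, 4]
print(f"r = {pearson(system_scores, therapist_scores):.2f}")  # prints "r = 0.80"
```

A value near 1 means the automatic scores rank recordings much as the therapists do. The reported maximum of 0.59 indicates a moderate positive agreement.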
目次 Table of Contents
Thesis Approval Form
Acknowledgements
Abstract (Chinese)
Abstract
CHAPTER 1 Introduction
1.1. Background
1.2. Motivation
1.2.1. The Goal
1.3. Thesis Organization
CHAPTER 2 State of the Art
2.1. Aphasia
2.1.1. What is Aphasia
2.1.2. Classification of Aphasia
2.1.3. Treatment of Aphasia
2.2. Speech Recognition
2.2.1. What is Speech Recognition
2.2.2. Hidden Markov Model
2.2.3. Evaluation Criteria of Speech Recognition Toolkit
2.2.4. PocketSphinx and OpenEars
CHAPTER 3 Architecture
3.1. Dephasia
3.2. Structure of the Solution
CHAPTER 4 The Method
4.1. Observation from Therapy Scene
4.2. Dictionary and Acoustic Model
4.3. Language Model
4.4. Algorithm to Decide Final Score
CHAPTER 5 Evaluation
5.1. Experimental Design
5.2. Accuracy of the Proposed Solution for Show & Tell Questions
5.3. Accuracy for Repeating Sentence Questions
5.4. Efficiency of the Proposed Solution
CHAPTER 6 Conclusion
6.1. Future Work
References
參考文獻 References
[1] M. A. Anusuya and S. K. Katti, “Speech Recognition by Machine : A Review,” International Journal of Computer Science and Information Security, vol. 6, no. 3, pp. 181–205, 2009.
[2] W. Abdulla and N. Kasabov, “The Concepts of Hidden Markov Model in Speech Recognition,” 1999.
[3] American Heart Association, “Stroke and Aphasia,” 2012. [Online]. Available: http://www.strokeassociation.org/idc/groups/heart-public/@wcm/@hcm/documents/downloadable/ucm_309703.pdf.
[4] M. L. Berthier, “Poststroke Aphasia,” Drugs & Aging, vol. 22, no. 2, pp. 163–182, 2005.
[5] A. W. Black and K. A. Lenzo, “Flite: a small fast run-time synthesis engine,” Workshop (ITRW) on Speech Synthesis, 2001.
[6] R. O. C. Department of Health, Executive Yuan, “Statistics of Causes of Death,” 2011. [Online]. Available: http://www.doh.gov.tw/CHT2006/DisplayStatisticFile.aspx?d=87554&s=1.
[7] P. Dowsett, “iOS Module Development Guide.” [Online]. Available: https://wiki.appcelerator.org/display/guides/iOS+Module+Development+Guide.
[8] H. Goodglass, E. Kaplan, and B. Barresi, The Assessment of Aphasia and Related Disorders, 3rd ed. Lippincott Williams & Wilkins, 2001.
[9] D. Huggins-Daines, M. Kumar, and A. Chan, “Pocketsphinx: A free, real-time continuous speech recognition system for hand-held devices,” Acoustics, Speech, pp. 185–188, 2006.
[10] S. K. Gaikwad, B. W. Gawali, and P. Yannawar, “A Review on Speech Recognition Technique,” International Journal of Computer Applications, vol. 10, no. 3, pp. 16–24, Nov. 2010.
[11] P. Lamere, P. Kwok, E. B. Gouvêa, B. Raj, R. Singh, W. Walker, and P. Wolf, “The CMU Sphinx-4 Speech Recognition System,” in Proc. European Conf. on Speech Communication and Technology, 2003.
[12] A. Lee and T. Kawahara, “Recent development of open-source speech recognition engine julius,” Proceedings : APSIPA ASC 2009 : Asia-Pacific Signal and Information Processing Association, 2009 Annual Summit and Conference, pp. 131–137, 2009.
[13] M. R. McNeil and D. A. Copland, “Aphasia Theory, Models, and Classification,” in Aphasia and Related Neurogenic Language Disorder, 4th ed., Thieme Medical Pub, 2011, pp. 27–47.
[14] L. Meng-Jen and C. Yu-Chen, “The Nursing Experience of Caring for a Middle-Aged Patient with Aphasia Caused by a Stroke,” Cheng Ching Medical Journal, vol. 6, no. 4, pp. 50–57, 2010.
[15] K. Poeck, “Fluency,” in The Characteristics Of Aphasia, C. Code, Ed. 1989, pp. 23–32.
[16] H. Schuell and J. J. Jenkins, Schuell’s Aphasia in adults: diagnosis, prognosis, and treatment. HarperCollins, 1974.
[17] G. Widmer, “Machine Learning and Pattern Classification”, Course in JKU. [Online]. Available: http://www.cp.jku.at/teaching/ss12/344.009.html.
[18] S. Young, G. Evermann, M. Gales, T. Hain, D. Kershaw, X. (Andrew) Liu, G. Moore, J. Odell, D. Ollason, D. Povey, V. Valtchev, and P. Woodland, The HTK book, 3.4 ed., no. July 2000. Cambridge University Engineering Department, 2009.
[19] H. Yu-Mei, Chung Shu-Er, Lee Miao-Hsiang, Chang Tao-Chang, “The Concise Chinese Aphasia Test (CCAT) and Its Applications,” The journal of Speech-language-hearing association, pp. 119–137, 1998.
[20] 李淑娥, “成人失語症之復健,” in 語言病理學基礎 第三卷, 曾進興, Ed. 心理出版社, 1999, pp. 257–287.
[21] “OpenEars.” [Online]. Available: http://www.politepix.com/openears/.
[22] “Titanium SDK.” [Online]. Available: http://www.appcelerator.com/platform/titanium-sdk/.
電子全文 Fulltext
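This electronic full text is licensed to users solely for personal, non-profit retrieval, reading, and printing for academic research purposes. Please comply with the Copyright Act of the Republic of China. To avoid breaking the law, do not reproduce, distribute, adapt, repost, or broadcast it without authorization.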
論文使用權限 Thesis access permission:自定論文開放時間 user define
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
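Public-access information for printed theses is relatively complete from academic year 102 onward. To look up public-access information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.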
開放時間 available 已公開 available
