National Sun Yat-sen University, thesis/dissertation, A Multimodal Microphone Control System

Title	A Multimodal Microphone Control System (Chinese title: 應用行為、嘴部動作與聲音的多模式麥克風控制系統, literally "A Multimodal Microphone Control System Using Behavior, Mouth Movement, and Sound")
Department	Department of Computer Science and Engineering
Year, semester	Spring semester of Academic Year 103	Language	Chinese
Degree	Master	Number of pages	69
Author	Kai-sheng Chiou (邱凱聖)
Advisor	Chungnan Lee (李宗南)
Convenor	Chiou-Shann Fuh (傅楸善)
Advisory Committee	Jenn-Jier Lien (連震杰); Ming-chao Chiang (江明朝); Shu-Mei Guo (郭淑美)
Date of Exam	2015-07-22	Date of Submission	2015-09-03
Keywords	Skin color detection, Nose detection, Behavior detection, Optical flow, Microphone control system
Statistics	The thesis/dissertation has been browsed 5685 times and downloaded 434 times.

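Abstract (translated from the Chinese)
Many kinds of conference-room microphone control systems are on the market; the most common control methods are push-button and voice-activated. A voice-activated microphone can switch itself on automatically according to the input volume. However, in a noisy environment, or when a speaker is too close to neighboring microphones while talking, microphones can be switched on by mistake, causing serious problems such as unintended audio being broadcast and echo, which in turn degrade meeting quality. We therefore propose a multimodal microphone control system that mitigates this false-activation problem. In addition to the voice-activated mode, we analyze the speaker's state through a miniature camera: when the speaker's mouth shows speaking motion, or the speaker leans forward, the system automatically controls the microphone state according to the situation. Finally, we run experiments on the proposed system and verify the results, showing that it can operate in noisy environments, reduces false activation, and reflects the state of the microphone system in real time.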
Abstract
Although many types of meeting-room microphone control systems are on the market, the most common control methods are either push-button or voice-activated. Voice-activated microphones turn on automatically based on the input volume. In a noisy environment, however, they can be activated by mistake, causing further problems such as unintended audio broadcasting and echo, which seriously affect the quality of the conference. In view of this, we propose a multimodal microphone control system to address these problems. In addition to using the microphone volume, we analyze the behavior of the speaker through a miniature camera. Once the speaker's mouth moves or the speaker performs a specific behavior, the system automatically changes the microphone status accordingly. Experiments show that the proposed system can respond to the speaker's behavior immediately.

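Table of Contents
Thesis Approval Certificate i
Thesis Public Access Authorization ii
Acknowledgments iii
Abstract (Chinese) iv
Abstract (English) v
Table of Contents vi
List of Figures viii
List of Tables x
1. Introduction 1
  1.1 Thesis Overview 2
  1.2 Contributions 2
  1.3 Thesis Organization 3
2. Literature Review 4
3. Methodology 9
  3.1 Nose Detection 10
  3.2 Motion Detection 15
  3.3 Behavior Detection 21
  3.4 Multimodal Microphone Control System 22
4. System Implementation 26
  4.1 System Environment and Device Architecture 27
  4.2 Implementation Details 31
  4.3 Experimental Results 42
    i. Nose Detection Module 42
    ii. Mouth-Region Motion Detection Module 42
    iii. Forward-Lean Detection Module 43
    iv. Multimodal Microphone Control System 44
  4.4 Algorithm Runtime Analysis 51
  4.5 Multi-User Test Questionnaire Results 52
5. Conclusion 53
References 54
Appendix 58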

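Fulltext
This electronic full text is licensed to users solely for academic research, for personal, non-profit searching, reading, and printing. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid breaking the law.
Thesis access permission: user-defined availability
Available: Campus: available; Off-campus: available
etd-0614115-120518.pdf
Printed copies
Public-access information for printed theses is relatively complete from Academic Year 102 onward. To inquire about printed theses from Academic Year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available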

Office of Library and Information Services, National Sun Yat-sen University │ Contact Us: 2452 Thesis Format Review Team, Mail │ Development and operations: Knowledge Innovation Division, LIS