National Sun Yat-sen University, thesis/dissertation: 哼唱式卡拉OK歌曲搜尋系統之設計研究 (A Design of Karaoke Music Retrieval System by Acoustic Input)

Title	哼唱式卡拉OK歌曲搜尋系統之設計研究 (A Design of Karaoke Music Retrieval System by Acoustic Input)
Department	Department of Electrical Engineering
Year, semester	Academic year 91 (2002-2003), second (spring) semester
Language	Chinese
Degree	Master
Number of pages	61
Author	蔡旭曜 (Shiu-Iau Tsai)
Advisor	陳志堅 (Chih-Chien Chen)
Convenor	汪啟茂 (Chii-Maw Uang)
Advisory Committee	李聰 (Tsung Lee)
Date of Exam	2003-07-25
Date of Submission	2003-08-11
Keywords	K-NN, auto-correlation, pitch tracking, retrieval, DTW, dynamic time warping, K-means, dynamic programming, FFT, Karaoke

Abstract
The objective of this thesis is to design a system that retrieves songs from acoustic input. The computer identifies the song that a Karaoke user hums or sings, in whole or in part, so that the user can then select the song they want to hear. Note segmentation is based on changes in the amplitude energy of the signal, and the k-nearest neighbor (K-NN) technique compensates where energy alone is insufficient. To improve system performance, the pitch period estimation algorithm is reformulated using a basic principle from communication systems, which raises its computational efficiency. In addition, a large database of popular songs was built for the system to make it more practical.
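The record does not say which communication-systems principle the thesis applies to pitch period estimation. Because the keywords list both auto-correlation and FFT, one plausible reading (an assumption, not something this record confirms) is the Wiener-Khinchin relation: the autocorrelation of a frame equals the inverse Fourier transform of its power spectrum, so it can be computed in O(N log N) rather than O(N^2). The following minimal sketch illustrates that idea only. The function names, the 80-800 Hz search range, and the small radix-2 FFT are all hypothetical choices made for illustration and are not taken from the thesis.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(x):
    """Inverse FFT computed through the forward FFT and conjugation."""
    n = len(x)
    y = fft([v.conjugate() for v in x])
    return [v.conjugate() / n for v in y]


def autocorrelation(frame):
    """Linear autocorrelation of a frame via its power spectrum."""
    n = len(frame)
    size = 1
    while size < 2 * n:  # zero-pad so circular correlation equals linear
        size *= 2
    spectrum = fft([complex(s) for s in frame] + [0j] * (size - n))
    power = [abs(c) ** 2 for c in spectrum]
    r = ifft(power)
    return [r[k].real for k in range(n)]


def estimate_pitch(frame, sample_rate, fmin=80.0, fmax=800.0):
    """Return the frequency (Hz) whose lag maximizes the autocorrelation.

    fmin and fmax are assumed bounds for a sung or hummed voice.
    """
    r = autocorrelation(frame)
    lag_min = int(sample_rate / fmax)
    lag_max = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag = max(range(lag_min, lag_max + 1), key=lambda k: r[k])
    return sample_rate / best_lag


# Usage: a synthetic 220 Hz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 220.0 * n / sample_rate) for n in range(1024)]
print(estimate_pitch(tone, sample_rate))  # an estimate near 220 Hz
```

The estimate is limited to integer lags, so it is only approximate. A practical system would likely add voicing detection and peak interpolation, neither of which is described in this record.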

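Table of Contents
Acknowledgments
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1-1 Research Background
  1-2 Research Method
  1-3 Basic System Architecture and Workflow
  1-4 Overview of Related Work
  1-5 Chapter Outline
Chapter 2 Feature Parameter Extraction
  2-1 Introduction
  2-2 Fundamental Frequency (Pitch) Extraction
  2-3 Feature Analysis of Singing Voice Signals
Chapter 3 Classification and Data Reduction
  3-1 Introduction
  3-2 k-Nearest Neighbor Classification
  3-3 k-means Clustering
Chapter 4 Note Extraction
  4-1 Introduction
  4-2 Note Extraction
  4-3 Note Extraction for Retrieval by Humming
  4-4 Note Extraction for Retrieval by Singing
Chapter 5 Similarity Matching
  5-1 Principles of Dynamic Programming
  5-2 Principles of Dynamic Time Warping (DTW)
  5-3 Song Recognition Model
  5-4 Pitch Adjustment
Chapter 6 Experimental Results
  6-1 Introduction
  6-2 Experiments on Retrieval by Humming
  6-3 Experiments on Retrieval by Singing
Chapter 7 Conclusion
References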


Fulltext
The electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research. Please observe the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it.
Thesis access permission	Not available, on campus or off campus
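Printed copies
Availability	Publicly available
Public-access information for printed theses is relatively complete from academic year 102 onward. For printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services.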
