博碩士論文 etd-0725115-141523 詳細資訊
Title page for etd-0725115-141523
論文名稱
Title
樂曲聲紋擷取方法改良研究
Research on Improving Audio Fingerprinting Extraction Method
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
45
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2015-06-15
繳交日期
Date of Submission
2015-09-01
關鍵字
Keywords
地標、聲音指紋、音樂資訊檢索、噪音強健性、雜湊鍵
landmark, music information retrieval, audio fingerprint, noise robustness, hash key
統計
Statistics
本論文已被瀏覽 5682 次,被下載 562
The thesis/dissertation has been browsed 5682 times, has been downloaded 562 times.
中文摘要
在本篇論文中,我們提出一個基於地標的聲音指紋方法,並且有效地在噪音環境下實現音樂資訊檢索的任務。為了提升本系統的噪音強健性,我們應用高通濾波器濾波在頻譜的各個音框上。接著從濾波的頻譜中,依水平和垂直的方向找出峰,並且將這兩個方向所找出的峰取交集。當這些峰根據時間與空間找出相鄰的峰時,則會以地標的形式來表示彼此之間的關係。每一個地標都代表一個聲紋,最終所有的聲紋就會被轉換成雜湊鍵存入資料庫中。在查詢時,只要根據查詢歌曲的聲紋分佈,就能快速地從資料庫中找出最有可能的歌曲資訊。為了評估本系統的噪音強健性,我們收集了一萬首多種不同風格和語言的歌曲,並且加入多種不同的噪音條件。與知名方法相比,改變擷取峰的方式將使得每首歌曲的聲紋變得更少也變得更具鑑別性。且在噪音存在的情況下,辨識結果也明顯優於知名的方法。
Abstract
In this thesis, we propose a robust landmark-based audio fingerprinting method for music information retrieval in noisy environments. To increase the robustness of the fingerprints, we apply a high-pass filter to each frame of the spectrogram. We then detect peaks along both the horizontal and the vertical direction and keep only the peaks found in both, which reduces the number of peak pairs. Each landmark is a pair of adjacent peaks, described by the temporal and spectral distances between them. Finally, the landmarks are mapped to hash values that serve as audio fingerprints and are stored in a database. In the query stage, the most likely song is identified quickly from the distribution of the query's fingerprints. To evaluate the proposed approach, we collected 10,000 songs in a variety of styles and languages and added noise of different types and intensities. Experimental results show that the proposed system significantly outperforms a baseline system based on a well-known method. Because far fewer fingerprints are extracted, the computational cost of retrieval is also much lower. Furthermore, performance degrades much more gracefully in the presence of noise, which indicates that the proposed method is robust to noise.
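The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: this record does not give the filter coefficients, the size of the pairing zone, or the hash layout, so the first-order difference filter, the `max_dt`/`max_df` limits, the tuple used as hash key, and every function name below are assumptions made for illustration. The sketch also assumes the usual spectrogram layout, with time on the horizontal axis and frequency on the vertical axis.

```python
import random
from collections import Counter, defaultdict

def high_pass(frame, alpha=0.9):
    # Assumed filter form: first-order difference across the bins of one frame.
    return [frame[0]] + [frame[k] - alpha * frame[k - 1] for k in range(1, len(frame))]

def find_peaks(spec):
    """Return (t, f) bins that are local maxima along BOTH directions."""
    peaks = []
    for t in range(1, len(spec) - 1):
        for f in range(1, len(spec[0]) - 1):
            v = spec[t][f]
            vertical = v > spec[t][f - 1] and v > spec[t][f + 1]    # within the frame
            horizontal = v > spec[t - 1][f] and v > spec[t + 1][f]  # within the channel
            if vertical and horizontal:  # intersection of the two peak sets
                peaks.append((t, f))
    return peaks  # already sorted by time, then frequency

def landmarks(peaks, max_dt=8, max_df=8):
    """Pair each peak with nearby later peaks; return (hash_key, anchor_time)."""
    out = []
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1:]:
            dt, df = t2 - t1, f2 - f1
            if dt > max_dt:
                break  # later peaks are even further away in time
            if dt >= 1 and abs(df) <= max_df:
                out.append(((f1, df, dt), t1))  # hypothetical key layout
    return out

def fingerprint(spec):
    filtered = [high_pass(frame) for frame in spec]
    return landmarks(find_peaks(filtered))

def build_index(songs):
    """Map each hash key to the (song_id, anchor_time) pairs that produced it."""
    index = defaultdict(list)
    for song_id, spec in songs.items():
        for key, t in fingerprint(spec):
            index[key].append((song_id, t))
    return index

def query(index, spec):
    """Vote on (song_id, time_offset); a true match piles up at one offset."""
    votes = Counter()
    for key, t_query in fingerprint(spec):
        for song_id, t_song in index.get(key, ()):
            votes[(song_id, t_song - t_query)] += 1
    return votes.most_common(1)[0] if votes else None

def toy_spectrogram(seed, frames=60, bins=16):
    # Random magnitudes stand in for a real spectrogram (frames x bins).
    rng = random.Random(seed)
    return [[rng.random() for _ in range(bins)] for _ in range(frames)]

songs = {"a": toy_spectrogram(1), "b": toy_spectrogram(2)}
index = build_index(songs)
best = query(index, songs["a"][10:30])  # 20-frame excerpt starting at frame 10
print(best)  # ((song_id, time_offset), votes)
```

Because every landmark of a clean excerpt also occurs in the full song at a constant shift, the votes concentrate on a single (song, offset) pair, while coincidental key collisions spread over many offsets. Added noise would remove or shift some peaks, which lowers the winning count without moving its offset.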
目次 Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Background
1.2 Research Motivation
1.3 Literature Review
1.4 Thesis Organization
Chapter 2 Landmark-Based Music Information Retrieval System
2.1 System Architecture
2.2 Signal Processing
2.3 Spectrogram Filtering
2.4 Peak Extraction
2.4.1 Envelope Construction
2.4.2 Forward
2.4.3 Backward
2.5 Fingerprint Construction
2.6 Database Construction
2.7 Song Clip Query
2.7.1 Song ID Ranking
2.7.2 Time Offset
Chapter 3 Improved Methods for the Music Information Retrieval System
3.1 System Improvements
3.1.1 Changing the Filtering Method
3.1.2 Changing the Peak Extraction Method
3.1.2.1 Envelope Construction
3.1.2.2 Forward
3.1.2.3 Backward
3.2 Four Improved Methods
3.2.1 CFFP (channel filter frame peak)
3.2.2 FFFP (frame filter frame peak)
3.2.3 FFCP (frame filter channel peak)
3.2.4 CFCP (channel filter channel peak)
Chapter 4 Database and Experimental Setup
4.1 Database Collection
4.2 Experimental Evaluation
4.2.1 Clean Song Clip Experiments
4.2.2 Noisy Environment Experiments
4.2.3 Database Size Comparison
Chapter 5 Conclusions and Future Work
參考文獻 References
[1] J. Xue, G. Wichern, H. Thornburg, and A. Spanias, “Fast query by example of environmental sounds via robust and efficient cluster-based indexing,” IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 5–8, April 2008.
[2] P. Cano, E. Batlle, T. Kalker, and J. Haitsma, “A review of audio fingerprinting,” Journal of VLSI Signal Processing Systems, vol. 41, pp. 271–284, Nov 2005.
[3] C. J. C. Burges, J. C. Platt, and S. Jana, “Distortion discriminant analysis for audio fingerprinting,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 11, pp. 165–174, May 2003.
[4] P. Cano, E. Batlle, T. Kalker, and J. Haitsma, “A review of algorithms for audio fingerprinting,” in Multimedia Signal Processing, 2002 IEEE Workshop on, pp. 169–173, 2002.
[5] E. Batlle, J. Masip, and P. Cano, “System analysis and performance tuning for broadcast audio fingerprinting,” Proceedings of 6th International Conference on Digital Audio Effects, 2003.
[6] A. Sinitsyn, “Duplicate song detection using audio fingerprinting for consumer electronics devices,” IEEE International Symposium on Consumer Electronics, pp. 1–6, June 2006.
[7] S. Baluja and M. Covell, “Content fingerprinting using wavelets,” Proceedings of European Conference on Visual Media Production (CVMP), Nov 2006.
[8] J. Haitsma and T. Kalker, “A highly robust audio fingerprinting system,” Proceedings of International Conference on Music Information Retrieval (ISMIR), 2002.
[9] Y. Ke, D. Hoiem, and R. Sukthankar, “Computer vision for music identification,” Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2005.
[10] D. G. Lowe, “Object recognition from local scale-invariant features,” Proceedings of International Conference on Computer Vision, 1999.
[11] S. Baluja and M. Covell, “Waveprint: Efficient wavelet-based audio fingerprinting,” Journal of Pattern Recognition, vol. 41, pp. 3467–3480, Nov 2008.
[12] A. L. Wang, “An industrial-strength audio search algorithm,” Proceedings of International Conference on Music Information Retrieval (ISMIR), 2003.
[13] A. Wang, “The shazam music recognition service,” Communications of the ACM, pp. 44–48, 2006.
[14] V. Chandrasekhar, M. Sharifi, and D. A. Ross, “Survey and evaluation of audio fingerprinting schemes for mobile query-by-example applications,” Proceedings of International Conference on Music Information Retrieval (ISMIR), vol. 20, pp. 801–806, 2011.
[15] T. Jiang, K. Xiang, J. Lu, R. Wu, X. Li, and F. Dai, “A large scale audio fingerprinting system,” Advances in Multimedia Information Processing–PCM 2013, pp. 866–875, 2013.
[16] D. Ellis, “Robust Landmark-Based Audio Fingerprinting.” web resource, http://labrosa.ee.columbia.edu/matlab/fingerprint/, 2009.
[17] J. Shi, X. Yu, H. Liu, and W. Xiong, “Audio fingerprinting based on salient points for audio retrieval,” 2013.
[18] C.-C. Wang, M.-H. Lin, J.-S. R. Jang, and W. Liou, “An effective re-ranking method based on learning to rank for improving audio fingerprinting,” in Asia-Pacific Signal and Information Processing Association, 2014 Annual Summit and Conference (APSIPA), pp. 1–4, 2014.
電子全文 Fulltext
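This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.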
論文使用權限 Thesis access permission:校內校外完全公開 unrestricted
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
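Public-access information for printed theses is relatively complete from academic year 102 onward. To look up public-access information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services (圖資處). We apologize for any inconvenience.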
開放時間 available 已公開 available
