Title page for etd-0126110-115934
Title: Video retrieval based on fractal orthogonal bases and temporal graph (以碎形正交基底和時間情境圖為基礎進行之視訊檢索)
Department:
Year, semester:
Language:
Degree:
Number of pages: 80
Author:
Advisor:
Convenor:
Advisory Committee:
Date of Exam: 2010-01-20
Date of Submission: 2010-01-26
Keywords: Fractal Orthogonal Bases, Video Retrieval
Statistics: This thesis/dissertation has been browsed 5665 times and downloaded 0 times.
Chinese Abstract
This thesis uses fractal orthogonal bases as the basis for measuring video content similarity, and uses a temporal graph to represent the structure and flow of a video along the time axis. Retrieval proceeds in five steps: (1) video summarization, which extracts key-frames from the video; (2) normalized-cut clustering, which classifies the key-frames; (3) temporal graph construction, which organizes key-frames by their temporal order in the video; (4) conversion of the directed graph into strings, a one-to-one mapping; and (5) string similarity measurement, split into video structure and video content. With this method, the structure of a video is expressed in terms of a main storyline and branch storylines, so users can not only browse the video efficiently but also focus on whichever part of its structure interests them.
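As a rough illustration of step (1), the sketch below (in Python, with NumPy as an assumed dependency) selects, within a shot, the frame whose reconstruction of all other frames has the lowest total distortion. It uses plain mean-squared error rather than the motion-compensated reconstruction the thesis describes, so it is a simplified stand-in, not the thesis's algorithm.

    import numpy as np

    def keyframe_by_min_distortion(frames):
        """Pick the frame that best reconstructs the rest of the shot.

        Simplified stand-in for the thesis's criterion: here "distortion"
        is plain MSE against every other frame, with no motion compensation.
        frames: array of shape (n_frames, height, width).
        """
        distortions = [np.mean((frames - f) ** 2) for f in frames]
        return int(np.argmin(distortions))  # index of the key-frame

    # Toy usage on ten random 8x8 "frames"
    rng = np.random.default_rng(0)
    print(keyframe_by_min_distortion(rng.random((10, 8, 8))))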
In general, video retrieval comprises three major stages: analyzing the video, organizing the analysis results, and measuring similarity. For video analysis, this work extracts key-frames with a distortion-variation criterion: the image sequence is reconstructed by motion compensation, and the reference frame that yields the minimum-distortion reconstruction is taken as a key-frame. To organize the results, the key-frames are first classified with normalized-cut (N-cut) clustering. The resulting clusters then serve as nodes, with edges between clusters whose key-frames are temporally adjacent in the video, which yields the temporal graph. This temporal graph is a directed graph, and the thesis proposes a new approach to video structure analysis: a shortest-path search over the graph, whose recorded paths are converted into structured strings that identify the main-structure string and the sub-structure strings.
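A minimal sketch of the graph side of this stage, under simplifying assumptions: the directed graph is built from the temporal order of key-frame cluster labels, and a shortest path by hop count (plain BFS, standing in for whatever path search the thesis actually uses) from the first cluster to the last serves as the main-structure string. The branch-line extraction is not reproduced here.

    from collections import deque

    def temporal_graph(labels):
        """Directed graph: nodes are cluster labels; an edge a -> b is
        added for each temporally adjacent key-frame pair with a != b."""
        graph = {lab: set() for lab in labels}
        for a, b in zip(labels, labels[1:]):
            if a != b:
                graph[a].add(b)
        return graph

    def main_structure_string(labels):
        """BFS shortest path (fewest hops) from the first cluster to the
        last; the visited label sequence is the main-structure string."""
        graph, goal = temporal_graph(labels), labels[-1]
        queue, seen = deque([[labels[0]]]), {labels[0]}
        while queue:
            path = queue.popleft()
            if path[-1] == goal:
                return "".join(path)
            for nxt in graph[path[-1]] - seen:
                seen.add(nxt)
                queue.append(path + [nxt])
        return ""

    # Key-frame cluster labels in temporal order
    print(main_structure_string(list("ABACBD")))  # -> "ABD"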
Similarity measurement is divided into two parts, string structure and video content. For string structure, edit distance is used to measure the similarity between main-structure strings and between sub-structure strings. The strings found to be highly similar are then compared at the content level using the fractal orthogonal bases. The test sequences come from the Open Video collection, and the retrieval results show that the system searches effectively and efficiently finds the structures that interest the user.
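For reference, the structural comparison can be carried out with the standard dynamic-programming edit distance sketched below; the thesis's exact edit costs and its recursive handling of branch lines are not shown.

    def edit_distance(s, t):
        """Classic Levenshtein distance: prev[j] holds the distance
        between the current prefix of s and t[:j]."""
        prev = list(range(len(t) + 1))
        for i, cs in enumerate(s, 1):
            curr = [i]
            for j, ct in enumerate(t, 1):
                cost = 0 if cs == ct else 1
                curr.append(min(prev[j] + 1,          # deletion
                                curr[j - 1] + 1,      # insertion
                                prev[j - 1] + cost))  # substitution
            prev = curr
        return prev[-1]

    # Structure strings from two videos; a smaller distance means
    # more similar structure.
    print(edit_distance("ABD", "ABCD"))  # -> 1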
Abstract
In this paper, we present a structural approach to video retrieval with fractal orthogonal bases, composed of five steps: video summarization (extracting key-frames from the video), normalized graph cuts (classifying the key-frames), temporal graph construction (organizing key-frames by their time of appearance in the video), transformation of the directed graph into strings (a one-to-one mapping), and string similarity comparison (covering both string structure and content), which together establish a framework for the video's contents. With this information, the structure of the video and its complementary knowledge can be built up in terms of a main line and branch lines. Therefore, users can not only browse the video efficiently but also focus on the structures they are interested in.
To construct the underlying system, we employ a distortion metric to extract key-frames from the video and classify them with normalized graph cuts, so that shots are linked together based on their content. After the relation graph is constructed, it is transformed into strings with an enriched structure. The resulting clusters form a directed graph, and a shortest-path algorithm is proposed to find the main structure of the video. String similarity divides into string structure and content. For string structure, we apply edit distance to the main structure and, recursively, to the branch lines. After the structural comparison, the most similar strings are compared at the content level using fractal orthogonal bases, which guarantee that similar indices correspond to similar images. The results demonstrate that our system achieves good retrieval performance and information coverage.
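The key-frame classification step is a normalized-cut partition of a frame-affinity graph, which in practice can be approximated by spectral clustering. A sketch under stated assumptions: the key-frame descriptors below are random placeholders for real features such as color histograms, and scikit-learn's SpectralClustering is used as an off-the-shelf stand-in for the thesis's own N-cut implementation.

    import numpy as np
    from sklearn.cluster import SpectralClustering

    # Hypothetical key-frame descriptors, one row per key-frame.
    rng = np.random.default_rng(0)
    features = rng.random((12, 16))

    # Affinity: Gaussian kernel on pairwise distances, a common choice
    # for normalized-cut methods.
    dists = np.linalg.norm(features[:, None] - features[None, :], axis=-1)
    affinity = np.exp(-dists ** 2 / (2 * dists.std() ** 2))

    # Normalized-cut-style partition of the key-frames into three clusters;
    # the cluster labels become the nodes of the temporal graph.
    labels = SpectralClustering(n_clusters=3, affinity="precomputed",
                                random_state=0).fit_predict(affinity)
    print(labels)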
Table of Contents
Chinese Abstract
English Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1: Related Research on Video Retrieval
1.1 Introduction
1.2 Research Related to This Thesis
1.2.1 Shot-Change Detection
1.2.2 Key-Frame Extraction
1.2.3 Clustering Methods
1.3 Related Work on Video Retrieval
1.3.1 Video Retrieval Based on Object Motion Trajectories
1.3.2 A Multimodal and Multilevel Ranking Scheme for Large-Scale Video Retrieval
1.3.3 A Multiple-Instance Image Retrieval Method Based on Orthogonal Bases
Chapter 2: Fundamental Theory
2.1 Distortion Variation
2.2 Clustering Methods
2.2.1 The Normalized Cut Clustering Algorithm
2.2.2 Optimal Partitioning
2.3 Temporal Graph
2.4 Fractal Orthogonal Bases
Chapter 3: Experimental Procedure
Chapter 4: Experimental Results
4.1 Test Sequences
4.2 Experimental Results and Analysis
4.3 Comparison with Other Methods
Chapter 5: Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
References
Fulltext
This electronic fulltext is licensed to users solely for personal, non-profit retrieval, reading, and printing for the purpose of academic research. Please observe the relevant provisions of the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, so as to avoid violating the law.
Thesis access permission: not available on campus or off campus
Available:
Campus: never available
Off-campus: never available


Printed copies
Public-availability information for printed theses is relatively complete from the 102nd academic year (2013) onward. To look up the availability of printed theses from the 101st academic year (2012) or earlier, please contact the printed-thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: already made public
