National Sun Yat-sen University, thesis/dissertation, Fast Mode Decision Mechanism for Coding Efficiency Improvement in H.264/AVC and SVC

Title	Fast Mode Decision Mechanism for Coding Efficiency Improvement in H.264/AVC and SVC
Department	Department of Electrical Engineering
Year, semester	The spring semester of Academic Year 97	Language	English
Degree	Master	Number of pages	126
Author	Bo-Yin Chou (周伯胤)
Advisor	Chia-Hung Yeh (葉家宏)
Convenor	Ming-Sui Lee (李明穗)
Advisory Committee	Zi-Tsan Chou (周孜燦); Jun-Wei Hsieh (謝君偉); Min-Kuan C. Chang (張敏寬)
Date of Exam	2009-07-06	Date of Submission	2009-08-04
Keywords	fast mode decision, video coding, CBP, SVC, H.264/AVC
Statistics	The thesis/dissertation has been browsed 5680 times and downloaded 908 times.

Abstract
To speed up the encoding process of H.264/AVC and Scalable Video Coding (SVC), this thesis proposes two fast mode decision algorithms: a Temporal and Spatial Correlation-based Merging and Splitting (TSCMS) algorithm, applied to H.264/AVC, and a Coded Block Pattern (CBP)-based algorithm, applied to SVC. TSCMS uses Temporal Correlation (TC) to predict the Motion Vectors (MVs) of the 8×8 blocks in each macroblock, and a merging and splitting procedure then predicts the motion vectors of the other block sizes. Finally, spatial correlation replaces the conventional merge scheme to speed up the merging into 16×16 blocks. The CBP value is the syntax element in each Macroblock (MB) header that indicates whether the MB contains residual information. The CBP-based algorithm uses the CBP values and MB modes of adjacent MBs in the Enhancement Layer (EL), together with the modes of the co-located Base Layer (BL) MB, to reduce the number of modes that must be tested for the current EL MB. Experimental results show that, compared with JM 12.3, JSVM 9.12, and other existing methods, the proposed algorithms significantly reduce encoder computation with negligible PSNR degradation and bit-rate increase.
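As a rough illustration of what a merging step of this kind can look like, the sketch below chooses a macroblock partition from the four 8×8 motion vectors by checking whether neighbouring vectors are similar. It is a generic merge procedure written under stated assumptions, not the TSCMS algorithm itself: the function names `similar` and `merge_partition`, the per-component `threshold`, and the decision order are all hypothetical, since the abstract does not specify them.

```python
from typing import List, Tuple

# A motion vector as (dx, dy). The unit (e.g. quarter-pel) is an assumption.
MV = Tuple[int, int]


def similar(a: MV, b: MV, threshold: int = 1) -> bool:
    """Return True if both components differ by at most `threshold` (hypothetical rule)."""
    return abs(a[0] - b[0]) <= threshold and abs(a[1] - b[1]) <= threshold


def merge_partition(mvs: List[MV], threshold: int = 1) -> str:
    """Pick a partition for one macroblock from its four 8x8 motion vectors.

    `mvs` is in raster order: top-left, top-right, bottom-left, bottom-right.
    Illustrative only; the thesis's actual merging criteria are not given
    in the abstract.
    """
    tl, tr, bl, br = mvs
    top = similar(tl, tr, threshold)      # top two 8x8 blocks agree
    bottom = similar(bl, br, threshold)   # bottom two 8x8 blocks agree
    left = similar(tl, bl, threshold)     # left two 8x8 blocks agree
    right = similar(tr, br, threshold)    # right two 8x8 blocks agree

    if top and bottom and left and right:
        return "16x16"   # all neighbours agree: merge into one block
    if top and bottom:
        return "16x8"    # two horizontal halves
    if left and right:
        return "8x16"    # two vertical halves
    return "8x8"         # keep the four 8x8 blocks


if __name__ == "__main__":
    print(merge_partition([(4, 0), (4, 0), (-4, 0), (-4, 0)]))
```

In a fast mode decision scheme, the returned partition would narrow the set of modes passed to full rate-distortion evaluation; how the candidates are then refined or split further is left out of this sketch.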

Table of Contents
CHAPTER 1 Introduction
  1.1 Overview of Video Coding
  1.2 Overview of H.264/AVC Video Coding Standard
    1.2.1 Discrete Cosine Transform (DCT)
    1.2.2 Variable Block Size
    1.2.3 Multiple Reference Frames
  1.3 Motivation
  1.4 The Organization of the Thesis
CHAPTER 2 Overview of Scalable Video Coding and Relevant Work
  2.1 Background of Scalable Video Coding
    2.1.1 Spatial Scalability
    2.1.2 Temporal Scalability
    2.1.3 Quality Scalability
  2.2 Inter-layer Prediction
    2.2.1 Inter-layer Motion Prediction
    2.2.2 Inter-layer Intra Prediction
    2.2.3 Inter-layer Residual Prediction
  2.3 Difference between SVC and H.264/AVC
  2.4 Rate-distortion Performance of SVC and H.264/AVC
  2.5 Previous Works in H.264/AVC and Scalable Video Coding
    2.5.1 Using H.264 Coded Block Patterns for Fast Inter-Mode Selection [22]
    2.5.2 Layer-Adaptive Mode Decision and Motion Search for Scalable Video Coding with Combined Coarse Granular Scalability (CGS) and Temporal Scalability [23]
CHAPTER 3 Proposed Temporal and Spatial Correlation-based Merging and Splitting Fast Mode Decision Algorithm in H.264/AVC
  3.1 Background of Merging and Splitting Procedure
  3.2 Proposed Algorithm
    3.2.1 Temporal Correlation
    3.2.2 Spatial Correlation
    3.2.3 Temporal and Spatial Correlation-based Merging and Splitting Fast Mode Decision Algorithm
CHAPTER 4 Proposed CBP-based Fast Mode Decision Algorithm in SVC
  4.1 The Analysis of CBP Characteristics
  4.2 Analysis of the Largest Temporal Level Information
  4.3 CBP-based Fast Mode Decision Algorithm
    4.3.1 CBP-based Fast Mode Decision
    4.3.2 Temporal Relativity Mode Selection Method
CHAPTER 5 Experimental Results
  5.1 Testing Platform of Experimental Results
  5.2 Objective Measurement
  5.3 Experimental Results of TSCMS Fast Mode Decision Algorithm
  5.4 Experimental Results of CBP-Based Fast Mode Decision Algorithm
    5.4.1 Simulation Results of 2-layer SVC
    5.4.2 Simulation Results of 4-layer SVC
CHAPTER 6 Conclusions and Future Work
  6.1 Conclusions
  6.2 Future Work
Bibliography
Curriculum Vitae
Publications

References
[1] K.-N. Ngan, C.-W. Yap and K.-T. Tan, Video Coding for Wireless Communications. New Jersey: Prentice Hall, 2002.
[2] A.-M. Tekalp, Digital Video Processing. New Jersey: Prentice Hall PTR, 1995.
[3] Y. Wang, J. Ostermann and Y.-Q. Zhang, Video Processing and Communications. New Jersey: Prentice Hall, 2002.
[4] M.-T. Sun and A.-R. Reibman, Compressed Video over Networks. New York: Marcel Dekker, 2001.
[5] Video codec for audiovisual services at p×64 kbit/s, CCITT Recommendation H.261, 1990.
[6] CCITT SGXV, “Description of reference model 8 (RM8),” Document 525, Working Party XV/4, Specialists Group on Coding for Visual Telephony, 1989.
[7] ITU Telecommunication Standardization Sector LBC-95, Study Group 15, Working Party 15/1, Expert’s Group on Very Low Bitrate Visual Telephony, available from Digital Video Coding Group, Telenor Research and Development; or via http://www.nta.no/brukere/DVC/tmn5, 1998.
[8] H. Yu, F. Pan and Z. Lin, “Content adaptive rate control for H.264,” Int. J. of Innovative Computing, Information and Control, vol. 1, no. 4, pp. 685-700, 2005.
[9] ISO/IEC CD 11172-2 (MPEG-1 Video), “Information technology - coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbit/s,” video, 1993.
[10] ISO/IEC CD 13818-2-ITU-T H.262 (MPEG-2 Video), “Information technology - generic coding of moving pictures and associated audio information,” video, 1995.
[11] ITU-T Recommendation H.264 & ISO/IEC 14496-10 (MPEG-4) AVC. Advanced video coding for generic audiovisual services. (version 1: 2003, version 2: 2004, version 3: 2005).
[12] T. Wiegand, G. Sullivan, J. Reichel, H. Schwarz and M. Wien, “Joint draft 10 of SVC amendment,” ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, JVT-W201, April 2007.
[13] H. Schwarz, D. Marpe and T. Wiegand, “Overview of the scalable video coding extension of the H.264/AVC standard,” IEEE Transactions on Circuits and Systems for Video Technology (CSVT), vol. 17, no. 9, pp. 1103-1120, 2007.
[14] C. A. Segall and G. J. Sullivan, “Spatial scalability within the H.264/AVC scalable video coding extension,” IEEE Transactions on Circuits and Systems for Video Technology (CSVT), vol. 17, no. 9, pp. 1121-1135, 2007.
[15] K. De Wolf, D. De Schrijver, S. De Zutter and R. Van de Walle, “Scalable video coding: analysis and coding performance of inter-layer prediction,” in Proceedings of IEEE International Symposium on Signal Processing and Its Applications (ISSPA), pp. 1-4, 2007.
[16] J. Vieron, M. Wien, and H. Schwarz, “JSVM 10 software,” ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, JVT-W201, April 2007.
[17] Joint Video Team software JM14.2, December 2008. http://bs.hhi.de/~suehring/tml/download/
[18] Y. K. Tu, J. F. Yang, Y. N. Shen and M. T. Sun, “Fast variable-size block motion estimation using merging procedure with an adaptive threshold,” in Proceedings of IEEE International Conference on Multimedia & Expo (ICME), vol. 2, pp. 789-792, 2003.
[19] K.-C. Hou, M.-J. Chen and C.-T. Hsu, “Fast motion estimation by motion vector merging procedure for H.264,” in Proceedings of IEEE International Conference on Multimedia & Expo (ICME), Amsterdam, Netherlands, pp. 1444-1447, 2005.
[20] Z. Zhou, M. T. Sun, and Y. F. Hsu, “Fast variable block-size motion estimation based on merge and split procedures for H.264/MPEG-4 AVC,” in Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS), vol. 3, pp. 725-728, 2004.
[21] A. M. Tourapis, O. C. Au and M. L. Liou, “Predictive motion vector field adaptive search technique enhancing block-based motion estimation,” in Proceedings of SPIE Conference on Visual Communication and Image Processing, pp. 883-892, 2001.
[22] J. Xin, M. T. Sun and V. Hsu, “Diversity-based fast block motion estimation,” in Proceedings of IEEE International Conference on Multimedia & Expo (ICME), pp. 525-528, 2003.
[23] H. Li, Z.-G. Li, and C. Wen, “Fast mode decision for coarse grain SNR scalable video coding,” in Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS), pp. 545-548, 2006.
[24] H. Li, Z.-G. Li, and C. Wen, “Fast mode decision for spatial scalable video coding,” in Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS), pp. 3005-3008, 2006.
[25] H. Li, Z.-G. Li, and C. Wen, “Fast mode decision algorithm for inter-frame coding in fully scalable video coding,” IEEE Transactions on Circuits and Systems for Video Technology (CSVT), vol. 16, no. 7, pp. 889-895, 2006.
[26] Q. Dai, D. Zhu, and R. Ding, “Fast mode decision for inter prediction in H.264,” in Proceedings of IEEE International Conference on Image Processing (ICIP), pp. 119-122, 2004.
[27] B.-Y. Chen and S.-H. Yang, “Using H.264 coded block patterns for fast inter-mode selection,” in Proceedings of IEEE International Conference on Multimedia and Expo (ICME), pp. 721-724, 2008.
[28] H.-C. Lin, W.-H. Peng, H.-M. Hang, and W.-J. Ho, “Layer-adaptive mode decision and motion search for scalable video coding with combined coarse granular scalability (CGS) and temporal scalability,” in Proceedings of IEEE International Conference on Image Processing (ICIP), pp. 289-292, 2007.
[29] J. Lee and B. Jeon, “Fast mode decision for H.264,” in Proceedings of IEEE International Conference on Multimedia and Expo (ICME), pp. 1131-1134, 2004.
[30] S. Zhu and K. K. Ma, “A new diamond search algorithm for fast block-matching motion estimation,” IEEE Transactions on Image Processing, vol. 9, pp. 287-290, 2000.

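Fulltext
This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid breaking the law. Thesis access permission: withheld (released on and off campus after one year). Available: Campus: available; Off-campus: available. etd-0804109-172546.pdf
Printed copies
Public information on printed theses is relatively complete from Academic Year 102 onward. To inquire about public information on printed theses from Academic Year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience. Available: available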

Office of Library and Information Services, National Sun Yat-sen University │ Contact Us : 2452 Thesis Format Review Team , Mail │ Development and operations : Knowledge Innovation Division, LIS