國立中山大學,National Sun Yat-sen University,學位論文,thesis/dissertation,運用稀疏編碼和卷積神經網路提升高效率視訊編碼效能,Coding Performance Improvement using Sparse Coding and Convolutional Neural Network in High Efficiency Video Coding

論文名稱 Title	運用稀疏編碼和卷積神經網路提升高效率視訊編碼效能 Coding Performance Improvement using Sparse Coding and Convolutional Neural Network in High Efficiency Video Coding
系所名稱 Department	電機工程學系 Department of Electrical Engineering
畢業學年期 Year, semester	105 學年度第 2 學期 The spring semester of Academic Year 105	語文別 Language	英文 English
學位類別 Degree	碩士 Master	頁數 Number of pages	52
研究生 Author	張政騰 Zheng-Teng Zhang
指導教授 Advisor	葉家宏 Chia-Hung Yeh
召集委員 Convenor	范國清 Kuo-Chin Fan
口試委員 Advisory Committee	郭致宏, 鄭文皇, 劉耿豪 Chih-Hung Kuo; Wen-Huang Cheng; Keng-Hao Liu
口試日期 Date of Exam	2017-06-29	繳交日期 Date of Submission	2017-09-11
關鍵字 Keywords	高效率視訊編碼、卷積神經網路、正交匹配追蹤、畫面內編碼、殘餘值編碼 High Efficiency Video Coding (HEVC), Convolutional Neural Network, Orthogonal Matching Pursuit, Residual coding, Intra frame coding
統計 Statistics	本論文已被瀏覽 5719 次，被下載 19 次 The thesis/dissertation has been browsed 5719 times, has been downloaded 19 times.

中文摘要
近年來，隨著科技產業的發展，其對高解析度視訊的需求亦在上升。故時至今日，仍然值得發展視訊編碼的技術以提升視訊的編碼效能。本文提出兩種方法提升高效率視訊編碼的效能。首先，我們提出一新穎的基於正交匹配追蹤的畫面間殘餘值編碼方法，利用正交匹配追蹤獲取殘餘值的稀疏表達係數，從而提升編碼效能。為達此目的，畫面內編碼的殘餘值將用於構建一基於紋理複雜度分析的字典。第二種方法，我們在高效率視訊編碼的畫面內編碼中運用卷積神經網路來提升視訊的編碼效能。通過訓練一個基於殘餘值學習機制的卷積神經網路，預測視訊編碼內重建區塊與原始區塊之間的殘餘失真，強化視訊的畫面品質。實驗結果表明本文的兩種方法均能有效提升視訊編碼效能。
Abstract
In recent years, the demand for high-resolution video has grown alongside the development of the technology industry. It therefore remains worthwhile to keep developing video coding techniques that further improve coding performance. This thesis proposes two methods to improve the coding performance of HEVC. First, we propose a new inter-layer residual coding method based on orthogonal matching pursuit (OMP), which obtains sparse representation vectors that serve as the transform coefficients. To this end, a content-adaptive dictionary is constructed in the I frame based on an analysis of coding unit complexity. Second, we propose a novel convolutional neural network (CNN) method for HEVC intra coding. We train an efficient CNN based on residual learning. In intra-frame coding, the proposed CNN predicts the residual for reconstructed blocks to enhance their visual quality. Experimental results show that both methods achieve favorable coding performance.
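The abstract names orthogonal matching pursuit as the step that turns a residual block into a sparse coefficient vector. The following is a minimal, generic sketch of OMP in pure Python, not the thesis implementation: the function names (`dot`, `solve`, `omp`), the toy four-atom dictionary, and the fixed sparsity budget are all assumptions made for illustration. The dictionary construction from I-frame residuals and the rate-distortion optimization listed in the table of contents are not modeled here.

```python
# Illustrative OMP sketch (hypothetical names; not taken from the thesis or HM).

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def solve(A, b):
    """Solve a small square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def omp(atoms, signal, sparsity, tol=1e-9):
    """Greedy OMP over unit-norm atoms; returns ({atom index: coefficient}, final residual)."""
    residual = list(signal)
    support, coeffs = [], []
    for _ in range(sparsity):
        # Select the atom most correlated with the current residual.
        scores = [abs(dot(a, residual)) for a in atoms]
        best = max(range(len(atoms)), key=lambda k: scores[k])
        if scores[best] <= tol or best in support:
            break
        support.append(best)
        # Re-fit all selected atoms jointly (least squares via normal equations).
        gram = [[dot(atoms[i], atoms[j]) for j in support] for i in support]
        rhs = [dot(atoms[i], signal) for i in support]
        coeffs = solve(gram, rhs)
        # Update the residual against the new approximation.
        approx = [sum(c * atoms[k][t] for c, k in zip(coeffs, support))
                  for t in range(len(signal))]
        residual = [s - a for s, a in zip(signal, approx)]
    return dict(zip(support, coeffs)), residual

# Toy dictionary: three standard basis vectors plus one non-orthogonal unit atom.
atoms = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0.5, 0.5, 0.5, 0.5],
]
signal = [-1, 2, -1, -1]  # constructed as 3 * atoms[1] - 2 * atoms[3]
coeffs, residual = omp(atoms, signal, sparsity=2)
print(coeffs, residual)
```

The joint re-fit after each selection is what distinguishes OMP from plain matching pursuit: previously chosen coefficients are updated so that the residual stays orthogonal to every selected atom.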

目次 Table of Contents
Contents
Thesis Approval Form i
Acknowledgements ii
Chinese Abstract iii
Abstract iv
List of Figures vi
List of Tables vii
Chapter 1 1
1.1 Overview 1
1.2 Motivation 3
1.3 Contribution 4
1.4 Organization 6
Chapter 2 8
2.1 Coding block structure of HEVC 8
2.2 Intra coding 9
2.3 Residual coding in HEVC 10
2.4 Sparse Coding 11
2.5 Convolutional Neural Network 12
Chapter 3 14
3.1 Coding Unit Complexity (CUC) 14
3.2 Residual Dictionary Construction 15
3.3 Orthogonal Matching Pursuit 16
3.4 Rate-Distortion Optimization 18
3.5 Simulation Results 19
Chapter 4 22
4.2 Residual Learning 24
4.3 Proposed CNN Enhancement Mode for HEVC 25
4.4 Early Termination 26
4.6 Experimental Analysis 31
4.6.1 Training sample extraction 31
4.6.2 Results and Analysis 34
Chapter 5 39
Reference 41

參考文獻 References
[1] T. Wiegand, G. J. Sullivan, G. Bjontegaard, and A. Luthra, “Overview of the H.264/AVC video coding standard,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 560-576, July 2003.
[2] G. J. Sullivan, J.-R. Ohm, W.-J. Han, and T. Wiegand, “Overview of the high efficiency video coding (HEVC) standard,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649-1668, Dec. 2012.
[3] J. A. Tropp and A. C. Gilbert, “Signal Recovery From Random Measurements Via Orthogonal Matching Pursuit,” IEEE Transactions on Information Theory, vol. 53, no. 12, pp. 4655-4666, Dec. 2007.
[4] Y. Dai, D. Liu, and F. Wu, “A Convolutional Neural Network Approach for Post-Processing in HEVC Intra Coding,” in International Conference on Multimedia Modeling, Springer, Cham, pp. 28-39, 2017.
[5] W.-S. Park and M. Kim, “CNN-based in-loop filtering for coding efficiency improvement,” in Image, Video, and Multidimensional Signal Processing Workshop (IVMSP), IEEE, pp. 1-5, 2016.
[6] S. D. Kim, S. J. Hwang, and M. H. Sunwoo, “Novel residual prediction scheme for hybrid video coding,” in Proceedings of IEEE International Conference on Image Processing, pp. 617-620, 2009.
[7] C.-H. Yeh, C.-W. Lee, S.-J. F. Jiang, Y.-H. Sung, and W.-J. Huang, “Second order residual prediction for HEVC inter coding,” in Proceedings of Asia-Pacific Signal and Information Processing Association (APSIPA), Dec. 2014.
[8] J.-W. Kang, M. Gabbouj, and C.-C. Kuo, “Sparse/DCT (S/DCT) two-layered representation of prediction residuals for video coding,” IEEE Transactions on Image Processing, vol. 22, no. 7, pp. 2711-2722, July 2013.
[9] C. Dong, C. C. Loy, K. He, and X. Tang, “Learning a Deep Convolutional Network for Image Super-Resolution,” in Proceedings of European Conference on Computer Vision (ECCV), pp. 184-199, 2014.
[10] C. Dong, Y. Deng, C. C. Loy, and X. Tang, “Compression artifacts reduction by a deep convolutional network,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 576-584, 2015.
[11] Y. Li, D. Liu, H. Li, L. Li, F. Wu, H. Zhang, and H. Yang, “Convolutional Neural Network-Based Block Up-sampling for Intra Frame Coding,” IEEE Transactions on Circuits and Systems for Video Technology, vol. PP, no. 99, pp. 1-1, July 2017.
[12] V. Nair and G. E. Hinton, “Rectified linear units improve restricted Boltzmann machines,” in Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010.
[13] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778, 2016.
[14] Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998.
[15] Microsoft C++ AMP documentation. (2013-03) [Online]. Available: http://msdn.microsoft.com/en-us/library/hh265137.aspx
[16] HEVC Reference Software HM16.12. (2016 June) [Online]. Available: https://hevc.hhi.fraunhofer.de/trac/hevc/browser/tags/HM-16.12
[17] D.-T. Dang-Nguyen, C. Pasquini, V. Conotter, and G. Boato, “RAISE: a raw images dataset for digital image forensics,” in Proceedings of the 6th ACM Multimedia Systems Conference, ACM, 2015.
[18] A. Vedaldi and K. Lenc, “MatConvNet: Convolutional neural networks for matlab,” in Proceedings of the 23rd ACM International Conference on Multimedia, ACM, pp. 689-692, 2015.
[19] G. Bjontegaard, “Calculation of average PSNR differences between RD-curves,” ITU-T, Austin, Texas, United States, technical report VCEG-M33, Apr. 2001.
[20] F. Bossen, “Common HM test conditions and software reference configurations,” JCTVC, Geneva, Switzerland, technical report JCTVC-L1100, Jan. 2013.

電子全文 Fulltext
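This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast the text without authorization, so as to avoid breaking the law.
論文使用權限 Thesis access permission：自定論文開放時間 user define
開放時間 Available：校內 Campus：已公開 available；校外 Off-campus：已公開 available
etd-0601117-002321.pdf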
紙本論文 Printed copies
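Public-access information for printed theses is relatively complete from Academic Year 102 onward. To look up public-access information for printed theses from Academic Year 101 or earlier, please contact the printed thesis service counter of the Office of Library and Information Services. We apologize for any inconvenience.
開放時間 available：已公開 available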

