國立中山大學 National Sun Yat-sen University, 學位論文 thesis/dissertation

論文名稱 Title	動作識別演算法在Cell處理器架構上之優化設計 The Optimal Design for Action Recognition Algorithm on Cell Processor Architecture
系所名稱 Department	電機工程學系 Department of Electrical Engineering
畢業學年期 Year, semester	99 學年度第 2 學期 The spring semester of Academic Year 99	語文別 Language	中文 Chinese
學位類別 Degree	碩士 Master	頁數 Number of pages	90
研究生 Author	潘柏勳 Po-Hsun Pan
指導教授 Advisor	邱日清 Jih-Ching Chiu
召集委員 Convenor	鍾崇斌 Chung-Ping Chung
口試委員 Advisory Committee	葉家宏, 謝萬雲 Chia-Hung Yeh; Wann-Yun Shieh
口試日期 Date of Exam	2011-07-19	繳交日期 Date of Submission	2011-08-23
關鍵字 Keywords	動作識別演算法、SIMD、CELL、多核心、平行化 action recognition algorithm, SIMD, CELL, multi-core, parallelization

中文摘要
近幾年來，動作識別已經廣泛的被應用在計算機視覺以及影像處理的領域，舉凡在居家照護、人身財物的保障、國土安全上，以視訊自動化來識別人類行為以達到監控的目的，都有很大的幫助。要實現動作識別功能，有許多需要考慮的因素，主要是準確度與即時性。若能以平行化的方式進行動作識別演算法的運算，對演算法的即時處理能力會有很大的提升。為了達到即時性的需求，我們研究如何在IBM CELL B.E.多核心平台上實現動作識別演算法的並行化。我們設計出的動作識別演算法與單核心下未加速的程式效能比較增進了231倍。我們發現在動作識別演算法中，有許多區塊重複地進行相同的運算，此類型的運算可以被並行處理，並利用單指令多資料的架構來加速。在動作識別演算法中，有四個主要的演算法，DMASKS、HMHHb、MGD、SVM。CELL B.E.平台上的SIMD指令一次可以對128 bits的資料量作運算，在做DMASKS時，SIMD的並行度可到達16倍，HMHHb的並行度可達128倍，MGD的並行度可達8倍，運算SVM時的並行度可達4倍。我們依據CELL B.E.的加速機制，完成具有多執行緒 (Multi-threading) 並結合多並行資料流 (Multiple streaming) 的高效能運算模式。研究的結果顯示動作識別演算法非常適合在多核心系統上以SIMD架構來並行化處理。動作識別演算法並行化的結果可以更即時的將動作反應出來。有了即時性的優點後，未來可期加入更多複雜的演算法來提升其準確率，達到即時性與準確率兼備的效果。
Abstract
In recent years, automatic human action recognition has been widely studied in the computer vision and image processing communities. Recognizing human behavior automatically from video is valuable for surveillance in home care, the protection of people and property, and homeland security. An action recognition system must weigh several factors, chiefly accuracy and real-time performance. Parallelizing the action recognition algorithm can greatly improve its real-time processing capability. To meet the real-time requirement, we study how to parallelize an action recognition algorithm on the IBM CELL B.E. multi-core platform. Our design runs 231 times faster than the unaccelerated program on a single core. We found that many blocks in the algorithm repeat the same operations; these operations can be processed in parallel and accelerated with a single-instruction multiple-data (SIMD) architecture. The algorithm has four major components: DMASKS, HMHHb, MGD, and SVM. SIMD instructions on the CELL B.E. platform operate on 128 bits of data at a time, so the SIMD parallelism reaches 16 for DMASKS, 128 for HMHHb, 8 for MGD, and 4 for SVM. Building on the CELL B.E. acceleration mechanisms, we implement a high-performance computing model that combines multi-threading with multiple streaming. The results show that the action recognition algorithm is well suited to SIMD parallelization on multi-core systems. The parallelized algorithm reports recognized actions with less delay. With real-time performance in place, more complex algorithms can be added in the future to raise accuracy, so that the system achieves both real-time response and accuracy.
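The parallelism figures quoted in the abstract are consistent with dividing the 128-bit SIMD register by a per-element width. The sketch below only illustrates that arithmetic. The element widths (8, 1, 16, and 32 bits) are assumptions inferred from the quoted figures, not values stated in this record, and the names `simd_lanes` and `ASSUMED_ELEMENT_BITS` are hypothetical.

```python
REGISTER_BITS = 128  # SIMD width stated in the abstract


def simd_lanes(element_bits, register_bits=REGISTER_BITS):
    """Return how many elements of the given width fit in one SIMD register."""
    if element_bits <= 0 or register_bits % element_bits != 0:
        raise ValueError("element width must evenly divide the register width")
    return register_bits // element_bits


# Assumed element widths in bits, inferred from the parallelism figures
# in the abstract; the record itself does not state them.
ASSUMED_ELEMENT_BITS = {"DMASKS": 8, "HMHHb": 1, "MGD": 16, "SVM": 32}

lanes = {name: simd_lanes(bits) for name, bits in ASSUMED_ELEMENT_BITS.items()}

for name, count in lanes.items():
    print(f"{name}: {count} elements per 128-bit operation")
```

Under these assumptions, the narrowest data type (a single bit per element for HMHHb) yields the highest parallelism, while a 32-bit element type for SVM yields the lowest.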

目次 Table of Contents
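Acknowledgements
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Motivation
  1.2 Objectives
  1.3 Thesis Organization
Chapter 2 Related Work
  2.1 CELL B.E. Processor Architecture
    2.1.1 PPE
    2.1.2 SPE
    2.1.3 EIB
  2.2 CUDA Architecture
  2.3 Action Recognition Algorithms
    2.3.1 MGD-Feature Action Recognition Algorithm
    2.3.2 Support Vector Machine
  2.4 Acceleration Methods on the CELL B.E. Platform
    2.4.1 Acceleration by Parallel Computation on Multiple SPEs
    2.4.2 Computation with SIMD
    2.4.3 Multiple Buffering
  2.5 Related SIMD Instruction Sets (MMX, SSE, and WMMX)
    2.5.1 MMX
    2.5.2 SSE
    2.5.3 WMMX
Chapter 3 Implementing the Action Recognition Algorithm on the CELL Platform
  3.1 Acceleration by Parallel Computation on 8 SPEs
  3.2 Acceleration with SIMD
    3.2.1 DMASKS Implementation
    3.2.2 HMHHb Implementation
    3.2.3 MGD Implementation
    3.2.4 SVM Implementation
  3.3 Acceleration with Multiple Buffering
Chapter 4 Simulation and Analysis
  4.1 Simulation Platform
  4.2 SVM Simulation Environment
    4.2.1 Test Video Database
    4.2.2 SVM Implementation
  4.3 Implementation of MGD Action Recognition
  4.4 Simulation Results for Each Acceleration Mechanism
    4.4.1 Running on a Single SPE
    4.4.2 Running on 8 SPEs
    4.4.3 Running on 8 SPEs with SIMD
    4.4.4 Multiple Buffering
    4.4.5 Performance Analysis
  4.5 Performance Comparison with MMX, SSE, and WMMX
    4.5.1 DMASKS
    4.5.2 HMHHb
    4.5.3 MGD Extraction
    4.5.4 SVM Classification Function
    4.5.5 Performance Analysis
  4.6 Summary
Chapter 5 Conclusion
References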


電子全文 Fulltext
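This electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research. Please observe the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it. Thesis access permission: unrestricted (fully open on and off campus). Available: Campus: available; Off-campus: available. File: etd-0823111-143005.pdf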
紙本論文 Printed copies
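Public information on printed copies is relatively complete from Academic Year 102 onward. To look up printed theses from Academic Year 101 or earlier, please contact the printed-thesis service desk of the Office of Library and Information Services. Available: available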

