博碩士論文 etd-0824111-172556 詳細資訊
Title page for etd-0824111-172556
論文名稱
Title
人臉辨識演算法在Cell處理器架構上之優化設計
The Optimal Design for Face Detection Algorithm on Cell Processor Architecture
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
73
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2011-07-19
繳交日期
Date of Submission
2011-08-24
關鍵字
Keywords
SIMD、異質型 (Heterogeneous)、Modified Census Transform (MCT)、Multiple Buffering、PowerPC Processor Element (PPE)、Synergistic Processor (SPE)
Multiple Buffering, Modified Census Transform (MCT), SIMD, Synergistic Processor (SPE), Heterogeneous, PowerPC Processor Element (PPE)
統計
Statistics
本論文已被瀏覽 5632 次,被下載 1253 次
The thesis/dissertation has been browsed 5632 times, has been downloaded 1253 times.
中文摘要
由於當前人臉辨識技術的應用相當廣泛,在日常生活中逐漸扮演著重要的角色,應用如大廈門禁、機場出入境、攝像監控系統等人臉偵測與辨識。考慮設置人臉辨識系統是為了改善工作的效率以及節省人事成本的開銷,以上歸納出人臉辨識系統應有低成本、具支援多媒體效能與設置簡易為方針之即時處理平台,我們選擇兼顧以上優勢之IBM CELL多核心處理器平台做為開發的基礎。對於辨識準確度的要求,我們選擇在單純環境中擁有準確與資料規則性排列的演算法做為實現並行演算法的基礎,實現演算法的方式可以分為兩個部分: 圖形Modified Census Transform (MCT)之演算法及計算Hypotheses標記人臉演算法,其中MCT演算法所用到多點的平均值,利於使用大量的平行運算,另一部分計算Hypotheses標記人臉演算法著重於資料的讀取較不易於平行化之加速。本論文使用內置IBM CELL處理器架構的PS3 (PlayStation 3)平台做為人臉辨識的實現,IBM CELL處理器架構具有一顆PowerPC Processor Element (PPE)與八顆Synergistic Processor (SPE)所組成之異質型 (Heterogeneous) 的多核心系統,且可以在thread-level與data-level進行高度的平行化處理,此外還具備快速及多功能之系統連接網路,以解決大量平行化對資料頻寬的需求。我們使用此平台具備的加速機制,以多顆處理器、SIMD、Multiple Buffering與減少Branch等方法優化人臉辨識之演算法,模擬結果為單顆SPE執行效率之24倍,未來即可放入更多修正辨識準確度相關之演算法提升即時人臉辨識率。
Abstract
With the advance of facial recognition technology, related applications have become widespread, including building access control, airport immigration clearance, and video surveillance. Because such systems are deployed to improve working efficiency and reduce personnel costs, a real-time facial recognition platform should be low in cost, capable of multimedia processing, and easy to set up. Among the available platforms, an IBM CELL multi-core platform, which offers these advantages, is used as the basis of our work. To meet the demand for recognition accuracy, we adopt an algorithm that has a low error rate in simple environments and a regular data layout, which makes it a suitable basis for parallelization. The algorithm is carried out in two parts: the Modified Census Transform (MCT) and the hypotheses calculation that marks human faces. The multi-point average value required by the MCT is obtained through parallel processing, and further improvement in recognition efficiency is possible if wider data paths are used. A PlayStation 3 (PS3) platform equipped with the IBM CELL multi-core processor is used in this thesis. The IBM CELL processor consists of one PowerPC Processor Element (PPE) and eight Synergistic Processor Elements (SPEs), which together form a heterogeneous multi-core system. This system supports a high degree of parallelism at both the thread level and the data level, and its interconnect meets the data bandwidth demands of heavily parallel workloads. Using the acceleration mechanisms of this platform (multiple SPEs, SIMD, multiple buffering, and branch reduction) to optimize the facial recognition algorithm, simulation results show an execution efficiency 24 times that of a single SPE. The simulation also shows that parallelizing the processing of facial recognition data is feasible. In the future, additional algorithms can be incorporated to improve the accuracy of real-time facial recognition.
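To make the MCT step concrete, here is a minimal sketch in plain Python of the transform as it is commonly described in the face detection literature (see [16]): each pixel in a 3x3 neighborhood is compared against the neighborhood mean, and the nine comparison bits form an index. This is an illustration under assumptions, not the thesis's implementation. The function names `mct_index` and `mct_image` are hypothetical, the bit order (top-left pixel as the most significant bit) is an arbitrary choice, and the thesis's table of contents mentions an eight-point average, so its variant may compute the mean differently.

```python
def mct_index(window):
    """Return the 9-bit MCT index of a 3x3 window (3 rows of 3 intensities).

    Assumption: the mean is taken over all nine pixels, and a bit is set
    when a pixel is strictly greater than that mean.
    """
    pixels = [p for row in window for p in row]
    total = sum(pixels)
    index = 0
    for p in pixels:
        # 9 * p > total is equivalent to p > mean, using integers only.
        index = (index << 1) | (1 if 9 * p > total else 0)
    return index


def mct_image(img):
    """Apply mct_index to every interior pixel; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [row[x - 1:x + 2] for row in img[y - 1:y + 2]]
            out[y][x] = mct_index(window)
    return out


if __name__ == "__main__":
    spot = [[0, 0, 0],
            [0, 9, 0],
            [0, 0, 0]]
    print(mct_image(spot)[1][1])
```

Every output pixel depends only on its own small neighborhood, so the work is independent across pixels. That regular, data-parallel structure is what makes this step a natural candidate for SIMD lanes and for splitting image rows across several SPEs.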
目次 Table of Contents
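Abstract I
Table of Contents III
List of Figures V
List of Tables VIII
Chapter 1 Introduction 1
1-1 Research Motivation 1
1-2 Research Objectives 2
1-3 Thesis Organization 2
Chapter 2 Related Work 3
2-1 Adaptive Boosting 4
2-2 Face Detection Flow 5
2-3 Cell B.E. Processor Architecture 9
2-3-1 PPE (PowerPC Processing Element) 11
2-3-2 SPE (Synergistic Processing Element) 13
2-3-3 EIB (Element Interconnect Bus) 15
2-4 SPE Instruction Set on the Cell B.E. Platform 17
2-5 Acceleration Methods on the Cell B.E. Platform 23
2-5-1 Acceleration by Parallel Computation on Multiple SPEs 24
2-5-2 Computation Using SIMD 25
2-5-3 Multiple Buffering 26
2-5-4 Reducing the Number of Branch Instructions 27
Chapter 3 Design Method for Face Recognition on the Cell Platform 28
3-1 Face Recognition Flow 28
3-1-1 Computing the Eight-Point Average 28
3-1-2 Labeling MCT Values 29
3-1-3 Marking Face Positions 30
3-2 Acceleration by Parallel Computation on 8 SPEs 31
3-3 Accelerating the Eight-Point Average with SIMD 34
3-4 Acceleration by Reducing Branch Instructions 40
3-5 Accelerating the Eight-Point Average and MCT Index Computation with Multiple Buffering 43
3-6 Accelerating Face Marking with Multiple Buffering 44
Chapter 4 Simulation and Analysis 46
4-1 Simulation Platform 46
4-2 Simulation Results of the Acceleration Mechanisms 47
4-2-1 Simulation Results on a Single SPE 47
4-2-2 Simulation Results on Eight SPEs 49
4-2-3 Simulation Results on Eight SPEs with SIMD and Reduced Branch Instructions 50
4-2-4 Simulation Results on Eight SPEs with SIMD, Reduced Branch Instructions, and Multiple-Buffered DMA Transfers 52
4-2-5 Performance Analysis and Comparison 57
Chapter 5 Conclusion 59
References 60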



參考文獻 References
[1] http://www.ibm.com/developerworks/power/cell/
[2] http://www.research.ibm.com/cell/heterogeneousCMP.html
[3] Jim Kahle, Cell Architecture, IBM Fellow
[4] J. A. Kahle, M. N. Day, H. P. Hofstee, C. R. Johns, T. R. Maeurer, D. Shippy, "Introduction to the Cell Multiprocessor," IBM Journal of Research and Development, Vol. 49, No. 4/5, July/Sept. 2005.
[5] Peter Hofstee, Cell Today and Tomorrow, Ph.D., Cell Chief Scientist and Chief Architect
[6] Synergistic Processor Unit Instruction Set Architecture, Version 1.2, IBM
[7] SPE Runtime Management Library, IBM
[8] SPE Runtime Management Library Version 1 to Version 2 Migration Guide, IBM
[9] Redbooks, Programming the Cell Broadband Engine™ Architecture: Examples and Best Practices, published on 8 August 2008
[10] Handbook, Cell Broadband Engine Programming Handbook, Version 1.1, IBM, published on 24 April 2007
[11] Michael Gschwind, H. Peter Hofstee, Brian Flachs, Martin Hopkins (IBM), Yukio Watanabe (Toshiba), Takeshi Yamazaki (Sony Computer Entertainment), "Synergistic Processing in Cell's Multicore Architecture," published by the IEEE Computer Society
[12] SPU Assembly Language Specification, Version 1.7, IBM, published on 18 July 2008
[13] C/C++ Language Extensions for Cell Broadband Engine Architecture, Version 2.6, IBM, published on 25 August 2008
[14] Daniele Paolo Scarpazza, Gregory F. Russell, "High-performance regular expression scanning on the Cell/B.E. processor," Proceedings of the 23rd International Conference on Supercomputing (ICS), pp. 14-25
[15] Tao Liu, Haibo Lin, Tong Chen, John Kevin O'Brien, Ling Shao, "DBDB: optimizing DMA transfer for the Cell BE Architecture," Proceedings of the 23rd International Conference on Supercomputing (ICS), pp. 36-45
[16] Bernhard Froba and Andreas Ernst, "Face Detection with the Modified Census Transform," Proceedings of the Sixth IEEE International Conference on Automatic Face and Gesture Recognition, pp. 1-6, 2004.
[17] Derek Hoiem, Adaboost, Available: http://www.cs.uiuc.edu/homes/dhoiem/presentations/Adaboost_Tutorial.ppt
[18] Hsieh-Chung Chen, Chen-Mou Cheng, Shih-Hao Hung, and Zong-Cing Lin, "Integer Number Crunching on the Cell Processor," Proceedings of the 39th International Conference on Parallel Processing (ICPP), pp. 508-515, 2010.
[19] J. Barhen, T. Humble, P. Mitra, and M. Traweek, "Multi-FFT Vectorization for the Cell Multicore Processor," in Proceedings of the 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, 2010, pp. 780-785
[20] H. Rowley, S. Baluja, and T. Kanade, "Neural network-based face detection," IEEE Patt. Anal. Mach. Intell., vol. 20, pp. 22-38, 1998.
[21] Jih-Ching Chiu, Ta-Li Yeh, Cheng-Han Liu, "The Optimal Design for Motion Estimation Algorithm on Cell Processor Architecture," Workshop on Parallel and Distributed Computing (PD), 2009 National Computer Symposium (NCS 2009), Taipei, Taiwan, pp. 210-221, Nov. 2009.
[22] Cell Broadband Engine Architecture, Version 1.02, IBM
Thomas Chen, Ram Raghavan, Jason Dale, Systems Performance, IBM Software Group, and Eiji Iwata, Microprocessor Development, Sony Computer Entertainment Inc. Available: http://www.ibm.com/developerworks/power/library/pa-cellperf/
[23] Power Architecture editors, developerWorks, IBM. Available: http://www.ibm.com/developerworks/power/library/pa-expert9/#resources
電子全文 Fulltext
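The electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.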
論文使用權限 Thesis access permission:校內校外完全公開 unrestricted
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
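Public-access information for printed theses is relatively complete from academic year 102 onward. To look up public-access information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.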
開放時間 available 已公開 available
