National Sun Yat-sen University, thesis/dissertation, Pedestrian Identification, Tracking and Counting in Video Images

Title	Pedestrian Identification, Tracking and Counting in Video Images (視頻影像之行人辨識跟蹤與計數)
Department	Department of Mechanical and Electro-Mechanical Engineering
Year, semester	Spring semester of Academic Year 104 (ROC calendar, 2015-2016)	Language	Chinese
Degree	Master	Number of pages	77
Author	Yi-Fan Wu (吳一凡)
Advisor	Chi-Cheng Cheng (程啟正)
Convenor	Hsin-Hung Chen (陳信宏)
Advisory Committee	Yu-Cheng Chou (周佑誠)
Date of Exam	2016-07-20	Date of Submission	2016-07-27
Keywords	HOG feature, Background update, Machine vision, Kalman filter, BLOB matching method
Statistics	This thesis has been browsed 5718 times and downloaded 221 times.

Abstract
Pedestrian counting applies object detection and tracking to count the pedestrians who pass through a scene over a period of time. Based on the overall head-and-torso characteristics of pedestrians, this thesis proposes a simple and feasible pedestrian tracking method. It uses the BLOB (Binary Large Objects) matching approach to label each target and track it continuously, which addresses the occlusion problem. First, a median filter removes noise, an improved Gaussian mixture model extracts the background, and image preprocessing detects the moving objects. Compared with the traditional approach, combining the Gaussian mixture model with background subtraction effectively reduces holes in the foreground image and shows good adaptability. Next, HOG (Histogram of Oriented Gradients) features and an SVM (Support Vector Machine) classifier identify the pedestrians. To narrow the search region and save processing time, the Kalman filter is combined with BLOB matching to predict each pedestrian's trajectory. As a pedestrian approaches the counting line, tracking within a pre-assigned target area reduces misjudgments caused by overlapping targets, and two-way counting is performed as pedestrians cross the line. The pedestrian sample set contains 1500 positive samples and 12000 negative samples; 420 hard examples that the initial classifier misclassified are also added to the negative samples to strengthen classification. In experiments, the proposed strategy achieves a 90% recognition rate on the test video with an average processing time of 60 ms. On real-world video, after repeated adjustment of the classifier, the recognition rate reaches 82% with an average processing time of 120 ms. Although some misjudgments remain, the miss rate is only 10%, which should be acceptable given the uncertainties of real shooting conditions. The method counts pedestrian flow effectively, and users can define irregular target areas and counting lines, which helps adapt it to different environments and applications.
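
The two-way counting step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the `CountingLine` class, the `count_crossings` function, and the example coordinates are hypothetical, and the line is treated as infinite rather than restricted to a pre-assigned target area. The sketch assumes each tracked pedestrian is reduced to a sequence of centroids. A centroid's side of the line is taken from the sign of a 2D cross product, and a sign change between frames is counted as a crossing in that direction.

```python
from dataclasses import dataclass


@dataclass
class CountingLine:
    """Directed line from (x1, y1) to (x2, y2). Hypothetical helper, not from the thesis."""
    x1: float
    y1: float
    x2: float
    y2: float

    def side(self, x: float, y: float) -> int:
        """Return +1 or -1 for the side of the line that (x, y) lies on, or 0 if on the line."""
        cross = (self.x2 - self.x1) * (y - self.y1) - (self.y2 - self.y1) * (x - self.x1)
        return (cross > 0) - (cross < 0)


def count_crossings(line, tracks):
    """Count sign changes of each track relative to the line.

    tracks maps a track id to its list of (x, y) centroids in frame order.
    Returns (to_positive, to_negative): crossings onto the +1 side and onto the -1 side.
    """
    to_positive = to_negative = 0
    for centroids in tracks.values():
        last_side = 0  # side seen most recently; 0 means none yet
        for x, y in centroids:
            s = line.side(x, y)
            if s == 0:
                continue  # exactly on the line: wait for the next frame
            if last_side != 0 and s != last_side:
                if s > 0:
                    to_positive += 1
                else:
                    to_negative += 1
            last_side = s
    return to_positive, to_negative


# Illustrative data only: a horizontal line at y = 100 and three made-up tracks.
line = CountingLine(0, 100, 200, 100)
tracks = {
    1: [(50, 50), (50, 90), (50, 110), (50, 150)],  # crosses onto the +1 side
    2: [(120, 160), (120, 100), (120, 40)],         # crosses onto the -1 side
    3: [(10, 20), (30, 40)],                        # never crosses
}
print(count_crossings(line, tracks))
```

Because the side test works for any two endpoints, the same check applies to a slanted counting line. A track that wanders back and forth is counted once per crossing, so a real system would likely need extra logic, such as the target-area restriction the abstract mentions.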

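Table of Contents
Abstract (Chinese)
Abstract (English)
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background and Practical Significance
1.2 Literature Review
1.3 Thesis Organization
Chapter 2 Overview of Image Processing Methods
2.1 Image Processing System
2.2 Camera Placement
2.3 Foreground Detection
2.3.1 Frame Differencing
2.3.2 Optical Flow
2.3.3 Background Subtraction
2.4 Object Recognition
2.5 Object Tracking
Chapter 3 Detection and Extraction of Moving Objects in Video Images
3.1 Median Filtering
3.2 Foreground Extraction
3.2.1 Single Gaussian Background Model
3.2.2 Gaussian Mixture Model
3.2.3 Improved Method Combined with the Gaussian Mixture Model
3.3 Morphological Processing
Chapter 4 Pedestrian Identification in Moving Target Regions
4.1 Overview of Human Features
4.1.1 Haar-like Features
4.1.2 LBP Features
4.1.3 HOG Features
4.2 Support Vector Machine
4.3 Pedestrian Identification Algorithm Combining HOG Features and SVM
4.3.1 Sample Selection
4.3.2 Classifier Training
Chapter 5 Pedestrian Tracking and Counting
5.1 Kalman Filter
5.2 BLOB Information Extraction
5.3 Object Tracking Based on the Kalman Filter and BLOB Matching
5.4 Pedestrian Flow Counting
5.5 Demonstration
5.6 Analysis of Results
Chapter 6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
References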


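Full text
This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research purposes. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization. Thesis access permission: user-defined release date. Availability: Campus: available; Off-campus: available. etd-0620116-210441.pdf
Printed copies
Public information on printed copies is relatively complete from Academic Year 102 onward. To inquire about printed theses from Academic Year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience. Availability: available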


Office of Library and Information Services, National Sun Yat-sen University │ Contact us: 2452, Thesis Format Review Team │ Development and operations: Knowledge Innovation Division, LIS