Master's/Doctoral Thesis Detailed Record
Title page for etd-0616117-182233
Title
Object Recognition Using a Dynamic Learning Approach
Department
Year, semester
Language
Degree
Number of pages
106
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2017-08-17
Date of Submission
2017-08-29
Keywords
Motion detection, Machine learning, Google search by image, Object recognition, Image segmentation, Smart glasses
Statistics
The thesis/dissertation has been browsed 5,720 times and downloaded 101 times.
Abstract
Most traditional object recognition systems consist of a training phase and a testing phase. Once the models have been built in the training phase, they are never adjusted again, so whenever images of a new class must be handled later, the system has to return to the training phase and train a new model. In addition, such systems are usually deployed on hosts that are difficult to move around, which limits the flexibility of object recognition applications.
In view of this, this thesis proposes an object recognition system that uses SEG (SmartEyeGlass) smart glasses as the front-end device together with a back-end host that performs the recognition. The system is divided into two parts: moving object segmentation and object recognition. For the first part, to improve recognition accuracy, the system combines three algorithms, optical flow, CamShift, and GrabCut: the user shakes the object to be recognized in hand, and the resulting motion is used to segment it from the background, as sketched below. For the second part, the dynamic learning process jointly considers the confidence level, whether the object was recognized correctly, and whether it belongs to a new class, and uses these signals to choose an update strategy that keeps the database module in its best state at all times. In addition, the system integrates Google search by image, so that even untrained test images can still be recognized.
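The thesis does not include source code for this pipeline; the sketch below, assuming OpenCV's Python bindings, shows how the three named stages (optical flow, CamShift, GrabCut) could be chained. All function names, thresholds, and parameters here are illustrative assumptions rather than the thesis's actual implementation.

```python
import cv2
import numpy as np

def segment_moving_object(prev_gray, curr_frame):
    """Segment a hand-shaken object: optical flow finds the motion,
    CamShift localizes the moving region, GrabCut cuts it out."""
    curr_gray = cv2.cvtColor(curr_frame, cv2.COLOR_BGR2GRAY)

    # 1) Sparse Lucas-Kanade optical flow on corner features to find moving points.
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                  qualityLevel=0.01, minDistance=7)
    if pts is None:
        return None
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, pts, None)
    disp = np.linalg.norm(nxt - pts, axis=2).ravel()
    moved = nxt[(status.ravel() == 1) & (disp > 2.0)]  # assumed motion threshold
    if len(moved) == 0:
        return None

    # 2) CamShift refines a tracking window seeded by the moving points,
    #    using a hue histogram of the initial region as the target model.
    x, y, w, h = cv2.boundingRect(moved.astype(np.int32))
    hsv = cv2.cvtColor(curr_frame, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv[y:y + h, x:x + w]], [0], None, [16], [0, 180])
    cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
    prob = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
    crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
    _, track_win = cv2.CamShift(prob, (x, y, w, h), crit)

    # 3) GrabCut, initialized with the tracked window, extracts the foreground.
    mask = np.zeros(curr_frame.shape[:2], np.uint8)
    bgd = np.zeros((1, 65), np.float64)
    fgd = np.zeros((1, 65), np.float64)
    cv2.grabCut(curr_frame, mask, track_win, bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)
    fg = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD),
                  255, 0).astype(np.uint8)
    return cv2.bitwise_and(curr_frame, curr_frame, mask=fg)
```

Sparse Lucas-Kanade flow keeps the first stage cheap, which matters for the 2-second recognition budget reported below; a dense-flow or background-subtraction variant could be swapped in without changing the rest of the chain.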
Considering only moving object segmentation and object recognition, the whole recognition procedure takes at most 2 seconds; even the most time-consuming step in the system, the module update, takes on average only 20 seconds per update. Consequently, the system can be applied to real-time object recognition.
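The update-strategy choice described above (confidence level, correctness, novelty of the class) can be pictured as a small decision function. The sketch below is a minimal illustration; the thresholds and strategy names are assumptions for illustration, not values from the thesis.

```python
def choose_update_strategy(confidence, is_correct, is_new_class,
                           high_thresh=0.8, low_thresh=0.4):
    """Pick a database-module update strategy from the three signals
    the dynamic learning step consults (thresholds are assumed)."""
    if is_new_class:
        # Untrained category: create a new class in the database module
        # (the thesis falls back to Google search by image here).
        return "add_new_class"
    if not is_correct:
        # Misrecognition: adjust the module with the corrected label.
        return "correct_and_update"
    if confidence < low_thresh:
        # Correct but unsure: add the sample to strengthen a weak class.
        return "reinforce_class"
    if confidence < high_thresh:
        # Moderately confident: a lightweight incremental update may suffice.
        return "incremental_update"
    # Confident and correct: no update needed, keep the module as-is.
    return "no_update"

# Example: a correct but low-confidence recognition triggers reinforcement.
print(choose_update_strategy(confidence=0.35, is_correct=True, is_new_class=False))
```

For example, a correct but low-confidence recognition would add the sample to strengthen a weak class rather than trigger a full retraining, which is how the average update stays around 20 seconds.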
Table of Contents
Thesis Certification i
Acknowledgements ii
Chinese Abstract iii
Abstract iv
Table of Contents v
List of Figures vii
List of Tables ix
Chapter 1 Introduction 1
1.1 Thesis Overview 1
1.2 Contributions 2
1.3 Thesis Organization 2
Chapter 2 Literature Review 3
2.1 Moving Object Segmentation 3
2.1.1 Image Segmentation 3
2.1.2 Motion Detection 4
2.2 Dynamic Object Recognition 5
2.2.1 Feature Extraction 5
2.2.2 Classification 6
Chapter 3 Methodology 8
3.1 Moving Object Segmentation 9
3.1.1 Motion Detection 9
3.1.2 Object Tracking 10
3.1.3 Image Segmentation 15
3.1.4 Combined Method 16
3.2 Dynamic Object Recognition 25
3.2.1 Feature Extraction 25
3.2.2 Feature Matching 26
3.2.3 Normalization 27
3.2.4 Classification 31
3.2.5 Automatic Learning 35
3.2.6 Static Database 37
3.2.7 Dynamic Database 40
Chapter 4 System Construction 50
4.1 System Implementation 51
4.1.1 SEG Smart Glasses 51
4.1.2 Back-End Server 53
4.2 Experimental Results 54
4.2.1 Static Database 54
4.2.2 Dynamic Database 66
4.2.3 Google Search 72
4.2.4 Moving Object Segmentation 75
4.2.5 Image Segmentation 82
Chapter 5 Conclusions and Future Work 92
References 93
Fulltext
This electronic fulltext is licensed to users only for personal, non-profit retrieval, reading, and printing for the purpose of academic research. Please observe the relevant provisions of the Copyright Act of the Republic of China, and do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
Thesis access permission: user-defined release period
Available:
On-campus: available
Off-campus: available


Printed copies
Public access information for printed theses is relatively complete for academic year 102 (2013-14) and later. To check the access status of printed theses from academic year 101 (2012-13) or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available
