博碩士論文 etd-0721117-160522 詳細資訊 Detailed record for etd-0721117-160522
論文名稱 Title
應用生成對抗網路於十類圖片辨識
Ten Classes of Image Recognition Using Generative Adversarial Networks
系所名稱 Department
畢業學年期 Year, semester
語文別 Language
學位類別 Degree
頁數 Number of pages
60
研究生 Author
指導教授 Advisor
召集委員 Convenor
口試委員 Advisory Committee
口試日期 Date of Exam
2017-07-21
繳交日期 Date of Submission
2017-08-21
關鍵字 Keywords
生成對抗網路、對抗學習、全卷積神經網路、圖像辨識、深度卷積生成對抗網路
adversarial training, image recognition, generative adversarial networks, all convolutional net, deep convolutional generative adversarial networks
統計 Statistics
The thesis/dissertation has been browsed 5662 times and downloaded 5 times.
中文摘要 Chinese Abstract
This thesis uses deep convolutional generative adversarial networks (DCGAN) to synthesize artificial images and thereby enlarge the training set of a classification model. In a generative adversarial network, a discriminator and a generator are trained against each other in a game-theoretic setting, so that the network learns to model the distribution of the real data; the learned distribution and weights then map random noise vectors to artificial images of each class. This thesis uses the CIFAR-10 and MNIST image databases, both widely used in image recognition research: their training sets are used to train the GANs and the classifiers, and their test sets are used for evaluation. First, a classifier with an all convolutional network architecture is trained on the CIFAR-10 and MNIST training sets, yielding baselines of 90.09% and 99.61% respectively. The all convolutional network architecture is then used for both the discriminator and the generator. Because this architecture learns local image features effectively, the discriminator can distinguish real from artificial images with higher confidence; in turn, the generator must imitate the fine details of real images more precisely in order to fool the discriminator. This study also applies different image preprocessing methods before training the GANs and compares the artificial images produced by the GAN trained on each preprocessing variant. The trained generators are then used to produce artificial images, which are selected and labeled with the classifier and the nearest neighbor algorithm. Finally, the selected artificial images are added to the training data to help train the classification model. The experimental results confirm that enlarging the training set with generator-produced images effectively improves the recognition rate.
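The adversarial training described above optimizes the original GAN value function, in which the discriminator maximizes log D(x) + log(1 - D(G(z))) while the generator tries to fool it. As a minimal numerical sketch (plain NumPy, not the thesis code; `gan_losses` is a hypothetical helper, and the non-saturating generator loss -log D(G(z)) is the commonly used variant):

```python
import numpy as np

def gan_losses(d_real, d_fake, eps=1e-8):
    """Losses derived from the original GAN value function.

    d_real: discriminator outputs D(x) on real images, in (0, 1)
    d_fake: discriminator outputs D(G(z)) on generated images, in (0, 1)
    """
    # Discriminator maximizes log D(x) + log(1 - D(G(z)));
    # we return the equivalent quantity to minimize.
    d_loss = -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))
    # Generator (non-saturating form) minimizes -log D(G(z)).
    g_loss = -np.mean(np.log(d_fake + eps))
    return d_loss, g_loss

# A confident discriminator (D(x) near 1, D(G(z)) near 0) drives
# d_loss toward 0 while g_loss grows large.
d_loss, g_loss = gan_losses(np.array([0.99, 0.98]), np.array([0.02, 0.01]))
```

At the theoretical equilibrium, D outputs 1/2 everywhere and the discriminator loss equals 2 log 2, which is one practical signal for judging convergence.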
Abstract
In this thesis, we use artificial examples synthesized by deep convolutional generative adversarial networks (DCGAN) for image recognition. First, class-dependent generative adversarial networks (GANs) are trained in a game-theoretic setting, in which a discriminator and a generator compete so that the networks learn the distribution of each class; the learned distributions then map random vectors to artificial examples of each class. We use the CIFAR-10 and MNIST image databases, both widely used in image recognition research: their training data are used to train the GANs and the classifiers, and their test data are used for evaluation. We first train an all convolutional network on CIFAR-10 and MNIST, obtaining baselines of 90.09% and 99.61% respectively. We then use the all convolutional network architecture for both the discriminator and the generator. Because a convolutional network learns regional features well, the discriminator can distinguish real data from artificial examples more accurately; in turn, the generator must imitate the details of the real data more precisely in order to fool the discriminator. We preprocess the images in different ways before training the GANs, then use the trained generators to produce artificial examples and select them with the classifier and the nearest neighbor algorithm. Finally, we add the selected artificial examples to the training data and retrain the classifier. The experimental results show that this method effectively improves the recognition rate.
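The selection step described in the abstract can be sketched as follows. This is a hedged illustration in plain NumPy, not the thesis implementation; `select_by_nearest_neighbor` and the toy data are hypothetical. A generated sample is kept only when its nearest real training example (1-nearest neighbor, Euclidean distance on flattened images) carries the class label the generator was conditioned on, and the survivors are appended to the training set:

```python
import numpy as np

def select_by_nearest_neighbor(fakes, fake_labels, real, real_labels):
    """Keep generated samples whose nearest real neighbor agrees
    with the class the generator was asked to produce."""
    kept_images, kept_labels = [], []
    for img, lab in zip(fakes, fake_labels):
        # Index of the closest real training example (Euclidean distance).
        nn = np.argmin(np.linalg.norm(real - img, axis=1))
        if real_labels[nn] == lab:
            kept_images.append(img)
            kept_labels.append(lab)
    return np.array(kept_images), np.array(kept_labels)

# Toy data: two well-separated classes on a line.
real = np.array([[0.0], [1.0], [10.0], [11.0]])
real_labels = np.array([0, 0, 1, 1])
fakes = np.array([[0.5], [10.5], [7.0]])   # last sample is off-class
fake_labels = np.array([0, 1, 0])          # generator's intended classes
kept, labels = select_by_nearest_neighbor(fakes, fake_labels, real, real_labels)
augmented = np.vstack([real, kept])        # enlarged training set
```

The ambiguous third sample is rejected because its nearest real neighbor belongs to the other class; only confidently on-class artificial examples enlarge the training data.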
目次 Table of Contents
論文審定書 Thesis Verification Letter . . . i
Acknowledgments . . . ii
摘要 Abstract (Chinese) . . . iii
ABSTRACT . . . iv
Table of Contents . . . vi
List of Tables . . . ix
List of Figures . . . x
Chapter 1 Introduction . . . 1
1.1 Background and Motivation . . . 1
1.2 Literature Review . . . 2
1.2.1 MNIST . . . 2
1.2.2 CIFAR-10 . . . 3
1.2.3 Generative Adversarial Networks (GAN) . . . 4
Chapter 2 Basic Architecture and Tools . . . 6
2.1 Generative Adversarial Networks (GAN) . . . 6
2.1.1 Discriminator . . . 7
2.1.2 Generator . . . 7
2.2 K Nearest Neighbor (KNN) . . . 7
2.3 Convolutional Neural Network (CNN) . . . 8
2.4 Tools: TensorFlow . . . 8
Chapter 3 Databases and Methods . . . 10
3.1 Databases . . . 10
3.1.1 MNIST . . . 10
3.1.2 CIFAR-10 . . . 10
3.2 All Convolutional Net . . . 11
3.3 Batch Normalization . . . 12
3.4 Deep Convolutional Generative Adversarial Networks (DCGAN) . . . 14
Chapter 4 Experiments . . . 17
4.1 Experimental Setup . . . 17
4.2 Baseline Experiments . . . 17
4.2.1 CIFAR-10 . . . 18
4.2.2 MNIST . . . 19
4.3 Sample Generation Experiments . . . 20
4.3.1 Training DCGAN on Original Images . . . 22
4.3.2 Image Processing before Training DCGAN . . . 24
4.3.3 Rescaling before Training DCGAN . . . 26
4.3.4 Image Processing plus Rescaling before Training DCGAN . . . 27
4.3.5 Training DCGAN on MNIST . . . 27
4.3.6 Evaluating DCGAN Training Convergence . . . 29
4.4 Evaluating Generated Samples with the Classification Model . . . 31
4.5 Evaluating Generated Samples with the Nearest Neighbor Method . . . 33
4.6 Augmenting Training Data with Generated Samples . . . 34
4.7 Selecting Generated Samples before Adding Them to the Training Set . . . 36
4.7.1 Selecting Generated Samples with the Classification Model . . . 36
4.7.2 Selecting Generated Samples with the Nearest Neighbor Method . . . 38
4.8 Horizontally Flipped Training Data plus Generated Samples . . . 39
4.8.1 Horizontal Flipping of Training Data . . . 40
4.8.2 Flipped Data plus Classifier-Selected Generated Samples . . . 40
4.8.3 Flipped Data plus Nearest-Neighbor-Selected Generated Samples . . . 41
4.8.4 Summary of Results . . . 41
4.9 MNIST Experiments . . . 41
Chapter 5 Conclusion and Future Work . . . 43
電子全文 Fulltext
The electronic full text is licensed only for personal, non-profit searching, reading, and printing for academic research. Please comply with the Copyright Act of the Republic of China (Taiwan); do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
論文使用權限 Thesis access permission: user-defined release date (自定論文開放時間)
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
Public-access information for printed theses is relatively complete for academic year 102 (2013) and later. To inquire about the access status of printed theses from academic year 101 (2012) or earlier, please contact the printed thesis service counter of the Office of Library and Information Services. We apologize for any inconvenience.
開放時間 Available: 已公開 available
