博碩士論文 etd-0016117-135946 詳細資訊
Title page for etd-0016117-135946
論文名稱
Title
基於Altera OpenCL架構下對嵌入應用之系統平台建構與核心最佳化
System Platform Integration and Kernel Optimizations for Some Embedded Applications Based on Altera OpenCL Framework
系所名稱
Department
畢業學年期
Year, semester
語文別
Language
學位類別
Degree
頁數
Number of pages
106
研究生
Author
指導教授
Advisor
召集委員
Convenor
口試委員
Advisory Committee
口試日期
Date of Exam
2017-01-06
繳交日期
Date of Submission
2017-01-16
關鍵字
Keywords
OpenCL、Altera FPGA、卷積神經網路、HOG、人像辨識
Altera FPGA, convolutional neural network, HOG, CNN, human detection, OpenCL
統計
Statistics
本論文已被瀏覽 5706 次,被下載 79
This thesis/dissertation has been viewed 5706 times and downloaded 79 times.
中文摘要
近年來,使用基於OpenCL的FPGA運算以進行密集計算的加速應用備受關注。使用FPGA進行加速運算不僅在功耗表現上優於一般繪圖卡 (GPU),而且與特殊應用積體電路 (ASIC) 相比具有更短的開發周期。本論文主要探討如何開發基於Altera FPGA平台的高效能OpenCL程式碼,因為Altera FPGA之執行模型與GPU截然不同。為了減少FPGA資源的耗損以及縮短kernel執行時間,本論文提出了數種高效率的kernel編寫技巧,諸如透過適當的常數陣列分區可減少FPGA的區塊記憶體使用率、kernel平行化執行並藉由global memory進行管線化處理、合併Load/Store單元可提高對記憶體存取的效率以及簡化並融合分支路徑中的描述使硬體資源得以共享。本論文應用上述技術來最佳化三個關鍵應用,包含 N-Body simulation、使用梯度方向直方圖 (Histogram of Oriented Gradient, HOG) 演算法進行人像偵測以及基於卷積神經網路 (Convolutional Neural Network, CNN) 演算法的速限標誌辨識。經最佳化後的相關應用最高可以得到40倍的加速。本論文亦整合出可執行OpenCL的嵌入式平台並運行Linux作業系統。透過重新編譯的Linux kernel可將網路攝影機應用於此平台中,透過攝影鏡頭擷取影像並進行機器分類。Linux作業系統整合了幾個關鍵的程式庫,包含影像處理常用的OpenCV用於影像擷取,OpenGL負責呈現視窗、FFMPEG用於影像解碼。本論文所整合之嵌入式系統平台亦具備SSH協定之遠端操作功能,可呈現嵌入式系統平台之數據與視窗影像於主控端螢幕。
Abstract
In recent years, accelerating compute-intensive applications with FPGA computing resources through the OpenCL interface has received considerable attention. This approach not only offers better power efficiency than a graphics processing unit (GPU) but also has a much shorter development cycle than dedicated circuits (ASICs). This thesis first explores how to develop efficient OpenCL code for the Altera FPGA platform, whose execution model is quite different from that of a GPU. To reduce either FPGA resource consumption or processing time, it proposes several techniques for writing efficient kernels: partitioning of constant tables, parallel kernel execution with pipelined processing, merged data access (load/store) units, and resource sharing across divergent branches. These techniques are applied to optimize three key applications: N-body simulation, HOG-based human detection, and speed-limit sign detection based on a convolutional neural network (CNN). The optimized implementations achieve speedups of up to 40 times. In addition to the efficient OpenCL implementation, this thesis integrates several key libraries, including OpenGL, OpenCV, and SSH, on top of the original Linux kernel built for the Altera SoC platform to provide a more comprehensive infrastructure. The upgraded software support lets designers operate the platform and display the system output remotely, show media content in windows, perform basic image processing, and capture input images with a webcam.
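The branch-related technique named in the abstract (resource sharing for branch divergence, which the Chinese abstract describes as simplifying and merging the code in branch paths so that hardware can be shared) can be illustrated with a small sketch. The thesis's own kernels are not reproduced on this page, so everything below is an assumption made for illustration: the function names, the coefficients, and the use of plain C in place of OpenCL C kernel code. The sketch only shows the general idea that a branch which selects operands and is followed by one shared computation may map to less hardware than two branches that each contain their own arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical coefficients, chosen only for illustration. */
#define GAIN_LOW   3
#define BIAS_LOW   1
#define GAIN_HIGH  5
#define BIAS_HIGH  7

/* Divergent form: each branch contains its own multiply-add, so a
   high-level synthesis compiler may build two multipliers and two adders. */
int scale_divergent(int x, int threshold)
{
    if (x < threshold) {
        return x * GAIN_LOW + BIAS_LOW;
    } else {
        return x * GAIN_HIGH + BIAS_HIGH;
    }
}

/* Merged form: the condition only selects the operands, and a single
   multiply-add follows, so both paths can share the same arithmetic. */
int scale_merged(int x, int threshold)
{
    const int low  = x < threshold;
    const int gain = low ? GAIN_LOW : GAIN_HIGH;
    const int bias = low ? BIAS_LOW : BIAS_HIGH;
    return x * gain + bias;
}

/* Count the inputs for which the two forms disagree (expected: none). */
size_t count_mismatches(const int *xs, size_t n, int threshold)
{
    assert(xs != NULL || n == 0);
    size_t mismatches = 0;
    for (size_t i = 0; i < n; ++i) {
        if (scale_divergent(xs[i], threshold) != scale_merged(xs[i], threshold)) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Both forms compute the same values, so `count_mismatches` should return 0 for any input array. Whether the merged form actually reduces FPGA resource usage depends on the compiler and the target device and is not verified by this sketch.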
目次 Table of Contents
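Thesis Approval Form i
Thesis Public Authorization Form ii
Abstract (Chinese) iii
Abstract (English) iv
Table of Contents v
List of Figures vii
Chapter 1 Introduction 1
1.1 Research Motivation 1
1.2 Thesis Outline 2
Chapter 2 Background and Related Work 3
2.1 Overview of Parallel Computing 3
2.1.1 OpenCL 3
2.1.2 OpenCL on GPU 6
2.1.3 OpenCL on FPGA 8
2.2 Introduction to Altera FPGAs 11
2.3 Introduction to the Target Applications 13
2.3.1 CNN-Based Speed-Limit Sign Recognition 14
2.3.2 HOG-Based Human Detection 15
2.3.3 N-Body Simulation 17
Chapter 3 Kernel Analysis and Optimization Techniques 19
3.1 ND-Range Kernels and Single Work-Item Kernels 19
3.2 Kernel Vectorization and Compute Unit Replication 20
3.3 Analyzing and Optimizing Kernels 24
3.3.1 Writing More Efficient Loops 24
3.3.2 Merging Load/Store Units 28
3.3.3 Parallel Kernel Execution with Pipelining Through Global Memory 30
3.3.4 Parallel Kernel Execution with Data Transfer Through Channels 32
3.3.5 Simplifying and Merging Branch Paths 35
3.3.6 Constant Array Partitioning 36
Chapter 4 Embedded System Platform Integration 39
Chapter 5 Comparison of Results for the Implemented Applications 43
5.1 CNN-Based Speed-Limit Sign Recognition 43
5.1.1 FPGA Area Utilization and Performance Analysis 43
5.2 HOG-Based Human Detection 55
5.2.1 FPGA Area Utilization and Performance Analysis 55
5.3 N-Body Simulation and Embedded System Platform Demonstration 66
5.3.1 FPGA Area Utilization and Performance Analysis 66
Chapter 6 Conclusions and Future Work 72
References 73
Appendix 1: Compiling the Linux Kernel 75
Appendix 2: Compiling the Altera OpenCL Driver 77
Appendix 3: Partitioning the SD Card and Building the File System 79
Appendix 4: Connecting Peripheral Hardware and Setting Up SSH Remote Access 84
Appendix 5: Installing the Required Libraries 89
Appendix 6: Compiling and Running the Embedded Applications 91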
參考文獻 References
[1] Khronos, The OpenCL Specification Version: 2.0, 2015.
[2] E. Lindholm, J. Nickolls, S. Oberman and J. Montrym, "NVIDIA Tesla: A Unified Graphics and Computing Architecture," IEEE Micro, vol. 28, no. 2, pp. 39-55, March 2008.
[3] HSA Foundation, HSA Programmer Reference Manual Specification 1.1, 2016.
[4] AMD, GCN_Architecture_whitepaper.pdf, 2012.
[5] Altera, "Altera OpenCL," [Online]. Available: https://www.altera.com/products/design-software/embedded-software-developers/opencl/overview.html.
[6] Altera, aocl_c5soc_getting_started.pdf, 2016.
[7] Altera, cv_5v4 cyclone_v_device_handbook.pdf, 2016.
[8] Terasic, DE5-Net_User_Manual.pdf, 2014.
[9] Xilinx, "Xilinx SDAccel," [Online]. Available: https://www.xilinx.com/products/design-tools/software-zone/sdaccel.html.
[10] Intel, "Intel® X79 Express Chipset," [Online]. Available: http://ark.intel.com/products/64015/Intel-X79-Express-Chipset.
[11] M. Peemen, R. Shi, S. Lal, B. Juurlink, B. Mesman and H. Corporaal, "The neuro vector engine: Flexibility to improve convolutional net efficiency for wearable vision," in 2016 Design, Automation Test in Europe Conference Exhibition (DATE), 2016, pp. 1604-1609.
[12] 蔡玉嫻, "The VLSI Implementation of Histograms of Oriented Gradients for Human Detection," National Cheng Kung University, July 2011.
[13] Altera, aocl_programming_guide.pdf, 2016.
[14] Altera, aocl-best-practices-guide.pdf, 2016.
[15] Linaro, "Linaro.org," [Online]. Available: http://releases.linaro.org/14.04/ubuntu/saucy-images/developer/linaro-saucy-developer-20140410-652.tar.gz.
[16] Ubuntu wiki, "Howto Install OpenGL Development Environment," [Online]. Available: http://wiki.ubuntu-tw.org/index.php?title=Howto_Install_OpenGL_Development_Environment.
[17] KIWWITO, "Installing OpenGL/Glut libraries in Ubuntu," [Online]. Available: http://kiwwito.com/installing-opengl-glut-libraries-in-ubuntu/.
[18] M. Wang, "How to install OpenCV 3.1 on Ubuntu 14.04," [Online]. Available: https://gist.github.com/MarcWang/0547f87cf777b6576275.
電子全文 Fulltext
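This electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.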
論文使用權限 Thesis access permission: 自定論文開放時間 user-defined release time
開放時間 Available:
校內 Campus: 已公開 available
校外 Off-campus: 已公開 available


紙本論文 Printed copies
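Availability information for printed theses is relatively complete from academic year 102 onward. To look up availability information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.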
開放時間 available 已公開 available
