Title page for etd-0016117-135254
Title
針對光線追蹤之高效能平行化BVH樹狀結構建構器的設計與實作
Design and Implementation of Highly-parallel Efficient BVH Tree Builder Architecture for Ray Tracing
Department
Year, semester
Language
Degree
Number of pages
70
Author
Advisor
Convenor
Advisory Committee
Date of Exam
2016-08-29
Date of Submission
2017-01-16
Keywords
computer graphics, BVH tree construction, ray-tracing hardware, ray tracing, bounding volume hierarchy (BVH)
Statistics
The thesis has been viewed 5673 times and downloaded 65 times.
Abstract
Bounding volume hierarchy (BVH) construction is a key problem in real-time ray tracing of dynamic scenes, and accelerating it with graphics processing units (GPUs) or dedicated circuits has attracted considerable attention in recent years. This thesis extends a previous tree-builder design to achieve further speed-up. The core operation of the builder is to partition one node into two child nodes while determining the next split locations of those child nodes at the same time.
One of the main contributions of this thesis is an efficient parallel partition architecture that partitions several triangles of the same node simultaneously and distributes the partitioned results evenly across different processing units to achieve load balance. Because multiple triangles are processed in parallel, a new bin-reduction unit is added to merge the binning results of triangles that belong to the same bin. The number of candidate split locations is also increased from 7 to 15 to improve the quality of the resulting tree. To support more candidate splits and parallel triangle partitioning, relevant modules such as the bin-border-generation unit and the bin-decision unit are unfolded to increase their processing throughput. The key back-end units for the surface area heuristic (SAH) calculation are also modified so that the calculation takes 47 cycles instead of 102.
The proposed tree builder costs about 2.18 M gates and can run at up to 286 MHz in 90 nm technology. Simulation results show that it achieves a build time similar to that of the design in the literature at only about one quarter of the circuit area. The proposed design can contribute to the development of embedded real-time ray-tracing systems.
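The binning, bin-reduction, and SAH steps named in the abstract can be illustrated in software. The following is a minimal Python sketch, not the thesis's hardware design: the function names, the use of 16 bins (which yields the 15 candidate split planes mentioned above), the four parallel "lanes", centroid-based binning, and the simplified cost N_L·A_L + N_R·A_R (constant factors omitted) are all assumptions made for illustration.

```python
import math

EMPTY = ((math.inf,) * 3, (-math.inf,) * 3)  # box that contains nothing


def union(a, b):
    """Smallest axis-aligned box enclosing boxes a and b."""
    return (tuple(map(min, a[0], b[0])), tuple(map(max, a[1], b[1])))


def area(box):
    """Surface area of a box; an empty box has area 0."""
    dx, dy, dz = (max(h - l, 0.0) for l, h in zip(box[0], box[1]))
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def bin_lane(tri_boxes, axis, lo, hi, num_bins):
    """One lane (hypothetical): bin its share of triangles by centroid."""
    bins = [(0, EMPTY) for _ in range(num_bins)]
    for box in tri_boxes:
        c = 0.5 * (box[0][axis] + box[1][axis])
        i = int((c - lo) / (hi - lo) * num_bins) if hi > lo else 0
        i = min(max(i, 0), num_bins - 1)  # clamp the last centroid into range
        n, b = bins[i]
        bins[i] = (n + 1, union(b, box))
    return bins


def reduce_bins(per_lane_bins):
    """Bin reduction: merge the partial (count, bounds) of every lane."""
    merged = per_lane_bins[0]
    for lane in per_lane_bins[1:]:
        merged = [(n1 + n2, union(b1, b2))
                  for (n1, b1), (n2, b2) in zip(merged, lane)]
    return merged


def best_split(bins):
    """Evaluate the len(bins) - 1 candidate planes; return (cost, split)."""
    n = len(bins)
    left, cnt, box = [], 0, EMPTY
    for i in range(n - 1):  # prefix sweep: bins [0, i] on the left
        cnt, box = cnt + bins[i][0], union(box, bins[i][1])
        left.append((cnt, box))
    best, cnt, box = (math.inf, -1), 0, EMPTY
    for i in range(n - 1, 0, -1):  # suffix sweep: bins [i, n) on the right
        cnt, box = cnt + bins[i][0], union(box, bins[i][1])
        lc, lb = left[i - 1]
        cost = lc * area(lb) + cnt * area(box)
        if lc > 0 and cnt > 0 and cost < best[0]:
            best = (cost, i)
    return best


# Demo: two clusters of triangle bounding boxes along the x axis.
xs = [0.0, 0.25, 0.5, 0.75, 9.0, 9.25, 9.5, 9.75]
boxes = [((x, 0.0, 0.0), (x + 0.25, 1.0, 1.0)) for x in xs]
cents = [0.5 * (b[0][0] + b[1][0]) for b in boxes]
lo, hi = min(cents), max(cents)
lanes = [boxes[i::4] for i in range(4)]  # round-robin over 4 lanes
merged = reduce_bins([bin_lane(l, 0, lo, hi, 16) for l in lanes])
cost, split = best_split(merged)
print(cost, split)
```

Here a returned split index `i` means that bins `[0, i)` go to the left child and bins `[i, 16)` to the right child. Because the per-bin counts and bounds combine with addition, `min`, and `max`, the merged result does not depend on how triangles are distributed over the lanes, which is what makes a separate reduction stage possible in this sketch.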
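Table of Contents
Thesis Approval Form
Thesis Public Release Authorization
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivation
1.2 Thesis Outline
Chapter 2 Background and Related Work
2.1 Ray Tracing
2.2 Bounding Volume Hierarchy (BVH)
2.3 Surface Area Heuristic
2.4 BVH Structure and Variants
2.5 GPU-based Construction of Partitioning Structures
2.6 Ray-Tracing ASIC Hardware
Chapter 3 Analysis of the BVH Builder
3.1 Triangle Partitioning
3.1.1 Ping-pong Buffer Strategy
3.1.2 Dual Primitive Buffer Strategy
3.1.3 Comparison of Partitioning Strategies
3.2 Overview of the BVH Builder
3.3 Performance Analysis of the BVH Builder in [3]
Chapter 4 Design of the BVH Tree Builder
4.1 Triangle Partitioning (Partition Process)
4.1.1 Partition Unit
4.1.2 Buffer Read/Write Scheduling
4.1.3 Load Balancing
4.2 Triangle Binning (Binning Process)
4.2.1 Bin Border Generation Unit
4.2.2 Bin Decision Unit
4.2.3 Bin Reduction Unit
4.3 SAH Calculation (SAH Calculation Process)
4.3.1 Bin Combine Unit
4.3.2 SAH Cost Calculation Unit and SAH Cost Comparison Unit
Chapter 5 Implementation Results and Comparison
5.1 RTL Simulation Environment
5.2 Scene Rendering Verification
5.3 Experimental Results and Comparison
Chapter 6 Conclusions and Future Work
References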
References
[1] T. Whitted, “An improved illumination model for shaded display,” Commun. ACM, 23(6), 343-349, June 1980.
[2] J.-W. Kim, J.-M. Kim, M. Lee, and T.-D. Han, “Asynchronous BVH reconstruction on CPU-GPU hybrid architecture,” in ACM SIGGRAPH 2014 Posters. ACM, 2014. p. 91.
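[3] 呂曜達, “光線追蹤下基於SAH指標之低成本BVH建構電路設計” [Low-cost SAH-based BVH construction circuit design for ray tracing], M.S. thesis, National Sun Yat-sen University, Jul. 2014.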
[4] M. J. Doyle, C. Fowler, and M. Manzke, “A Hardware Unit for Fast SAH-Optimised BVH Construction,” ACM Transactions on Graphics, vol. 32, no. 4, pp. 139:1–139:10, Jul. 2013.
[5] J. Goldsmith, J. Salmon, “Automatic Creation of Object Hierarchies for Ray Tracing,” IEEE Computer Graphics and Applications, vol.7, no.5, pp.14-20, May 1987.
[6] J. D. MacDonald and K. S. Booth, “Heuristics for ray tracing using space subdivision,” The Visual Computer, vol. 6, no. 3, pp. 153–166, 1990.
[7] I. Wald, S. Boulos, and P. Shirley, “Ray tracing deformable scenes using dynamic bounding volume hierarchies,” ACM Transactions on Graphics, vol. 26, no. 1, pp. 6:1-6:18, Jan 2007.
[8] C. Lauterbach, M. Garland, S. Sengupta, D. Luebke, and D. Manocha, “Fast BVH construction on GPUs,” in Proc. Eurographics '09, Computer Graphics Forum, vol. 28, pp. 375-384, Mar. 2009.
[9] T. Viitanen, M. Koskela, P. Jääskeläinen, H. Kultala, and J. Takala, “MergeTree: a HLBVH constructor for mobile systems,” in SIGGRAPH Asia 2015 Technical Briefs. ACM, 2015. p. 12.
[10] J. Pantaleoni and D. Luebke, “HLBVH: Hierarchical LBVH construction for real-time ray tracing of dynamic geometry,” in Proceedings of the Conference on High Performance Graphics 2010. ACM, 2010, pp. 87-95.
[11] M. Stich, H. Friedrich, and A. Dietrich, “Spatial splits in bounding volume hierarchies,” in Proceedings of the Conference on High Performance Graphics 2009, 2009, pp. 7-13.
[12] I. Wald, “Fast construction of SAH BVHs on the Intel many integrated core (MIC) architecture,” IEEE Transactions on Visualization and Computer Graphics, vol. 18, no. 1, pp. 47–57, Jan. 2012.
[13] I. Wald, T. Ize, and S.G. Parker, “Fast, Parallel, and Asynchronous Construction of BVHs for Ray Tracing Animated Scenes,” Computers & Graphics, vol. 32, no. 1, pp. 3-13, 2008.
[14] T. Aila and S. Laine, “Understanding the efficiency of ray traversal on GPUs,” in Proceedings of the Conference on High Performance Graphics 2009. ACM, 2009, pp. 145-149.
[15] M. Zlatuška, and V. Havran, “Ray Tracing on a GPU with CUDA-Comparative Study of Three Algorithms,” WSCG’2010, pp. 69-76, Feb. 2010.
[16] M. Hapala, T. Davidovič, I. Wald, V. Havran, and P. Slusallek, “Efficient stack-less BVH traversal for ray tracing,” in Proceedings of the 27th Spring Conference on Computer Graphics. ACM, 2011. p. 7-12.
[17] S. Wong, Y. Cheng, and S. Lii, “GPU Ray Tracing Based on Reduced Bounding Volume Hierarchies,” in Computer Graphics, Imaging and Visualization (CGIV), 2012 Ninth International Conference on. IEEE, 2012, pp. 1-6.
[18] J. Bittner, M. Hapala, and V. Havran, “Incremental BVH construction for ray tracing,” Computers & Graphics, vol. 47, pp. 135-144, 2015.
[19] R.-K. Jonathan and I. Wald, “Reduced precision for hardware ray tracing in GPUs,” in Eurographics/ACM SIGGRAPH Symposium on High Performance Graphics. The Eurographics Association, 2014, pp. 29-40.
[20] M. Vinkler, V. Havran, J. Bittner, and J. Sochor, “Parallel on-demand hierarchy construction on contemporary GPUs,” IEEE Transactions on Visualization and Computer Graphics, vol. 22, no. 7, pp. 1886-1898, 2016.
[21] S. Zhao, Y. Cao, Y. Guo, S. Chen, and L. Chen, “A fast spatial partition method in bounding volume hierarchy,” in Software Engineering and Service Science (ICSESS), 2013 4th IEEE International Conference on, 2013. pp. 15-18.
[22] L. R. Domingues, and H. Pedrini, “Bounding volume hierarchy optimization through agglomerative treelet restructuring,” in Proceedings of the 7th Conference on High-Performance Graphics. ACM, 2015, pp. 13-20.
[23] X. Liu, Y. Deng, Y. Ni, and Z. Li, “FastTree: a hardware KD-tree construction acceleration engine for real-time ray tracing,” in Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015, pp. 1595-1598.
[24] J.-H. Nah, H.-J. Kwon, D.-S. Kim, C.-H. Jeong, J. Park, T.-D. Han, D. Manocha, and W.-C. Park, “RayCore: A ray-tracing hardware architecture for mobile devices,” ACM Transactions on Graphics, vol. 30, no. 6, 2014.
[25] W.-C. Park, H.-J. Shin, B. Lee, H. Yoon, and T.-D. Han, “RayChip: Real-time ray-tracing chip for embedded applications,” in Hot Chips 26, 2014.
[26] J. H. Nah, J. W. Kim, J. Park, W. J. Lee, J. S. Park, S. Y. Jung, W. C. Park, D. Manocha, and T. D. Han, “HART: A hybrid architecture for ray tracing animated scenes,” IEEE Transactions on Visualization and Computer Graphics, vol. 21, no. 3, pp. 389-401, 2015.
[27] G. Liktor and K. Vaidyanathan, “Bandwidth-efficient BVH layout for incremental hardware traversal,” in Proceedings of High Performance Graphics. Eurographics Association, 2016, pp. 51-61.
[28] 王群皓, “具效能分析之CPU/Cache/MMU/DRAM/Component模組異質整合於 QEMU-SystemC模擬:以三維圖形系統單晶片為例” [Heterogeneous integration of CPU/Cache/MMU/DRAM/Component modules with performance analysis in QEMU-SystemC simulation: a 3D graphics system-on-chip case study], M.S. thesis, National Sun Yat-sen University, Oct. 2012.
[29] D. Kopta, J. Spjut, E. Brunvand, and A. Davis, “Efficient MIMD architectures for high-performance ray tracing,” in IEEE International Conference on Computer Design, Oct. 2010, pp. 9-16.
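Fulltext
The electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for academic research. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.
Thesis access permission: user-defined release date
Available:
Campus: available
Off-campus: available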


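Printed copies
Public-access information for printed theses is relatively complete from academic year 102 onward. To look up public-access information for printed theses from academic year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
Available: available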
