Generalizing Ray Tracing Accelerators for Tree Traversals on GPUs

MICRO 2024


...

Abstract


Tree traversal is a fundamental operation in many applications, such as database indexing and physics simulations. Although tree traversals feature high parallelism, they are inherently divergent and irregular, leading to inefficient performance on GPUs. Tree traversals are also prevalent in ray tracing, which is executed on dedicated Ray-Tracing Accelerators (RTAs) in modern GPUs to mitigate inefficiencies such as control flow divergence and underutilization of memory bandwidth by irregular memory accesses. In this paper, we propose the Tree Traversal Accelerator (TTA) to replicate the success of RTAs in ray tracing for general tree traversal applications. TTAs extend RTAs to support tree structures and operations beyond those in ray tracing, such as B-Tree search and radius search algorithms, by modifying existing computing units. Despite TTAs' effectiveness, they still rely on fixed-function computations, making it challenging to support other tree-based applications such as N-Body simulation fully. Thus, we introduce TTA+ as an alternative design, which modularizes the RTA computing units and makes them programmable, trading some efficiency for flexibility. With less than 1% increase in RTA area, our proposals can achieve up to 5.4x speedup for B-Tree search, 1.7x for N-Body simulation, and 1.2x for select ray-tracing applications.

Resources


Download our paper here

Simulator code is available at https://github.com/ubc-aamodt-group/vulkan-sim/tree/generalized-tta

Full artifact on Zenodo https://doi.org/10.5281/zenodo.13294689