Efficient Relational Algebra Algorithms and Data Structures for GPU

dc.contributor.author Diamos, Gregory Frederick
dc.contributor.author Wu, Haicheng
dc.contributor.author Lele, Ashwin
dc.contributor.author Wang, Jin
dc.contributor.corporatename Georgia Institute of Technology. College of Computing en_US
dc.contributor.corporatename Georgia Institute of Technology. Center for Experimental Research in Computer Systems en_US
dc.contributor.corporatename nVIDIA Corporation en_US
dc.date.accessioned 2013-10-18T12:44:24Z
dc.date.available 2013-10-18T12:44:24Z
dc.date.issued 2012-02-01
dc.description.abstract Relational databases remain an important application domain for organizing and analyzing the massive volume of data generated as sensor technology, retail and inventory transactions, social media, computer vision, and new fields continue to evolve. At the same time, processor architectures are beginning to shift towards hierarchical and parallel architectures employing throughput-optimized memory systems, lightweight multi-threading, and Single-Instruction Multiple-Data (SIMD) core organizations. Examples include general purpose graphics processing units (GPUs) such as NVIDIA’s Fermi, Intel’s Sandy Bridge, and AMD’s Fusion processors. This paper explores the mapping of primitive relational algebra operations onto GPUs. In particular, we focus on algorithm and data structure design, identifying a fundamental conflict between the structure of algorithms with good computational complexity and that of algorithms with memory access patterns and instruction schedules that achieve peak machine utilization. To reconcile this conflict, our design space exploration converges on a hybrid multi-stage algorithm that devotes a small amount of the total runtime to pruning input data sets using an irregular algorithm with good computational complexity. The partial results are then fed into a regular algorithm that achieves near peak machine utilization. The design process leading to the most efficient algorithm for each stage is described, detailing alternative implementations, their performance characteristics, and an explanation of why they were ultimately abandoned. The least efficient algorithm (JOIN) achieves 57% to 72% of peak machine performance depending on the density of the input. The most efficient algorithms (PRODUCT, PROJECT, and SELECT) achieve 86% to 92% of peak machine performance across all input data sets. To the best of our knowledge, these represent the best known published results to date for any implementations.
This work lays the foundation for the development of a relational database system that achieves good scalability on a Multi-level Bulk-Synchronous-Parallel (Multi-BSP) processor architecture exemplified by modern GPUs. en_US
dc.embargo.terms null en_US
dc.identifier.uri http://hdl.handle.net/1853/49227
dc.language.iso en_US en_US
dc.publisher Georgia Institute of Technology en_US
dc.relation.ispartofseries CERCS ; GIT-CERCS-12-01 en_US
dc.subject Data structure en_US
dc.subject Database en_US
dc.subject GPU en_US
dc.subject Primitive en_US
dc.title Efficient Relational Algebra Algorithms and Data Structures for GPU en_US
dc.type Text
dc.type.genre Technical Report
dspace.entity.type Publication
local.contributor.corporatename Center for Experimental Research in Computer Systems
local.relation.ispartofseries CERCS Technical Report Series
relation.isOrgUnitOfPublication 1dd858c0-be27-47fd-873d-208407cf0794
relation.isSeriesOfPublication bc21f6b3-4b86-4b92-8b66-d65d59e12c54