Title:
Memory-Efficient GroupBy-Aggregate using Compressed Buffer Trees
Author(s)
Amur, Hrishikesh
Richter, Wolfgang
Andersen, David G.
Kaminsky, Michael
Schwan, Karsten
Balachandran, Athula
Zawadzki, Erik
Abstract
Memory is rapidly becoming a precious resource in many
data processing environments. This paper introduces
a new data structure called a Compressed Buffer Tree
(CBT). Using a combination of buffering, compression,
and lazy aggregation, CBTs can improve the memory
efficiency of the GroupBy-Aggregate abstraction, which
forms the basis of many data processing models such as
MapReduce and databases. We evaluate CBTs in the
context of MapReduce aggregation, and show that CBTs
can provide significant advantages over existing hash-based
aggregation techniques: up to 2x less memory
and 1.5x the throughput, at the cost of 2.5x the CPU usage.
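
To make the three ingredients named in the abstract (buffering, compression, and lazy aggregation) concrete, here is a minimal Python sketch of a GroupBy-Aggregate with a sum aggregate. It is not the CBT described in the report: it uses a single flat level of compressed runs rather than a tree. The class name `BufferedCompressedAggregator`, the `buffer_limit` parameter, and the choice of `zlib` and `pickle` are all assumptions made for illustration.

```python
import pickle
import zlib
from collections import defaultdict


class BufferedCompressedAggregator:
    """Toy GroupBy-Aggregate (sum). Illustrative only; NOT the paper's CBT."""

    def __init__(self, buffer_limit=4):
        self.buffer_limit = buffer_limit  # hypothetical knob: pairs held before a flush
        self.buffer = []                  # uncompressed (key, value) pairs
        self.chunks = []                  # compressed, partially aggregated runs

    def insert(self, key, value):
        # Buffering: appending is cheap; no per-key hash-table entry is kept alive.
        self.buffer.append((key, value))
        if len(self.buffer) >= self.buffer_limit:
            self._flush()

    def _flush(self):
        if not self.buffer:
            return
        # Partial aggregation: combine duplicate keys within this buffer only.
        partial = defaultdict(int)
        for key, value in self.buffer:
            partial[key] += value
        run = sorted(partial.items())
        # Compression: the run stays in memory only in compressed form.
        self.chunks.append(zlib.compress(pickle.dumps(run)))
        self.buffer.clear()

    def finalize(self):
        # Lazy aggregation: keys are fully combined across runs only at the end.
        self._flush()
        totals = defaultdict(int)
        for chunk in self.chunks:
            for key, value in pickle.loads(zlib.decompress(chunk)):
                totals[key] += value
        return dict(totals)


def word_count(words, buffer_limit=4):
    """Count word occurrences using the toy aggregator above."""
    agg = BufferedCompressedAggregator(buffer_limit=buffer_limit)
    for word in words:
        agg.insert(word, 1)
    return agg.finalize()
```

Calling `word_count("a b a c b a".split())` groups the six words by key and sums their counts. The sketch also shows where the trade-off in the abstract comes from: memory is saved by holding compressed runs, and extra CPU is spent on compressing, decompressing, and re-merging them.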
Date Issued
2012
Resource Type
Text
Resource Subtype
Technical Report