Title:
GRUBJOIN: An Adaptive Multi-Way Windowed Stream Join with Time Correlation-Aware CPU Load Shedding

dc.contributor.author Gedik, Bugra
dc.contributor.author Wu, Kun-Lung
dc.contributor.author Yu, Philip S.
dc.contributor.author Liu, Ling
dc.date.accessioned 2006-02-07T19:38:47Z
dc.date.available 2006-02-07T19:38:47Z
dc.date.issued 2005
dc.description.abstract Dropping tuples has been commonly used for load shedding. However, tuple dropping generally is inadequate to shed load for multiway windowed stream joins. The output rate can be unnecessarily and severely degraded because tuple dropping does not recognize time correlations likely to exist among the streams. This paper introduces GrubJoin: an adaptive multi-way windowed stream join that efficiently performs time correlation-aware CPU load shedding. GrubJoin maximizes the output rate by achieving nearoptimal window harvesting within an operator throttling framework, i.e., regulating the fractions of the join windows that are processed by the multi-way join. Window harvesting performs the join using only certain more useful segments of the join windows. Due mainly to the combinatorial explosion of possible multi-way join sequences involving various segments of individual join windows, GrubJoin faces a set of unique challenges, such as determining the optimal window harvesting configuration and learning the time correlations among the streams. To tackle these challenges, we formalize window harvesting as an optimization problem, develop greedy heuristics to determine near-optimal window harvesting configurations and use approximation techniques to capture the time correlations among the streams. Experimental results show that GrubJoin is vastly superior to tuple dropping when time correlations exist among the streams and is equally effective as tuple dropping in the absence of time correlations. en
dc.format.extent 693904 bytes
dc.format.mimetype application/pdf
dc.identifier.uri http://hdl.handle.net/1853/7711
dc.language.iso en_US en
dc.publisher Georgia Institute of Technology en
dc.relation.ispartofseries CERCS;GIT-CERCS-05-19 en
dc.subject Data streams
dc.subject GrubJoin
dc.subject Heuristics
dc.subject Load shedding
dc.subject Join algorithms
dc.subject Time correlation-aware CPU load shedding
dc.subject Tuples
dc.title GRUBJOIN: An Adaptive Multi-Way Windowed Stream Join with Time Correlation-Aware CPU Load Shedding en
dc.type Text
dc.type.genre Technical Report
dspace.entity.type Publication
local.contributor.author Liu, Ling
local.contributor.corporatename Center for Experimental Research in Computer Systems
local.relation.ispartofseries CERCS Technical Report Series
relation.isAuthorOfPublication 96391b98-ac42-4e2c-93ee-79a5e16c2dfb
relation.isOrgUnitOfPublication 1dd858c0-be27-47fd-873d-208407cf0794
relation.isSeriesOfPublication bc21f6b3-4b86-4b92-8b66-d65d59e12c54
Files
Original bundle
Now showing 1 - 1 of 1
Thumbnail Image
Name:
git-cercs-05-19.pdf
Size:
677.64 KB
Format:
Adobe Portable Document Format
Description:
License bundle
Now showing 1 - 1 of 1
No Thumbnail Available
Name:
license.txt
Size:
1.86 KB
Format:
Item-specific license agreed upon to submission
Description: