Title
Write-Optimized Indexing for Log-Structured Key-Value Stores
Authors
Tang, Yuzhe
Iyengar, Arun
Tan, Wei
Fong, Liana
Liu, Ling
Abstract
The recent shift towards write-intensive workloads on big data (e.g., financial trading, social user-generated data streams) has driven the proliferation of log-structured key-value stores, such as Google’s BigTable, HBase, and Cassandra; these systems optimize write performance by adopting a log-structured merge design. While they provide key-based access methods through a Put/Get interface, these key-value stores do not support value-based access methods, which significantly limits their applicability in many web and Internet applications, such as real-time search for all tweets or blogs containing “government shutdown”. In this paper, we present HINDEX, a write-optimized indexing scheme for log-structured key-value stores. To index intensively updated big data in real time, index maintenance is made lightweight by a design tailored to the unique characteristics of the underlying log-structured key-value stores. Concretely, HINDEX performs append-only index updates, which avoid reading historic data versions, an expensive operation in a log-structured store. To fix potentially obsolete index entries, HINDEX introduces an offline index repair process that is tightly coupled with routine compactions. HINDEX’s system design is generic to the Put/Get interface; we implemented a prototype of HINDEX on top of HBase without modifying its internal code. Our experiments show that HINDEX offers a significant performance advantage for write-intensive index maintenance.
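The two mechanisms named in the abstract, append-only index updates and offline repair during compaction, can be illustrated with a minimal Python sketch. This is a toy model under stated assumptions, not the HINDEX implementation: `ToyStore`, `AppendOnlyIndex`, and their methods are hypothetical names, the in-memory dictionaries stand in for a real log-structured store, and the read-time check against the base table is an assumed way to hide stale entries before repair runs.

```python
from collections import defaultdict


class ToyStore:
    """Hypothetical stand-in for a log-structured store: puts only append versions."""

    def __init__(self):
        self.versions = defaultdict(list)  # key -> [(timestamp, value), ...]

    def put(self, key, value, ts):
        self.versions[key].append((ts, value))

    def get(self, key):
        vs = self.versions.get(key)
        if not vs:
            return None
        return max(vs, key=lambda tv: tv[0])[1]  # value of the newest version


class AppendOnlyIndex:
    """Hypothetical value-to-key index maintained without reading old versions."""

    def __init__(self, base):
        self.base = base
        self.entries = set()  # (value, key) pairs; may contain obsolete entries
        self.clock = 0

    def put(self, key, value):
        # Append-only: write the data and the new index entry,
        # but never read the previous value to delete its entry.
        self.clock += 1
        self.base.put(key, value, self.clock)
        self.entries.add((value, key))

    def lookup(self, value):
        # Assumption: stale entries are filtered by checking the base table.
        return sorted(k for (v, k) in self.entries
                      if v == value and self.base.get(k) == value)

    def repair(self):
        # Offline repair, modeled as part of a compaction pass: old versions
        # are already being scanned, so their index entries are dropped here.
        for key, vs in list(self.base.versions.items()):
            latest = max(vs, key=lambda tv: tv[0])
            for _, old_value in vs:
                if old_value != latest[1]:
                    self.entries.discard((old_value, key))
            self.base.versions[key] = [latest]  # compaction keeps the newest version


base = ToyStore()
index = AppendOnlyIndex(base)
index.put("tweet1", "red")
index.put("tweet2", "red")
index.put("tweet1", "blue")     # leaves ("red", "tweet1") behind as obsolete
before = index.lookup("red")    # stale entry is hidden by the read-time check
index.repair()                  # stale entry is physically removed
```

In this sketch the write path touches no existing data, and the cost of cleaning up obsolete entries is deferred to a pass that already reads every version of every key.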
Date Issued
2014
Resource Type
Text
Resource Subtype
Technical Report