CERCS Technical Report Series

Series Type
Publication Series
Associated Organization(s)

Publication Search Results

Now showing 1 - 10 of 193
  • Item
    Shared Address Space I/O: A Novel I/O Approach for System-on-a-Chip Networking
    (Georgia Institute of Technology, 2004-02-19) Sun, Di-Shi ; Blough, Douglas M.
    For real-time system-on-a-chip (SoC) network applications, high-speed, low-latency network I/O is key to achieving predictable execution and high performance. Existing network I/O approaches are either not directly suited to SoC applications or too complicated and expensive. This paper introduces a novel approach, referred to as shared address space I/O, for real-time SoC network applications. The approach facilitates the building of heterogeneous multiprocessor systems comprising application-intensive processors (main processors) and I/O-intensive processors (I/O processors), in which network I/O processing can be offloaded to a specialized I/O processor. With shared address space I/O, communication and synchronization between the main and I/O processors are implemented through a shared address space. The approach is realized in Atalanta, a heterogeneous real-time SoC operating system we have developed. In this paper, we demonstrate that shared address space I/O can provide high-speed, low-latency network I/O for SoC network applications.
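The core idea, offloading network I/O to an I/O processor that shares an address space with the main processor, can be sketched with a single-producer/single-consumer ring buffer. This is a toy illustration only: the class and method names are ours, and the paper's Atalanta OS implements the mechanism on real heterogeneous SoC hardware rather than in Python.

```python
# Toy sketch of shared-address-space I/O (hypothetical names).
# The main and I/O "processors" communicate through one shared ring
# buffer, so no data is copied between separate address spaces.

class SharedRing:
    """Single-producer/single-consumer ring in a shared address space."""
    def __init__(self, slots):
        self.buf = [None] * slots
        self.head = 0   # next slot the producer (I/O processor) writes
        self.tail = 0   # next slot the consumer (main processor) reads
        self.slots = slots

    def put(self, pkt):
        nxt = (self.head + 1) % self.slots
        if nxt == self.tail:          # ring full: refuse, never overwrite
            return False
        self.buf[self.head] = pkt
        self.head = nxt
        return True

    def get(self):
        if self.tail == self.head:    # ring empty
            return None
        pkt = self.buf[self.tail]
        self.tail = (self.tail + 1) % self.slots
        return pkt

# The I/O processor would fill the ring from the NIC; the main
# processor drains it without any cross-address-space copy.
ring = SharedRing(4)
ring.put(b"pkt0")
ring.put(b"pkt1")
received = [ring.get(), ring.get()]
```

Because producer and consumer touch disjoint indices, this shape needs only lightweight synchronization, which is one reason a shared address space can beat message-based I/O paths on latency.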
  • Item
    µsik -- A Micro Kernel for Parallel/Distributed Simulation
    (Georgia Institute of Technology, 2004-05-26) Perumalla, Kalyan S.
    We present a novel micro-kernel approach to parallel/distributed simulation. Using the micro-kernel approach, we develop a unified architecture for incorporating multiple types of simulation processes. Processes can employ a variety of synchronization mechanisms and can alter their choice of mechanism dynamically. Supported mechanisms include traditional lookahead-based conservative and state saving-based optimistic execution approaches, as well as newer mechanisms such as reverse computation-based optimistic execution and aggregation-based event processing, all within a single parsimonious application programming interface (API). We also present the internal implementation and a preliminary performance evaluation of this interface in µsik, an efficient parallel/distributed realization of our micro-kernel architecture in C++.
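The separation the abstract describes, a small kernel that schedules logical processes while each process carries its own synchronization policy, can be sketched as follows. All names are hypothetical and the conservative-lookahead rule shown is just one of the policies µsik supports; its real C++ API differs.

```python
# Minimal sketch of a micro-kernel scheduling simulation processes with
# pluggable synchronization (hypothetical API, illustration only).
import heapq

class ConservativeLP:
    """Logical process that only executes events proven safe by lookahead."""
    def __init__(self, name, lookahead):
        self.name, self.lookahead = name, lookahead
        self.events = []          # (timestamp, payload) min-heap
        self.processed = []

    def post(self, ts, payload):
        heapq.heappush(self.events, (ts, payload))

    def safe_time(self, peer_min_ts):
        # Events earlier than the peers' minimum timestamp plus this LP's
        # lookahead cannot be invalidated by any future arrival.
        return peer_min_ts + self.lookahead

    def execute_until(self, bound):
        while self.events and self.events[0][0] < bound:
            self.processed.append(heapq.heappop(self.events))

class MicroKernel:
    """Dispatches each LP up to its own safe bound. The sync policy lives
    in the LP, not the kernel, so an optimistic LP could coexist here."""
    def __init__(self):
        self.lps = []

    def add(self, lp):
        self.lps.append(lp)

    def step(self):
        global_min = min((lp.events[0][0] for lp in self.lps if lp.events),
                         default=float("inf"))
        for lp in self.lps:
            lp.execute_until(lp.safe_time(global_min))

kernel = MicroKernel()
lp = ConservativeLP("lp0", lookahead=1.0)
lp.post(0.5, "arrive")
lp.post(2.0, "depart")
kernel.add(lp)
kernel.step()   # only the event at t=0.5 is within the safe bound of 1.5
```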
  • Item
    Using Hierarchies for Optimizing Distributed Stream Queries
    (Georgia Institute of Technology, 2006) Seshadri, Sangeetha ; Kumar, Vibhore ; Cooper, Brian F.
    We consider the problem of query optimization in distributed data stream systems where multiple continuous queries may execute simultaneously. To achieve the best performance, query planning (such as join ordering) must be considered in conjunction with deployment planning (e.g., assigning operators to physical nodes). In our scenario, the large number of network nodes, query operators, and opportunities for operator sharing between queries means that brute-force and traditional techniques are too expensive. We propose two algorithms, Bottom-Up and Top-Down, which use hierarchical network partitions to provide scalable query optimization. We present analysis that establishes bounds on the search space and on the sub-optimality achieved by our algorithms. Finally, through simulations and experiments using a prototype deployed on Emulab, we demonstrate the effectiveness of our algorithms. The Top-Down algorithm, for instance, achieved solutions that were, on average, sub-optimal by only 10% while considering less than 1% of the search space.
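The way hierarchy shrinks the deployment search space can be shown on a toy chain query. The cost model, node layout, and refinement step below are invented for illustration; the paper's Bottom-Up and Top-Down algorithms are considerably more sophisticated, but the coarse-plan-then-refine structure is the same idea.

```python
# Toy illustration of hierarchy-based search-space reduction
# (hypothetical cost model; chain query source@0 -> ops -> sink@7).
from itertools import product

NODES = list(range(8))                     # node i sits at position i
PARTITIONS = [[0, 1, 2, 3], [4, 5, 6, 7]]  # a two-level hierarchy
PART_OF = {n: p for p in PARTITIONS for n in p}

def cost(assign):
    # total link distance traversed by the stream through the operators
    path = (0,) + tuple(assign) + (7,)
    return sum(abs(a - b) for a, b in zip(path, path[1:]))

def flat_plan(n_ops):
    """Brute force over |nodes|**n_ops assignments."""
    return min(product(NODES, repeat=n_ops), key=cost)

def hierarchical_plan(n_ops):
    """Top-Down flavor: plan over one representative per partition
    (|partitions|**n_ops plans), then refine each operator inside
    its chosen partition with the others held fixed."""
    reps = [p[0] for p in PARTITIONS]
    assign = list(min(product(reps, repeat=n_ops), key=cost))
    for i in range(n_ops):
        assign[i] = min(PART_OF[assign[i]],
                        key=lambda n: cost(assign[:i] + [n] + assign[i + 1:]))
    return tuple(assign)

flat_cost = cost(flat_plan(3))          # examines 8**3 = 512 assignments
hier_cost = cost(hierarchical_plan(3))  # examines 2**3 + 3*4 = 20
```

Here the hierarchical plan happens to match the brute-force optimum while evaluating about 4% of the assignments; in general the paper's analysis bounds how far such plans can be from optimal.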
  • Item
    Thermal-aware 3D Microarchitectural Floorplanning
    (Georgia Institute of Technology, 2004) Ekpanyapong, Mongkol ; Healy, Michael ; Ballapuram, Chinnakrishnan S. ; Lim, Sung Kyu ; Lee, Hsien-Hsin Sean ; Loh, Gabriel H.
    Next-generation deep submicron processor design will need to take many performance-limiting factors into consideration. Flip-flops are inserted to keep global wire delay from becoming nonlinear, enabling deeper pipelines and higher clock frequencies. The move to 3D ICs will likely shorten wirelength further, but it will cause thermal issues to become a major bottleneck to performance improvement. In this paper we propose a floorplanning algorithm that uses mathematical programming to take into consideration both thermal issues and profile-weighted wirelength. Our profile-driven objective improves performance by 20% over a wirelength-driven objective, while the thermal-driven objective improves temperature by 24% on average over the profile-driven case.
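The trade-off between the two objectives can be illustrated with a weighted cost over candidate floorplans. All numbers, weights, and names below are invented; the paper formulates this as a mathematical program over real block geometry, not a pick-from-a-list selection.

```python
# Toy weighted objective in the spirit of profile- and thermal-driven
# floorplanning (all candidate data and weights are made up).
def combined_cost(plan, alpha=1.0, beta=2.0):
    # plan["wires"]: (length, access_frequency) pairs from profiling
    weighted_wl = sum(l * f for l, f in plan["wires"])
    return alpha * weighted_wl + beta * plan["peak_temp"]

candidates = [
    # densely stacked: short hot wires, but heat concentrates in 3D
    {"name": "stacked_hot", "wires": [(2, 0.9), (1, 0.1)], "peak_temp": 95},
    # spread out: slightly longer wires, much cooler hotspot
    {"name": "spread_cool", "wires": [(3, 0.9), (2, 0.1)], "peak_temp": 70},
]
best = min(candidates, key=combined_cost)
```

With the thermal weight beta active, the cooler plan wins despite its longer profile-weighted wirelength, which mirrors the paper's observation that a thermal-driven objective trades some wirelength for temperature.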
  • Item
    PreDatA - Preparatory Data Analytics on Peta-Scale Machines
    (Georgia Institute of Technology, 2010) Zheng, Fang ; Abbasi, Hasan ; Docan, Ciprian ; Lofstead, Jay ; Klasky, Scott ; Liu, Qing ; Parashar, Manish ; Podhorszki, Norbert ; Schwan, Karsten ; Wolf, Matthew ; Georgia Institute of Technology. College of Computing ; Georgia Institute of Technology. Center for Experimental Research in Computer Systems ; Rutgers University. Center for Autonomic Computing ; Oak Ridge National Laboratory
    Peta-scale scientific applications running on High End Computing (HEC) platforms can generate large volumes of data. For high-performance storage, and in order to be useful to science end users, such data must be organized in its layout, indexed, sorted, and otherwise manipulated for subsequent data presentation, visualization, and detailed analysis. In addition, scientists desire to gain insights into selected data characteristics ‘hidden’ or ‘latent’ in the massive datasets while the data is being produced by simulations. PreDatA, short for Preparatory Data Analytics, is an approach for preparing and characterizing data while it is being produced by large-scale simulations running on peta-scale machines. By dedicating additional compute nodes on the peta-scale machine as staging nodes and staging the simulation’s output data through these nodes, PreDatA can exploit their computational power to perform selected data manipulations with lower latency than attainable by first moving the data into file systems and storage. Such in-transit manipulations are supported by the PreDatA middleware through RDMA-based data movement that reduces write latency, application-specific operations on streaming data that discover latent data characteristics, and data reorganization and metadata annotation that speed up subsequent data access. As a result, PreDatA enhances the scalability and flexibility of the current I/O stack on HEC platforms and is useful for data pre-processing, runtime data analysis and inspection, and data exchange between concurrently running simulation models. Performance evaluations with several production peta-scale applications on Oak Ridge National Laboratory’s Leadership Computing Facility demonstrate the feasibility and advantages of the PreDatA approach.
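The staging pattern, in which output flows through intermediate nodes that manipulate it in transit before it reaches storage, can be sketched with threads and a queue. This is only a shape sketch: the queue stands in for PreDatA's RDMA transport, and the sort-plus-index step is a made-up example of an application-specific in-transit manipulation.

```python
# Sketch of in-transit staging: a "simulation" thread hands raw output
# to a staging worker, which sorts it and attaches a min/max index
# record before it lands in "storage" (names and policy are ours).
import threading
import queue

raw = queue.Queue()   # simulation -> staging node (stand-in for RDMA)
storage = []          # what finally lands on disk

def staging_worker():
    while True:
        chunk = raw.get()
        if chunk is None:                 # shutdown sentinel
            break
        ordered = sorted(chunk)           # in-transit reorganization
        storage.append({"index": (ordered[0], ordered[-1]),  # metadata
                        "data": ordered})

stager = threading.Thread(target=staging_worker)
stager.start()
for chunk in ([3, 1, 2], [9, 7, 8]):      # two simulation output steps
    raw.put(chunk)
raw.put(None)
stager.join()
```

The simulation thread never blocks on sorting or indexing, which is the latency argument: the staging nodes absorb the manipulation cost while data is already in motion toward storage.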
  • Item
    A Hybrid Access Model for Storage Area Networks
    (Georgia Institute of Technology, 2004) Singh, Aameek ; Voruganti, Kaladhar ; Gopisetty, Sandeep ; Pease, David ; Liu, Ling
    We present HSAN, a hybrid storage area network that uses both in-band (like NFS) and out-of-band (like SAN FS) virtualization access models. Using hybrid servers that can serve as both metadata and NAS servers, HSAN intelligently decides the access model for each request based on the characteristics of the requested data. The hybrid model is implemented using low-overhead cache-admission and cache-replacement schemes and aims to improve overall response times for a wide variety of workloads. Preliminary analysis of the hybrid model indicates performance improvements over both models.
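A per-request decision of this kind might look like the sketch below. The thresholds and the specific policy are invented for illustration; HSAN's actual cache-admission and cache-replacement schemes are more nuanced than a size cutoff.

```python
# Hypothetical per-request access-model decision (illustrative only).
def choose_access_model(size_bytes, cached_on_hybrid_server,
                        small_file_limit=64 * 1024):
    """In-band (NAS-style) for small or already-cached data, where one
    round trip to the hybrid server is cheapest; out-of-band (SAN-style)
    for large data, where direct block access amortizes the metadata
    lookup over a long transfer."""
    if cached_on_hybrid_server or size_bytes <= small_file_limit:
        return "in-band"
    return "out-of-band"

# a 4 KB config file vs. a 10 MB media file, neither cached:
decisions = (choose_access_model(4096, False),
             choose_access_model(10 * 1024 * 1024, False))
```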
  • Item
    POD: A Parallel-On-Die Architecture
    (Georgia Institute of Technology, 2007-05) Woo, Dong Hyuk ; Fryman, Joshua Bruce ; Knies, Allan D. ; Eng, Marsha ; Lee, Hsien-Hsin Sean
    As power constraints, complexity, and design verification cost make it difficult to improve single-stream performance, the parallel computing paradigm is taking its place among mainstream high-volume architectures. Most current commercial designs focus on MIMD-style CMPs built from rather complex single cores. While such designs provide a degree of generality, they may not be the most efficient way to build processors for applications with inherently scalable parallelism. These designs have been proven to work well for certain classes of applications, such as transaction processing, but they have driven the development of new languages and complex architectural features. Instead of building MIMD CMPs for all workloads, we propose an alternative parallel on-die many-core architecture called POD, based on a large SIMD PE array. POD addresses the key challenges of on-chip communication bandwidth, area limitations, and router energy consumption by factoring out features necessary for MIMD machines and focusing on architectures that match many scalable workloads. In this paper, we evaluate and quantify the advantages of the POD architecture by basing its ISA on a commercially relevant CISC architecture, and we show that it can be as efficient as more specialized array processors based on one-off ISAs. Our single-chip POD is capable of best-in-class scalar performance and up to 1.5 TFLOPS of single-precision floating-point arithmetic. Our experimental results show that in some application domains our architecture can achieve nearly linear speedup on a large number of SIMD PEs, a speedup much greater than the maximum that MIMD CMPs on the same die size can achieve. Furthermore, owing to its synchronized computation and communication, POD can efficiently suppress the energy consumed by the novel communication method in our interconnection network.
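The SIMD execution model underlying POD, one instruction stream applied in lockstep across many processing-element lanes, can be sketched in a few lines. This toy omits everything that makes POD interesting in hardware (the on-die interconnect, the CISC-derived ISA, the energy accounting); it only shows the lockstep model itself, with names of our choosing.

```python
# Toy lockstep SIMD PE array: one instruction, many data lanes.
class SimdArray:
    def __init__(self, lanes):
        self.regs = [0] * lanes   # one register per processing element

    def broadcast(self, values):
        self.regs = list(values)  # load each lane's private data

    def execute(self, op):
        # every PE applies the same op to its own lane in the same cycle
        self.regs = [op(x) for x in self.regs]

pod = SimdArray(4)
pod.broadcast([1, 2, 3, 4])
pod.execute(lambda x: x * x)      # one instruction, four results
```

Because all lanes advance together, there is no per-lane control or routing decision to pay for, which is the intuition behind the abstract's claim about suppressed communication energy.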
  • Item
    KStreams: Kernel Support for Efficient End-to-End Data Streaming
    (Georgia Institute of Technology, 2004) Kong, Jiantao ; Schwan, Karsten
    Technology advances are enabling increasingly data-intensive applications, ranging from peer-to-peer file sharing, to multimedia, to remote graphics and data visualization. One outcome is the considerable memory pressure imposed on the machines involved, caused by application-specific data movements and by repeated crossings of user/kernel boundaries. We address this problem with a novel system service, termed KStreams, a general facility for manipulating data without intermediate buffers as it moves across multiple kernel objects, such as files or sockets. KStreams may be used to implement kernel-level services that range from application-specific implementations of sendfile commands, to data mirroring or proxy functions, to fast path data conversions and transformations for data streaming. The KStreams API permits individual applications to define fast path operations, which then execute at kernel level and, if desired, without further application involvement. By placing application-specific data manipulations into data movement fast paths, user/kernel boundary crossings are avoided. By operating on data streams 'in-flight', data buffering is made unnecessary, further reducing the memory pressure imposed on machines. KStreams is implemented on Linux kernel version 2.4.22. Its evaluation uses data-intensive tasks performed in conjunction with modern web services, such as proxy functions, remote media streaming, and data visualization. Initial experiences with the KStreams implementation are encouraging: fast path data transformation via KStreams increases throughput by 20-50% compared to user-level data manipulations. Future work will apply KStreams to complex multi-machine web services, evaluated with representative user loads and applications.
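The fast-path idea, transforming data while it flows between two kernel objects rather than buffering it in user space, can be sketched with generators standing in for the in-kernel stream splicing. The function names are ours and nothing here resembles the real KStreams API; the sketch only shows why in-flight transformation needs no full intermediate buffer.

```python
# Sketch of an in-flight fast path (generators stand in for KStreams'
# in-kernel splicing between, e.g., a file and a socket).
def source(chunks):
    for c in chunks:            # e.g. chunks read from a file
        yield c

def fastpath(stream, transform):
    # each chunk is transformed as it passes through; at no point does
    # the whole stream exist in an intermediate buffer
    for chunk in stream:
        yield transform(chunk)

def sink(stream):
    return b"".join(stream)     # e.g. chunks written to a socket

out = sink(fastpath(source([b"hello ", b"world"]), bytes.upper))
```

In the real system the transform runs at kernel level, so the data never crosses the user/kernel boundary at all; the generator pipeline only mimics the chunk-at-a-time dataflow.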
  • Item
    Using Byzantine Quorum Systems to Manage Confidential Data
    (Georgia Institute of Technology, 2004-04-01) Subbiah, Arun ; Ahamad, Mustaque ; Blough, Douglas M.
    This paper addresses the problem of using proactive cryptosystems for generic data storage and retrieval. Proactive cryptosystems provide high security and confidentiality guarantees for stored data and are capable of withstanding attacks that may compromise all the servers in the system over time. However, proactive cryptosystems are unsuitable for generic data storage for two reasons. First, proactive cryptosystems are usually used to store keys, which are rarely updated, whereas generic data may be actively written and read; the system must therefore be highly available for both write and read operations. Second, existing share renewal protocols (the critical element in achieving proactive security) are expensive in terms of computation and communication overheads and are time-consuming operations. Since generic data will be voluminous, the share renewal process will consume substantial system resources and cause a significant amount of system downtime. Two schemes are proposed that combine Byzantine quorum systems and proactive secret sharing techniques to provide high availability and security guarantees for stored data, while reducing the overhead incurred during the share renewal process. Several performance metrics that can be used to evaluate proactively secure generic data storage schemes are identified. The proposed schemes are thus shown to render proactive systems suitable for confidential generic data storage.
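The two building blocks the paper combines, threshold secret sharing and proactive share renewal, can be sketched concretely with Shamir sharing over a small prime field. The field size and parameters are illustrative, and this sketch omits the Byzantine quorum protocol and the verifiability machinery that real proactive schemes require; it only shows why renewal changes every share while leaving the secret intact.

```python
# Toy Shamir secret sharing with proactive renewal (illustrative field;
# real schemes use large fields plus verifiable sharing and quorums).
import random

P = 2087  # small prime field modulus

def poly_eval(coeffs, x):
    return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P

def share(secret, k, n):
    """Split secret into n shares; any k reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(k - 1)]
    return {x: poly_eval(coeffs, x) for x in range(1, n + 1)}

def reconstruct(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    pts = list(shares.items())
    secret = 0
    for i, (xi, yi) in enumerate(pts):
        num = den = 1
        for j, (xj, _) in enumerate(pts):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

def renew(shares, k):
    """Proactive renewal: every server adds a fresh share of zero, so
    all shares change (old leaked shares become useless with new ones)
    while the underlying secret is unchanged."""
    zero_shares = share(0, k, len(shares))
    return {x: (y + zero_shares[x]) % P for x, y in shares.items()}

old = share(secret=1234, k=3, n=5)
new = renew(old, k=3)
quorum = dict(list(new.items())[:3])   # any k fresh shares suffice
```

The cost visible even in this toy, a full resharing pass touching every server, is exactly the renewal overhead whose amortization over voluminous generic data motivates the paper's schemes.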
  • Item
    Autonomic Information Flows
    (Georgia Institute of Technology, 2005) Schwan, Karsten ; Cooper, Brian F. ; Eisenhauer, Greg S. ; Gavrilovska, Ada ; Wolf, Matthew ; Abbasi, Hasan ; Agarwala, Sandip ; Cai, Zhongtang ; Kumar, Vibhore ; Lofstead, Jay ; Mansour, Mohamed S. ; Seshasayee, Balasubramanian ; Widener, Patrick M. (Patrick McCall)
    Today's enterprise systems and applications implement functionality that is critical to the ability of society to function. These complex distributed applications, therefore, must meet dynamic criticality objectives even when running on shared heterogeneous and dynamic computational and communication infrastructures. Focusing on the broad class of applications structured as distributed information flows, the premise of our research is that it is difficult, if not impossible, to meet their dynamic service requirements unless these applications exhibit autonomic or self-adjusting behaviors that are 'vertically' integrated with underlying distributed systems and hardware. Namely, their autonomic functionality should extend beyond the dynamic load balancing or request routing explored in current web-based software infrastructures to (1) exploit the ability of middleware or systems to be aware of underlying resource availabilities, (2) dynamically and jointly adjust the behaviors of interacting elements of the software stack being used, and even (3) dynamically extend distributed platforms with enterprise functionality (e.g., network-level business rules for data routing and distribution). The resulting vertically integrated systems can meet stringent criticality or performance requirements, reduce potentially conflicting behaviors across applications, middleware, systems, and resources, and prevent breaches of the 'performance firewalls' that isolate critical from non-critical applications. This paper uses representative information flow applications to argue the importance of vertical integration for meeting criticality requirements. This is followed by a description of the AutoFlow middleware, which offers methods that drive the control of application services with runtime knowledge of current resource behavior. Finally, we demonstrate the opportunities derived from the additional ability of AutoFlow to enhance such methods by also dynamically extending and controlling the underlying software stack, first to better understand its behavior and second, to dynamically customize it to better meet current criticality requirements.
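The autonomic behavior described, application services adjusting themselves using runtime knowledge of resource availability, can be illustrated with a tiny adaptation policy. The thresholds and fidelity levels are entirely hypothetical; AutoFlow's real controllers span middleware and lower stack layers rather than a single function.

```python
# Hypothetical autonomic adaptation policy for an information-flow
# operator: degrade stream fidelity as measured bandwidth drops, so
# critical data still arrives on time instead of stalling the flow.
def pick_fidelity(available_bandwidth_mbps):
    if available_bandwidth_mbps >= 100:
        return "full"            # ship complete data products
    if available_bandwidth_mbps >= 10:
        return "downsampled"     # shed detail, keep structure
    return "summary-only"        # last resort: metadata and aggregates

# a monitoring loop would feed in fresh measurements each period:
responses = [pick_fidelity(bw) for bw in (500, 40, 2)]
```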