Person:

Liu, Ling

Permanent Link

https://hdl.handle.net/1853/71467

Associated Organization(s)

Organizational Unit

School of Computer Science

Publication Search Results

What Where Wi: an Analysis of Millions of Wi-Fi Access Points

(Georgia Institute of Technology, 2006) Jones, R. Kipp ; Liu, Ling

With the growing demand for wireless Internet access and increasing maturity of IEEE 802.11 technologies, wireless networks have sprung up by the millions throughout the world as a popular means for Internet access at homes, in offices and in public areas, such as airports, cafés and coffee shops. An increasingly popular use of IEEE 802.11 networking equipment is to provide wireless "hotspots" as the wireless access points to the Internet. These wireless access points, commonly referred to as WAPs or simply APs, are installed and managed by individuals and businesses in an unregulated manner, allowing anyone to install and operate one of these radio devices using unlicensed radio spectrum. This has allowed literally millions of these APs to become available and "visible" to any interested party who happens to be within range of the radio waves emitted from the device. As the density of these APs increases, these "beacons" can be put to multiple uses. From home networking to wireless positioning to mesh networks, there are more alternative ways for connecting wirelessly as newer, longer-range technologies come to market. This paper reports an initial study that examines a database of over 5 million wireless access points collected through wardriving by Skyhook Wireless. By analyzing this data and the information it reveals, including the default naming behavior, movement of access points over time, and density of access points, we found that the AP data, coupled with location information, can provide a fertile ground for understanding the "What, Where and Why" of Wi-Fi access points. More importantly, the analysis and mining of this vast and growing collection of AP data can yield important technological, social and economic results.
LIRA: Lightweight, Region-aware Load Shedding in Mobile CQ Systems

(Georgia Institute of Technology, 2006) Gedik, Bugra ; Liu, Ling ; Wu, Kun-Lung ; Yu, Philip S.

Position updates and query re-evaluations are two predominant, costly components of processing location-based, continual queries (CQs) in mobile systems. To obtain high-quality query results, the query processor usually demands frequent position updates from the mobile nodes. However, processing frequent updates often overloads the query processor, which must then drop updates randomly, lowering the quality of query results and negating the benefits of frequent position updates. In this paper, we develop LIRA, a lightweight, region-aware load-shedding technique for preventively reducing the position-update load of a query processor while maintaining high-quality query results. Instead of having to receive too many updates and then randomly drop some of them, LIRA uses a region-aware partitioning mechanism to identify the most beneficial shedding regions and cut down the position updates sent by the mobile nodes within those regions. Based on the number of mobile nodes and queries in a region, LIRA judiciously applies different amounts of update reduction for different regions, maintaining better overall accuracy of query results. Experimental results show that LIRA is vastly superior to random update dropping and clearly outperforms other alternatives that do not possess full-scale, region-aware load-shedding capabilities. Moreover, due to its lightweight nature, LIRA introduces very little overhead.
Scalable Access Control in Content-Based Publish-Subscribe Systems

(Georgia Institute of Technology, 2006) Srivatsa, Mudhakar ; Liu, Ling

Content-based publish-subscribe (pub-sub) systems are an emerging paradigm for building a large number of distributed systems. Access control in a pub-sub system refers to secure distribution of events to clients subscribing to those events without revealing their secret attributes to unauthorized subscribers. To provide confidentiality guarantees, the secret attributes in an event are encrypted so that only authorized subscribers can read them. However, in a content-based pub-sub system, every event can potentially have a different set of authorized subscribers. In the worst case, for NS subscribers, there are 2^NS subgroups, and each event can potentially go to a different subgroup. Hence, efficient key management is a big challenge for implementing access control in pub-sub systems. In this paper, we describe efficient and scalable key management algorithms for securely implementing access control rules in pub-sub systems. We ensure that the key management cost is linear in the number of subscriptions and completely independent of the number of subscribers NS. We present a concrete implementation of our proposal on an operational pub-sub system. An experimental evaluation of our prototype shows that our proposal meets the security requirements while maintaining the scalability and performance of the pub-sub system.
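
To give a sense of the worst case mentioned in the abstract above, the short sketch below simply evaluates 2^NS for a few subscriber counts. The function name `subgroup_count` is hypothetical and is not taken from the paper; the sketch only illustrates the arithmetic and does not show the authors' key management algorithms.

```python
def subgroup_count(num_subscribers: int) -> int:
    """Number of distinct subscriber subgroups (all subsets) for NS subscribers."""
    if num_subscribers < 0:
        raise ValueError("num_subscribers must be non-negative")
    return 2 ** num_subscribers


if __name__ == "__main__":
    # Growth of the worst-case subgroup count with the number of subscribers.
    for ns in (10, 20, 30):
        print(f"NS={ns}: {subgroup_count(ns)} possible subgroups")
```

Even at 30 subscribers the count exceeds one billion, which is why a scheme that assigned one key per subgroup would not scale.
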
Adaptive Load Shedding for Windowed Stream Joins

(Georgia Institute of Technology, 2005) Gedik, Bugra ; Wu, Kun-Lung ; Yu, Philip S. ; Liu, Ling

We present an adaptive load shedding approach for windowed stream joins. In contrast to the conventional approach of dropping tuples from the input streams, we explore the concept of selective processing for load shedding, focusing on costly stream joins such as those over set-valued or weighted set-valued attributes. The main idea of our adaptive load shedding approach is two-fold. First, we allow stream tuples to be stored in the windows and shed excessive CPU load by performing the stream join operations, not on the entire set of tuples within the windows, but on a dynamically changing subset of tuples that are highly beneficial. Second, we support such dynamic selective processing through three forms of runtime adaptations: By adaptation to input stream rates, we perform partial processing based load shedding and dynamically determine the fraction of the windows to be processed by comparing the tuple consumption rate of join operation to the incoming stream rates. By adaptation to time correlation between the streams, we dynamically determine the number of basic windows to be used and prioritize the tuples for selective processing, encouraging CPU-limited execution of stream joins in high priority basic windows. By adaptation to join directions, we dynamically determine the most beneficial direction to perform stream joins in order to process more useful tuples under heavy load conditions and boost the utility or number of output tuples produced. Our load shedding framework not only enables us to integrate utility-based load shedding with time correlation-based load shedding, but more importantly, it also allows load shedding to be adaptive to various dynamic stream properties. Inverted indexes are used to further speed up the execution of stream joins based on set-valued attributes. Experiments are conducted to evaluate the effectiveness of our adaptive load shedding approach in terms of output rate and utility.
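
The rate-based adaptation described in the abstract above can be pictured with a minimal sketch. It assumes the simplest possible rule: process a fraction of each window equal to the ratio of the join's tuple consumption rate to the incoming stream rate, capped at 1. The function `window_fraction` and this exact rule are illustrative assumptions, not the adaptation procedure from the paper.

```python
def window_fraction(consumption_rate: float, input_rate: float) -> float:
    """Fraction of each join window to process so the join can keep up.

    Assumption (not from the paper): the fraction is the ratio of how many
    tuples per second the join can consume to how many arrive, capped at 1.
    """
    if consumption_rate < 0 or input_rate < 0:
        raise ValueError("rates must be non-negative")
    if input_rate == 0:
        # Nothing is arriving, so there is no load to shed.
        return 1.0
    return min(1.0, consumption_rate / input_rate)


if __name__ == "__main__":
    # The join consumes 500 tuples/s while 1000 tuples/s arrive.
    print(window_fraction(500.0, 1000.0))
```

A real system would re-evaluate this fraction periodically as the measured rates change.
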
GRUBJOIN: An Adaptive Multi-Way Windowed Stream Join with Time Correlation-Aware CPU Load Shedding

(Georgia Institute of Technology, 2005) Gedik, Bugra ; Wu, Kun-Lung ; Yu, Philip S. ; Liu, Ling

Dropping tuples has been commonly used for load shedding. However, tuple dropping generally is inadequate to shed load for multi-way windowed stream joins. The output rate can be unnecessarily and severely degraded because tuple dropping does not recognize time correlations likely to exist among the streams. This paper introduces GrubJoin: an adaptive multi-way windowed stream join that efficiently performs time correlation-aware CPU load shedding. GrubJoin maximizes the output rate by achieving near-optimal window harvesting within an operator throttling framework, i.e., regulating the fractions of the join windows that are processed by the multi-way join. Window harvesting performs the join using only certain more useful segments of the join windows. Due mainly to the combinatorial explosion of possible multi-way join sequences involving various segments of individual join windows, GrubJoin faces a set of unique challenges, such as determining the optimal window harvesting configuration and learning the time correlations among the streams. To tackle these challenges, we formalize window harvesting as an optimization problem, develop greedy heuristics to determine near-optimal window harvesting configurations and use approximation techniques to capture the time correlations among the streams. Experimental results show that GrubJoin is vastly superior to tuple dropping when time correlations exist among the streams and is as effective as tuple dropping in the absence of time correlations.
Energy-Aware Data Collection in Sensor Networks: A Localized Selective Sampling Approach

(Georgia Institute of Technology, 2005) Gedik, Bugra ; Liu, Ling

One of the most prominent and comprehensive ways of data collection in sensor networks is to periodically extract raw sensor readings. This way of data collection enables complex analysis of data, which may not be possible with in-network aggregation or query processing. However, this flexibility in data analysis comes at the cost of power consumption. In this paper, we introduce selective sampling for energy-efficient periodic data collection in sensor networks. The main idea behind selective sampling is to use a dynamically changing subset of nodes as samplers such that the sensor readings of sampler nodes are directly collected, whereas the values of non-sampler nodes are predicted through the use of probabilistic models that are locally and periodically constructed in an in-network manner. Selective sampling can be effectively used to increase the network lifetime while keeping quality of the collected data high, in scenarios where either the spatial density of the network deployment is superfluous relative to the required spatial resolution for data analysis or certain amount of data quality can be traded off in order to decrease the overall power consumption of the network. Our selective sampling approach consists of three main mechanisms. First, sensing-driven cluster construction is used to create clusters within the network such that nodes with close sensor readings are assigned to the same clusters. Second, correlation-based sampler selection and model derivation is used to determine the sampler nodes and to calculate the parameters of probabilistic models that capture the spatial and temporal correlations among sensor readings. Last, selective data collection and model-based prediction is used to minimize the number of messages used to extract data from the network. 
A unique feature of our selective sampling mechanisms is the use of localized schemes, as opposed to protocols requiring global information, to select and dynamically refine the subset of sensor nodes serving as samplers and the model-based value prediction for non-sampler nodes. Such runtime adaptations create a data collection schedule which is self-optimizing in response to changes in energy levels of nodes and environmental dynamics.
Energy Efficient Exact kNN Search in Wireless Broadcast Environments

(Georgia Institute of Technology, 2004-05-24) Gedik, Bugra ; Singh, Aameek ; Liu, Ling

The advances in wireless communication and decreasing costs of mobile devices have enabled users to access desired information at any time. Coupled with positioning technologies like GPS, this opens up an exciting domain of location-based services, allowing a mobile user to query for objects based on their current position. The main bottlenecks in such infrastructures are the power drain on mobile devices and the limited network bandwidth available. To alleviate these problems, broadcasting spatial information about relevant objects has been widely accepted as an efficient mechanism. An important class of queries for such an infrastructure is the k-nearest neighbor (kNN) queries, in which users are interested in the k closest objects to their position. Most research on kNN queries uses unconventional broadcast indexes and provides only approximate kNN search. In this paper, we describe mechanisms to perform exact kNN search on conventional sequential-access R-trees, and optimize established kNN search algorithms. We also propose a novel use of histograms for guiding the search and derive analytical results on maximum queue size and node access count. In addition, we discuss the effects of different broadcast organizations on search performance and challenge the traditional use of Depth-First (dfs) organization. We also extend our mechanisms to support kNN search with non-spatial constraints. While we demonstrate our ideas using a broadcast index, they are equally applicable to any kind of sequential access medium like tertiary tape storage. We validate our mechanisms through an extensive experimental analysis and present our findings.
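
As a rough illustration of exact kNN search over a sequential-access medium, the sketch below scans a stream of points once and keeps the k closest in a bounded max-heap. It is a plain linear scan, not the R-tree or histogram-guided search from the paper; the function name `knn_sequential` and the 2D tuple representation are assumptions made for this example.

```python
import heapq
from math import hypot
from typing import Iterable, List, Tuple

Point = Tuple[float, float]


def knn_sequential(points: Iterable[Point], query: Point, k: int) -> List[Point]:
    """Exact k nearest neighbors of `query` found in one sequential pass."""
    if k <= 0:
        return []
    # Max-heap emulated with negated distances; holds at most k entries.
    heap: List[Tuple[float, int, Point]] = []
    for i, (x, y) in enumerate(points):
        d = hypot(x - query[0], y - query[1])
        if len(heap) < k:
            heapq.heappush(heap, (-d, i, (x, y)))
        elif d < -heap[0][0]:
            # Closer than the current k-th nearest: replace the farthest one.
            heapq.heapreplace(heap, (-d, i, (x, y)))
    # Return the points ordered from nearest to farthest.
    return [p for _, _, p in sorted(heap, key=lambda e: (-e[0], e[1]))]


if __name__ == "__main__":
    objects = [(0.0, 0.0), (5.0, 5.0), (1.0, 1.0), (2.0, 2.0)]
    print(knn_sequential(objects, (0.0, 0.0), 2))
```

The heap never grows beyond k entries, so memory use is independent of how many objects are broadcast.
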
Reliable End System Multicasting with a Heterogeneous Overlay Network

(Georgia Institute of Technology, 2004-05-03) Zhang, Jianjun ; Liu, Ling ; Pu, Calton ; Ammar, Mostafa H.

This paper presents PeerCast, a reliable and self-configurable peer to peer system for End System Multicast (ESM). Our approach has three unique features compared with existing approaches to application-level multicast systems. First, we propose a capacity-aware overlay construction technique to balance the multicast load among peers with heterogeneous capabilities. Second, we utilize the landmark signature technique to cluster peer nodes of the ESM overlay network, aiming at exploiting the network proximity of end system nodes for efficient multicast group subscription and fast dissemination of information across wide area networks. Third and most importantly, we develop a dynamic passive replication scheme to provide reliable subscription and multicast dissemination of information in an environment of inherently unreliable peers. We also present an analytical model to discuss its fault tolerance properties, and report a set of initial experiments, showing the feasibility and the effectiveness of the proposed approach.
A Customizable k-Anonymity Model for Protecting Location Privacy

(Georgia Institute of Technology, 2004-04-07) Gedik, Bugra ; Liu, Ling

Continued advances in mobile networks and positioning technologies have created a strong market push for location-based services (LBSs). Examples include location-aware emergency services, location-based service advertisement, and location-sensitive billing. One of the big challenges in wide deployment of LBS systems is the privacy-preserving management of location-based data. Without safeguards, extensive deployment of location-based services endangers location privacy of mobile users and exhibits significant vulnerabilities for abuse. In this paper, we describe a customizable k-anonymity model for protecting privacy of location data. Our model has two unique features. First, we provide a customizable framework to support k-anonymity with variable k, allowing a wide range of users to benefit from the location privacy protection with personalized privacy requirements. Second, we design and develop a novel spatio-temporal cloaking algorithm, called CliqueCloak, which provides location k-anonymity for mobile users of an LBS provider. The cloaking algorithm is run by the location protection broker on a trusted server, which anonymizes messages from the mobile nodes by cloaking the location information contained in the messages to reduce or avoid privacy threats before forwarding them to the LBS provider(s). Our model enables each message sent from a mobile node to specify the desired level of anonymity as well as the maximum temporal and spatial tolerances for maintaining the required anonymity. We study the effectiveness of the cloaking algorithm under various conditions using realistic location data synthetically generated using real road maps and traffic volume data. Our experiments show that the location k-anonymity model with multi-dimensional cloaking and tunable k parameter can achieve a high guarantee of k-anonymity and high resilience to location privacy threats without significant performance penalty.
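
The idea of spatial cloaking with a per-message k and a spatial tolerance can be sketched as follows. This is a deliberately simplified stand-in, not CliqueCloak: it replaces a user's exact position with the bounding box of that user and its k-1 nearest neighbors, and refuses to cloak if the box exceeds the tolerance. The function `cloak`, the `max_side` parameter, and the nearest-neighbor grouping are all assumptions for illustration, and temporal tolerance is ignored.

```python
from math import hypot
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def cloak(user: Point, others: List[Point], k: int, max_side: float) -> Optional[Box]:
    """Return a box covering `user` and k-1 other users, or None if not possible."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Pick the k-1 users closest to the requesting user.
    by_distance = sorted(others, key=lambda p: hypot(p[0] - user[0], p[1] - user[1]))
    nearest = by_distance[: k - 1]
    if len(nearest) < k - 1:
        return None  # not enough users to reach the requested anonymity level
    group = [user] + nearest
    xs = [p[0] for p in group]
    ys = [p[1] for p in group]
    box = (min(xs), min(ys), max(xs), max(ys))
    # Enforce the spatial tolerance requested with the message.
    if box[2] - box[0] > max_side or box[3] - box[1] > max_side:
        return None
    return box


if __name__ == "__main__":
    print(cloak((0.0, 0.0), [(1.0, 1.0), (10.0, 10.0), (2.0, 0.0)], k=3, max_side=5.0))
```

A message whose cloak is `None` would have to wait or be dropped in this simplified picture, which mirrors the trade-off between anonymity level and tolerance described above.
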
Agyaat: Providing Mutually Anonymous Services over Structured P2P Networks

(Georgia Institute of Technology, 2004-03-23) Singh, Aameek ; Liu, Ling

In the modern era of ubiquitous computing, privacy is one of the most critical user concerns. To protect their privacy, users typically try to remain anonymous to the service provider. This is especially true for decentralized Peer-to-Peer (P2P) systems, where common users act both as clients and as service providers. Preserving privacy in such cases requires mutual anonymity, which shields the users at both ends. Most unstructured P2P systems like Gnutella and Kazaa provide a certain level of anonymity through the use of a random overlay topology and a flooding-based routing protocol, but suffer from the lack of guaranteed lookup of data. In contrast, most structured P2P systems like Chord are Distributed Hash Table (DHT) based systems and provide guarantees that any stored data item can be found within a bounded number of hops. However, none of the existing DHT systems provide any mutual anonymity. In this paper, we present Agyaat, a decentralized P2P system that has the desired properties of privacy-preserving mutual anonymity and still achieves the performance benefits of scalable and guaranteed lookups. A unique characteristic of its design is its low-cost, yet highly effective approach to support mutual anonymity. Instead of adding explicit anonymity services to the network, Agyaat advocates the utilization of unstructured topologies, referred to as clouds, over structured DHT overlays. Cloud topologies have an important feature of local query termination, which is critical to facilitate mutual anonymity. To overcome the drawbacks of typical Gnutella-like systems, Agyaat introduces a number of novel mechanisms that enhance the scalability and efficiency of routing. Compared with existing pure DHT based systems, Agyaat provides mutual anonymity while ensuring similar routing performance (differing only by constants) in terms of both number of hops and aggregate messaging costs. We validate the Agyaat solution in two steps.
First, we conduct a set of experiments to analyze the system performance and compare it with other popular pure DHT based systems. Second, we perform a thorough security (anonymity) analysis under the passive logging model. We discuss possible privacy compromising attacks and their impact, and propose various defenses to thwart such attacks.