Series
Doctor of Philosophy with a Major in Computer Science

Series Type
Degree Series

Publication Search Results

Now showing 1 - 10 of 45
  • Item
    Accelerating microarchitectural simulation via statistical sampling principles
    (Georgia Institute of Technology, 2012-12-05) Bryan, Paul David
    The design and evaluation of computer systems rely heavily upon simulation. Simulation is also a major bottleneck in the iterative design process. Applications that may be executed natively on physical systems in a matter of minutes may take weeks or months to simulate. As designs incorporate increasingly higher numbers of processor cores, the time required to simulate future systems is expected to become an even greater issue. Simulation exhibits a tradeoff between speed and accuracy. By basing experimental procedures upon known statistical methods, the simulation of systems may be dramatically accelerated while retaining reliable methods to estimate error. This thesis focuses on the acceleration of simulation through statistical processes. The first two techniques discussed in this thesis focus on accelerating single-threaded simulation via cluster sampling. Cluster sampling extracts multiple groups of contiguous population elements to form a sample. This thesis introduces techniques to reduce the sampling and non-sampling bias components, both of which must be reduced for sample measurements to be reliable. Non-sampling bias is reduced through the Reverse State Reconstruction algorithm, which removes ineffectual instructions from the skipped instruction stream between simulated clusters. Sampling bias is reduced via the Single Pass Sampling Regimen Design Process, which guides the user toward selecting representative sampling regimens. Unfortunately, the extension of cluster sampling to multi-threaded architectures is non-trivial and raises many interesting challenges; approaches to overcoming these challenges are discussed. This thesis also introduces thread skew, a metric that quantitatively measures the non-sampling bias associated with divergent thread progressions at the beginning of a sampling unit. Finally, the Barrier Interval Simulation method is presented as a technique to dramatically decrease the simulation times of certain classes of multi-threaded programs. It segments a program into discrete intervals, separated by barriers, which are leveraged to avoid many of the challenges that prevent multi-threaded sampling.
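    The following is a minimal, generic sketch (in Python, with hypothetical names) of the cluster-sampling estimate the abstract describes: contiguous clusters of a long measurement trace are sampled, and the variance of the per-cluster means yields a standard-error estimate. It illustrates the statistical idea only, not the thesis's simulation infrastructure.

    ```python
    import random
    import statistics

    def cluster_sample_estimate(population, cluster_size, num_clusters, seed=0):
        """Estimate the population mean from randomly chosen contiguous clusters.

        `population` is a sequence of per-element measurements (e.g., per-instruction
        cycle counts); the returned standard error is the usual cluster-sampling
        estimate based on the variance of the per-cluster means.
        """
        rng = random.Random(seed)
        starts = [rng.randrange(0, len(population) - cluster_size)
                  for _ in range(num_clusters)]
        cluster_means = [statistics.mean(population[s:s + cluster_size]) for s in starts]
        estimate = statistics.mean(cluster_means)
        std_error = statistics.stdev(cluster_means) / (num_clusters ** 0.5)
        return estimate, std_error

    # Toy usage: a synthetic "cycles per instruction" trace with phase behaviour.
    trace = [1.0] * 50_000 + [2.5] * 50_000
    print(cluster_sample_estimate(trace, cluster_size=1_000, num_clusters=30))
    ```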
  • Item
    End-to-end inference of internet performance problems
    (Georgia Institute of Technology, 2012-11-15) Kanuparthy, Partha V.
    Inference, measurement, and estimation of network path properties are fundamental problems in distributed systems and networking. We consider a specific subclass of these problems that does not require special support from hardware or software, deployment of special devices, or data from within the network. Network inference is a challenging problem, since Internet paths can have complex and heterogeneous configurations. Inference enables end users to understand and troubleshoot their connectivity and verify their service agreements; it has policy implications ranging from network neutrality to broadband performance; and it empowers applications and services to adapt to network paths to improve user quality of experience. In this dissertation we develop end-to-end, user-level methods, tools and services for network inference. Our contributions are as follows. We show that domain knowledge-based methods can be used to infer performance of different types of networks, containing wired and wireless links, and ranging from local area to inter-domain networks. We develop methods to infer network properties: 1. Traffic discrimination (DiffProbe), 2. Traffic shapers and policers (ShaperProbe), and 3. Shared links among multiple paths (Spectral Probing). We develop methods to understand network performance: 1. Diagnosing wireless performance pathologies (WLAN-probe), and 2. Diagnosing wide-area performance pathologies (Pythia). Among our contributions, we have provided ShaperProbe as a public service; it has received over 1.5 million runs from residential and commercial users and is used by thousands of residential broadband users a day to check their service level agreements. The Federal Communications Commission (FCC) recognized DiffProbe and ShaperProbe with the best research award in the Open Internet Apps Challenge in 2011. We have written an open-source performance diagnosis system, Pythia, which is being deployed by network operators such as the US Department of Energy's ESnet in wide-area inter-domain settings. The contributions of this dissertation enable Internet transparency, performance troubleshooting, and improved distributed systems performance.
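    Purely as an illustration of the kind of end-to-end inference the abstract mentions (and not ShaperProbe's actual detection algorithm), the sketch below flags a sustained drop in received throughput of the sort produced when a token-bucket shaper's burst allowance is exhausted; the function and its thresholds are hypothetical.

    ```python
    def detect_level_shift(rates, min_drop_fraction=0.2, min_run=5):
        """Return the index where a sustained throughput drop begins, or None.

        `rates` is a list of received-throughput samples (e.g., Mbps per interval).
        This is a crude change-point heuristic: it flags the first point after which
        the next `min_run` samples all stay below (1 - min_drop_fraction) of the
        mean rate seen so far.
        """
        for i in range(1, len(rates) - min_run):
            baseline = sum(rates[:i]) / i
            window = rates[i:i + min_run]
            if all(r < (1.0 - min_drop_fraction) * baseline for r in window):
                return i
        return None

    # Toy usage: a 10 Mbps burst rate that drops to 4 Mbps once a token bucket empties.
    samples = [10.0] * 20 + [4.0] * 30
    print(detect_level_shift(samples))  # index of the suspected shaping onset
    ```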
  • Item
    Interactive analogical retrieval: practice, theory and technology
    (Georgia Institute of Technology, 2012-08-24) Vattam, Swaroop
    Analogy is ubiquitous in human cognition. One of the important questions related to understanding the situated nature of analogy-making is how people retrieve source analogues via their interactions with external environments. This dissertation studies interactive analogical retrieval in the context of biologically inspired design (BID). BID involves the creative use of analogies to biological systems to develop solutions for complex design problems (e.g., designing a device for acquiring water in desert environments based on the analogous fog-harvesting abilities of the Namibian beetle). Finding the right biological analogues is one of the critical first steps in BID. Designers routinely search online in order to find their biological sources of inspiration. But this task of online bio-inspiration seeking represents an instance of interactive analogical retrieval that is extremely time consuming and challenging to accomplish. This dissertation focuses on understanding and supporting the task of online bio-inspiration seeking. Through a series of field studies, this dissertation uncovered the salient characteristics and challenges of online bio-inspiration seeking. An information-processing model of interactive analogical retrieval was developed in order to explain those challenges and to identify their underlying causes. A set of measures was put forth to ameliorate those challenges by targeting the identified causes. These measures were then implemented in an online information-seeking technology designed specifically to support the task of online bio-inspiration seeking. Finally, the validity of the proposed measures was investigated through a series of experimental studies and a deployment study. The trends are encouraging and suggest that the proposed measures have the potential to change the dynamics of online bio-inspiration seeking in favor of ameliorating the identified challenges.
  • Item
    Dynamic program analysis algorithms to assist parallelization
    (Georgia Institute of Technology, 2012-08-24) Kim, Minjang
    All market-leading processor vendors have started to pursue multicore processors as an alternative to high-frequency single-core processors for better energy and power efficiency. This transition to multicore processors no longer provides programmers the free performance gain once enabled by increased clock frequency. Parallelization of existing serial programs has become the most powerful approach to improving application performance. Not surprisingly, parallel programming is still extremely difficult for many programmers, mainly because thinking in parallel is simply not intuitive for most humans. However, we believe that software tools based on advanced analyses can significantly reduce this parallelization burden. Much active research and many tools exist for already parallelized programs, for example for finding concurrency bugs. Instead, we focus on program analysis algorithms that assist the actual parallelization steps: (1) finding parallelization candidates, (2) understanding the parallelizability and profitability of the candidates, and (3) writing parallel code. A few commercial tools are introduced for these steps. A number of researchers have proposed various methodologies and techniques to assist parallelization, but many weaknesses and limitations still exist. In order to assist the parallelization steps more effectively and efficiently, this dissertation proposes Prospector, which consists of several new and enhanced program analysis algorithms. First, an efficient loop profiling algorithm is implemented. Frequently executed loops are candidates for profitable parallelization, and detailed execution profiling of loops provides a guide for selecting initial parallelization targets. Second, an efficient and rich data-dependence profiling algorithm is presented. Data dependence is the most essential factor that determines parallelizability. Prospector exploits dynamic data-dependence profiling, which is an alternative and complementary approach to traditional static-only analyses. However, even state-of-the-art dynamic dependence analysis algorithms can only successfully profile programs with a small memory footprint. Prospector introduces an efficient data-dependence profiling algorithm that supports large programs and inputs while providing highly detailed profiling information. Third, a new speedup prediction algorithm is proposed. Although loop profiling can give a qualitative estimate of the expected profit, obtaining accurate speedup estimates requires more sophisticated analysis. Prospector introduces a new dynamic emulation method to predict parallel speedups from annotated serial code. Prospector also provides a memory performance model to predict speedup saturation due to increased memory traffic. Compared to the latest related work, Prospector significantly improves both prediction accuracy and coverage. Finally, Prospector provides algorithms that extract hidden parallelism and provide advice on writing parallel code. We present a number of case studies showing how Prospector assists manual parallelization in particular cases, including privatization, reduction, mutexes, and pipelining.
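    As a toy illustration of dynamic data-dependence profiling, which the abstract identifies as the key factor for judging parallelizability (this is not Prospector's algorithm), the sketch below uses a shadow map from addresses to the last writing iteration to detect loop-carried read-after-write dependences; all names are hypothetical.

    ```python
    from collections import defaultdict

    class DependenceProfiler:
        """Toy shadow-memory profiler for loop-carried read-after-write dependences."""

        def __init__(self):
            self.last_writer = {}                 # address -> iteration of last write
            self.carried_deps = defaultdict(set)  # address -> {(src_iter, dst_iter)}

        def write(self, addr, iteration):
            self.last_writer[addr] = iteration

        def read(self, addr, iteration):
            src = self.last_writer.get(addr)
            if src is not None and src != iteration:
                self.carried_deps[addr].add((src, iteration))

    # Toy usage: a[i] = a[i-1] + 1 carries a dependence between consecutive iterations.
    prof = DependenceProfiler()
    a = list(range(8))
    for i in range(1, len(a)):
        prof.read(("a", i - 1), i)   # read of a[i-1]
        a[i] = a[i - 1] + 1
        prof.write(("a", i), i)      # write of a[i]

    print(dict(prof.carried_deps))   # non-empty -> loop is not trivially parallel
    ```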
  • Item
    Visualizing and modeling partial incomplete ranking data
    (Georgia Institute of Technology, 2012-08-23) Sun, Mingxuan
    Analyzing ranking data is an essential component in a wide range of important applications, including web search and recommendation systems. Rankings are difficult to visualize or model due to the computational difficulties associated with the large number of items. Partial or incomplete rankings introduce further difficulties, since approaches that work well for typical types of rankings do not apply generally to all types. While analyzing ranking data has a long history in statistics, the construction of an efficient framework to analyze incomplete ranking data (with or without ties) is currently an open problem. This thesis addresses the problem of scalability for visualizing and modeling partial incomplete rankings. In particular, we propose a distance measure for top-k rankings with the following three properties: (1) it is a metric, (2) it emphasizes top ranks, and (3) it is computationally efficient. Given the distance measure, the data can be projected into a low-dimensional continuous vector space via multi-dimensional scaling (MDS) for easy visualization. We further propose a non-parametric model for estimating distributions of partial incomplete rankings. For the non-parametric estimator, we use a triangular kernel that is a direct analogue of the Euclidean triangular kernel. The computational difficulties for large n are alleviated using combinatorial properties and generating functions associated with symmetric groups. We show that our estimator is computationally efficient for rankings of arbitrary incompleteness and tie structure. Moreover, we propose an efficient learning algorithm to construct a preference elicitation system from partial incomplete rankings, which can be used to address cold-start problems in ranking recommendations. The proposed approaches are examined in experiments with real search engine and movie recommendation data.
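    To give a flavor of a top-ranks-emphasizing distance between top-k lists, as the abstract proposes (the actual measure and its metric and efficiency properties are developed in the thesis, not reproduced here), the sketch below weights discordant item pairs by the best rank involved. The resulting pairwise distance matrix could then be embedded for visualization with MDS, e.g., scikit-learn's MDS with a precomputed dissimilarity matrix.

    ```python
    from itertools import combinations

    def topk_disagreement(r1, r2, weight=lambda rank: 1.0 / (rank + 1)):
        """Weighted pairwise disagreement between two top-k lists (illustrative only).

        `r1` and `r2` are lists of item ids ordered from best to worst. Item pairs
        that the two lists order differently contribute a penalty weighted by the
        best (smallest) rank involved, so mismatches near the top cost more.
        """
        items = set(r1) | set(r2)
        pos1 = {x: r1.index(x) if x in r1 else len(r1) for x in items}
        pos2 = {x: r2.index(x) if x in r2 else len(r2) for x in items}
        dist = 0.0
        for a, b in combinations(sorted(items), 2):
            if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0:   # discordant pair
                dist += weight(min(pos1[a], pos1[b], pos2[a], pos2[b]))
        return dist

    print(topk_disagreement(["a", "b", "c"], ["b", "a", "c"]))  # swap at the top
    print(topk_disagreement(["a", "b", "c"], ["a", "c", "b"]))  # swap lower down
    ```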
  • Item
    Tightening and blending subject to set-theoretic constraints
    (Georgia Institute of Technology, 2012-05-17) Williams, Jason Daniel
    Our work applies techniques for blending and tightening solid shapes represented by sets. We require that the output contain one set and exclude a second set, and then we optimize the boundary separating the two sets. Working within that framework, we present mason, tightening, tight hulls, tight blends, and the medial cover, with details for implementation. Mason uses opening and closing techniques from mathematical morphology to smooth small features. By contrast, tightening uses mean curvature flow to minimize the measure of the boundary separating the opening of the interior of the closed input set from the opening of its complement, guaranteeing a mean curvature bound. The tight hull offers a significant generalization of the convex hull subject to volumetric constraints, introducing developable boundary patches connecting the constraints. Tight blends then use opening to replicate some of the behaviors from tightenings by applying tight hulls. The medial cover provides a means for adjusting the topology of a tight hull or tight blend, and it provides an implementation technique for two-dimensional polygonal inputs. Collectively, we offer applications for boundary estimation, three-dimensional solid design, blending, normal field simplification, and polygonal repair. We consequently establish the value of blending and tightening as tools for solid modeling.
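    The abstract's mason operator builds on morphological opening and closing; the sketch below simply illustrates those two primitives on a discretized 2-D solid using SciPy, and is not an implementation of mason or of the tightening flow.

    ```python
    import numpy as np
    from scipy.ndimage import binary_opening, binary_closing

    # A 2-D "solid" represented as a boolean set: a square blob with a thin spur.
    shape = np.zeros((12, 12), dtype=bool)
    shape[3:9, 3:9] = True   # main blob
    shape[5, 9:11] = True    # one-pixel-wide spur sticking out of the blob

    structure = np.ones((3, 3), dtype=bool)

    # Opening (erosion followed by dilation) removes features thinner than the
    # structuring element, such as the spur; closing (dilation followed by
    # erosion) is its dual and would fill comparably small holes or notches.
    opened = binary_opening(shape, structure)
    smoothed = binary_closing(opened, structure)

    print(int(shape.sum()), int(opened.sum()), int(smoothed.sum()))
    ```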
  • Item
    A distributed kernel summation framework for machine learning and scientific applications
    (Georgia Institute of Technology, 2012-05-11) Lee, Dong Ryeol
    The computational problems I consider in this thesis share the common trait of requiring consideration of pairs (or higher-order tuples) of data points. I focus on the problem of kernel summation operations, which are ubiquitous in many data mining and scientific algorithms. In machine learning, kernel summations appear in popular kernel methods which can model nonlinear structures in data. Kernel methods include many non-parametric methods such as kernel density estimation, kernel regression, Gaussian process regression, kernel PCA, and kernel support vector machines (SVM). In computational physics, kernel summations occur inside the classical N-body problem for simulating the positions of a set of celestial bodies or atoms. This thesis attempts to marry, for the first time, the best relevant techniques in parallel computing, where kernel summations arise in low dimensions, with the best general-dimension algorithms from the machine learning literature. We provide a unified, efficient parallel kernel summation framework that can utilize: (1) various types of deterministic and probabilistic approximations that may be suitable for both low- and high-dimensional problems with a large number of data points; (2) any multi-dimensional binary tree for indexing the data, with both distributed memory (MPI) and shared memory (OpenMP/Intel TBB) parallelism; (3) a dynamic load balancing scheme to adjust work imbalances during the computation. I first summarize my previous research in serial kernel summation algorithms. This work started from Greengard and Rokhlin's earlier work on fast multipole methods for approximating potential sums of many particles. The contributions of this part of the thesis include the following: (1) a reinterpretation of Greengard and Rokhlin's work for the computer science community; (2) the extension of the algorithms to use a larger class of approximation strategies, i.e., probabilistic error bounds via Monte Carlo techniques; (3) the multibody series expansion: the generalization of the theory of fast multipole methods to handle interactions of more than two entities; (4) the first O(N) proof of the batch approximate kernel summation using a notion of intrinsic dimensionality. I then move on to the parallelization of kernel summations and to scaling two other kernel methods, Gaussian process regression (kernel matrix inversion) and kernel PCA (kernel matrix eigendecomposition). The software artifact of this thesis has contributed to an open-source machine learning package called MLPACK, which was first demonstrated at NIPS 2008 and subsequently at the NIPS 2011 Big Learning Workshop. Completing a portion of this thesis involved the use of high performance computing resources at XSEDE (eXtreme Science and Engineering Discovery Environment) and NERSC (National Energy Research Scientific Computing Center).
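    As a small, generic illustration of kernel summation and of the probabilistic (Monte Carlo) approximations the abstract lists among its strategies (this is not the thesis's tree-based, distributed framework), the sketch below compares an exact Gaussian kernel sum with a subsampled, unbiased estimate.

    ```python
    import numpy as np

    def kernel_sum_exact(queries, references, bandwidth):
        """Exact Gaussian kernel summation: for each query q, sum_r exp(-||q-r||^2 / 2h^2)."""
        d2 = ((queries[:, None, :] - references[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2.0 * bandwidth ** 2)).sum(axis=1)

    def kernel_sum_monte_carlo(queries, references, bandwidth, m, seed=0):
        """Unbiased estimate from a random subsample of m reference points, scaled by N/m."""
        rng = np.random.default_rng(seed)
        idx = rng.choice(len(references), size=m, replace=False)
        partial = kernel_sum_exact(queries, references[idx], bandwidth)
        return partial * (len(references) / m)

    rng = np.random.default_rng(1)
    refs = rng.normal(size=(20_000, 3))
    qs = rng.normal(size=(5, 3))

    exact = kernel_sum_exact(qs, refs, bandwidth=0.5)
    approx = kernel_sum_monte_carlo(qs, refs, bandwidth=0.5, m=2_000)
    print(np.max(np.abs(approx - exact) / exact))  # relative error of the estimate
    ```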
  • Item
    Predictive models for online human activities
    (Georgia Institute of Technology, 2012-04-04) Yang, Shuang-Hong
    The availability and scale of user-generated data in online systems raise tremendous challenges and opportunities for the analytic study of human activities. Effective modeling of online human activities is not only fundamental to the understanding of human behavior, but also important to the online industry. This thesis focuses on developing models and algorithms to predict human activities in online systems and to improve the algorithmic design of personalized/socialized systems (e.g., recommendation, advertising, and Web search systems). We are particularly interested in three types of online user activities: decision making, social interactions, and user-generated content. Centered around these activities, the thesis focuses on three challenging topics: 1. Behavior prediction, i.e., predicting users' online decisions. We present Collaborative-Competitive Filtering, a novel game-theoretic framework for predicting users' online decision-making behavior and leveraging that knowledge to optimize the design of online systems (e.g., recommendation systems) with respect to certain strategic goals (e.g., sales revenue, consumption diversity). 2. Social contagion, i.e., modeling the interplay between social interactions and individual decision-making behavior. We establish the joint Friendship-Interest Propagation model and the Behavior-Relation Interplay model, a series of statistical approaches to characterize an individual user's decision-making behavior, the interactions among socially connected users, and the interplay between these two activities. These techniques are demonstrated by applications to social behavior targeting. 3. Content mining, i.e., understanding user-generated content. We propose the Topic-Adapted Latent Dirichlet Allocation model, a probabilistic model for identifying a user's hidden cognitive aspects (e.g., knowledgeability) from the texts created by the user. The model is successfully applied to address the challenge of the "language gap" in medical information retrieval.
  • Item
    Nonnegative matrix and tensor factorizations, least squares problems, and applications
    (Georgia Institute of Technology, 2011-11-14) Kim, Jingu
    Nonnegative matrix factorization (NMF) is a useful dimension reduction method that has been investigated and applied in various areas. NMF is considered for high-dimensional data in which each element has a nonnegative value, and it provides a low-rank approximation formed by factors whose elements are also nonnegative. The nonnegativity constraints imposed on the low-rank factors not only enable natural interpretation but also reveal the hidden structure of the data. Extending the benefits of NMF to multidimensional arrays, nonnegative tensor factorization (NTF) has been shown to be successful in analyzing complicated data sets. Despite this success, NMF and NTF have been actively developed only over the past decade, and algorithmic strategies for computing NMF and NTF have not been fully studied. In this thesis, computational challenges regarding NMF, NTF, and related least squares problems are addressed. First, efficient algorithms for NMF and NTF are investigated based on a connection from the NMF and NTF problems to nonnegativity-constrained least squares (NLS) problems. A key strategy is to observe the typical structure of the NLS problems arising in NMF and NTF computation and to design a fast algorithm that exploits this structure. We propose an accelerated block principal pivoting method to solve the NLS problems, thereby significantly speeding up NMF and NTF computation. Implementation results with synthetic and real-world data sets validate the efficiency of the proposed method. In addition, a theoretical result on the classical active-set method for rank-deficient NLS problems is presented. Although the block principal pivoting method appears generally more efficient than the active-set method for NLS problems, it is not applicable to rank-deficient cases. We show that the active-set method with a proper starting vector can actually solve rank-deficient NLS problems without ever running into rank-deficient least squares problems during its iterations. Going beyond the NLS problems, we show that a block principal pivoting strategy can also be applied to l1-regularized linear regression. The l1-regularized linear regression problem, also known as the Lasso, has been very popular due to its ability to promote sparse solutions. Solving this problem is difficult because the l1-regularization term is not differentiable. A block principal pivoting method and its variant, which overcome a limitation of previous active-set methods, are proposed for this problem with successful experimental results. Finally, a group-sparsity regularization method for NMF is presented. A recent challenge in data analysis for science and engineering is that data are often represented in a structured way. In particular, many data mining tasks have to deal with group-structured prior information, where features or data items are organized into groups. Motivated by the observation that features or data items belonging to a group are expected to share the same sparsity pattern in their latent factor representations, we propose mixed-norm regularization to promote group-level sparsity. Efficient convex optimization methods for dealing with the regularization terms are presented, along with computational comparisons between them. Application examples of the proposed method in factor recovery, semi-supervised clustering, and multilingual text analysis are presented.
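    The abstract's key connection is from NMF to nonnegativity-constrained least squares (NLS) subproblems. The sketch below shows that connection with a plain alternating-NLS loop, using SciPy's classical active-set solver scipy.optimize.nnls per column as a stand-in for the accelerated block principal pivoting solver the thesis actually proposes.

    ```python
    import numpy as np
    from scipy.optimize import nnls

    def nmf_anls(A, k, iters=30, seed=0):
        """Rank-k NMF of a nonnegative matrix A (m x n) by alternating NNLS.

        Each outer iteration solves the two blocks of nonnegativity-constrained
        least squares problems column by column; a block principal pivoting
        solver would replace the per-column active-set `nnls` calls.
        """
        m, n = A.shape
        rng = np.random.default_rng(seed)
        W = rng.random((m, k))
        H = rng.random((k, n))
        for _ in range(iters):
            # Fix W, solve min_{H >= 0} ||A - W H||_F column by column.
            H = np.column_stack([nnls(W, A[:, j])[0] for j in range(n)])
            # Fix H, solve min_{W >= 0} ||A^T - H^T W^T||_F column by column.
            W = np.column_stack([nnls(H.T, A[i, :])[0] for i in range(m)]).T
        return W, H

    A = np.abs(np.random.default_rng(1).normal(size=(40, 25)))
    W, H = nmf_anls(A, k=5)
    print(np.linalg.norm(A - W @ H) / np.linalg.norm(A))  # relative fit error
    ```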
  • Item
    An integrative framework of time-varying affective robotic behavior
    (Georgia Institute of Technology, 2011-04-04) Moshkina, Lilia V.
    As robots become more and more prevalent in our everyday lives, making sure that our interactions with them are natural and satisfactory is of paramount importance. Given the propensity of humans to treat machines as social actors, and the integral role affect plays in human life, providing robots with affective responses is a step towards making our interaction with them more intuitive. To the end of promoting more natural, satisfying and effective human-robot interaction and enhancing robotic behavior in general, an integrative framework of time-varying affective robotic behavior was designed and implemented on a humanoid robot. This psychologically inspired framework (TAME) encompasses four different yet interrelated affective phenomena: personality Traits, affective Attitudes, Moods and Emotions. Traits determine consistent patterns of behavior across situations and environments and are generally time-invariant; attitudes are long-lasting and reflect likes or dislikes towards particular objects, persons, or situations; moods are subtle and relatively short in duration, biasing behavior according to favorable or unfavorable conditions; and emotions provide a fast yet short-lived response to environmental contingencies. The software architecture incorporating the TAME framework was designed as a stand-alone process to promote platform independence and applicability to other domains. In this dissertation, the effectiveness of affective robotic behavior was explored and evaluated in a number of human-robot interaction studies with over 100 participants. In one of these studies, the impact of Negative Mood and the emotion of Fear was assessed in a mock-up search-and-rescue scenario, where the participants found the robot expressing affect more compelling, sincere, convincing and "conscious" than its non-affective counterpart. Another study showed that different robotic personalities are better suited for different tasks: an extraverted robot was found to be more welcoming and fun as a museum robot guide, where an engaging and gregarious demeanor was expected, whereas an introverted robot was rated as more appropriate for a problem-solving task requiring concentration. To conclude, multi-faceted robotic affect can have far-reaching practical benefits for human-robot interaction, from making people feel more welcome where gregariousness is expected, to making robots unobtrusive partners for problem-solving tasks, to saving people's lives in dangerous situations.