Series
Doctor of Philosophy with a Major in Computer Science

Series Type
Degree Series

Publication Search Results

Now showing 1 - 10 of 16
  • Item
    System Support for Fine-grained Resource Management in Mobile Edge Computing
    (Georgia Institute of Technology, 2024-05-16) Hsu, Ke-Jou
    Multi-access edge computing (MEC) systems, like cloud systems, offer advantages such as multi-tenancy, fast delivery, and pay-as-you-go models. However, the limited capacity at each edge site, the stringent latency-centric performance requirements of collocated workloads, and the heterogeneous nature of the edge limit the applicability of cloud-native resource management solutions. This thesis demonstrates these limitations and addresses them via new systems support for faster and more cost-effective resource management for MEC. One limitation is the mismatch between the resource requirements of certain edge applications and the resources available at an edge site. To address this, we first develop Couper, systems support for decomposing resource-intensive video analytics applications based on Deep Neural Networks (DNNs) into finer-grained components, allowing resource management to balance the DNN inference load between the edge and the cloud and to improve end-to-end performance. In addition, we demonstrate the importance of careful placement of components across the edge-cloud continuum. For a concrete example of a Content Delivery Network (CDN), we show that managing the placement and collocation of components in an MEC-CDN can reduce average latency by 75% compared to existing solutions. We generalize the methodology used to establish this observation and develop Anitya, lightweight systems support for capturing cross-component dependencies that enables effective management of componentized, microservice-based MEC application deployments. A second limitation is the mismatch between the resource allocation granularity of current MEC platforms and what emerging MEC workloads need. We show that this gap can completely eliminate any expected edge benefits in multi-tenant settings. To address this, we develop ShapeShifter, systems support for fine-grained software-level traffic controls that augment the underlying platform capabilities to specialize resource allocations at workload granularity. This prevents hidden congestion problems and provides 4× improvements in application performance. A third limitation is the mismatch between the time granularities at which cloud-native resource management operates and what MEC requires. Naive adjustments of cloud-native systems lead to prohibitive resource overheads for resource-limited MEC environments. As part of Colibri, a new observability tool for MEC, we develop systems support for dynamically controlling and specializing the execution of the control plane functionality needed for resource management, focused here on resource monitoring. The evaluations of the systems developed in this thesis, performed using new experimental testbeds and MEC benchmarks, demonstrate that the new systems support improves the effectiveness of resource management tasks that span the entire lifecycle of MEC application and service deployments, and yields improvements in end-to-end application performance and infrastructure efficiency.
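    For illustration, a minimal sketch of the kind of per-workload, software-level traffic control ShapeShifter exemplifies; this is a generic token-bucket limiter with hypothetical workload names and rates, not the ShapeShifter implementation.

        # Generic per-workload token bucket (illustrative; not ShapeShifter's code).
        import time

        class TokenBucket:
            def __init__(self, rate_bps, burst_bytes):
                self.rate = rate_bps          # sustained allocation, bytes/sec
                self.capacity = burst_bytes   # maximum burst size
                self.tokens = burst_bytes
                self.last = time.monotonic()

            def allow(self, nbytes):
                now = time.monotonic()
                # Refill tokens accrued since the last check, capped at capacity.
                self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
                self.last = now
                if self.tokens >= nbytes:
                    self.tokens -= nbytes
                    return True
                return False  # over allocation: defer or drop to avoid hidden congestion

        # One bucket per collocated workload, sized to its latency budget (hypothetical values).
        limits = {"video-analytics": TokenBucket(50e6, 1e6), "cdn-cache": TokenBucket(20e6, 5e5)}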
  • Item
    Audio-Visual Scene-Aware Dialog: A Step Towards Multimodal Conversational Agents
    (Georgia Institute of Technology, 2024-05-02) Alamri, Huda Abdulhadi D Abdulhadi
    Developing conversational agents has long been a goal in artificial intelligence, with extensive applications in human-computer interaction and virtual assistance. To advance natural conversation with such agents, this dissertation introduces the Audio-Visual Scene-Aware Dialog task: generating coherent responses to questions about a dynamic scene by leveraging video and audio inputs alongside the dialog history. To facilitate benchmarking of this task, the Audio-Visual Scene-Aware Dialog (AVSD) dataset is introduced. AVSD is a challenging task in joint visual-and-language representation learning. It has gained significant attention in recent years, alongside tasks such as text-based video retrieval and video summarization, with notable progress driven by transformer-based language encoders. However, existing approaches often underutilize the visual features, leading to biases toward textual information. To address this limitation, this dissertation introduces a novel framework for training multimodal networks that integrates a 3D-CNN with transformer-based networks to improve the extraction of semantic representations from videos and boost performance in both generative and retrieval tasks. Finally, this dissertation leverages recent advances in egocentric pretraining and newly introduced egocentric datasets to enhance multimodal representation learning. It introduces a novel framework, the Hierarchical Multimodal Attention Network (HMAM), which adopts a hierarchical approach to jointly learning audiovisual and textual representations, effectively capturing the sequence of events within a video and improving understanding across modalities.
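    For illustration, a minimal sketch of scaled dot-product cross-attention, the standard building block from which hierarchical multimodal attention networks such as HMAM are typically assembled; the exact HMAM architecture is defined in the dissertation, not here.

        # Text tokens attend over video-frame features (illustrative sketch).
        import numpy as np

        def cross_attention(queries, keys, values):
            """queries: text tokens (n_q, d); keys/values: frame features (n_kv, d)."""
            d = queries.shape[-1]
            scores = queries @ keys.T / np.sqrt(d)          # token-to-frame similarity
            weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
            weights /= weights.sum(axis=-1, keepdims=True)  # softmax over frames
            return weights @ values                         # text attends to video

        # Toy shapes: 4 dialog tokens attending over 16 frame features of width 64.
        out = cross_attention(np.random.randn(4, 64), np.random.randn(16, 64), np.random.randn(16, 64))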
  • Item
    Robust Learning in High Dimension
    (Georgia Institute of Technology, 2024-04-30) Jia, He
    This thesis focuses on designing efficient and robust algorithms for learning high-dimensional distributions, particularly in the presence of adversarial corruptions. We address existing challenges in high-dimensional robust learning and establish new connections between robust statistics and algorithmic solutions. In particular, we give polynomial-time algorithms for two open problems: robust learning of Gaussian Mixture Models (GMMs) and robust learning of affine transformations of hypercubes. For GMMs, we give a polynomial-time algorithm for robustly learning a mixture of $k$ arbitrary Gaussians in $\mathbb{R}^d$, for any fixed $k$, in the presence of a constant fraction of arbitrary corruptions. This resolves the main open problem in several previous works on algorithmic robust statistics, which addressed the special cases of robustly estimating (a) a single Gaussian, (b) a mixture of TV-distance-separated Gaussians, and (c) a uniform mixture of two Gaussians. Our main tools are an efficient partial clustering algorithm that relies on the sum-of-squares method (building on previous work) and a novel tensor decomposition algorithm that tolerates errors in both the Frobenius-norm and low-rank terms. The second contribution is a polynomial-time algorithm for robustly learning an unknown affine transformation of the standard hypercube from samples, an important and well-studied setting for independent component analysis (ICA). Our algorithm, based on a new method that goes beyond the limitations of the method of moments, achieves asymptotically optimal error in total variation distance. Lastly, we introduce a faster algorithm for the isotropic transformation of convex bodies, whose complexity is directly tied to the KLS (Kannan-Lovász-Simonovits) constant. This serves as a useful tool for high-dimensional learning and volume computation.
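    For context, the standard $\varepsilon$-corruption model in which such robustness guarantees are usually stated (the dissertation's precise formulation may differ): samples are drawn from the mixture
        \[ p(x) = \sum_{i=1}^{k} w_i\,\mathcal{N}(x;\,\mu_i,\Sigma_i), \]
    an adversary then arbitrarily replaces an $\varepsilon$ fraction of them, and the algorithm must output a hypothesis $\hat{p}$ with $d_{\mathrm{TV}}(p,\hat{p}) \le f(\varepsilon)$, running in time polynomial in $d$ and the sample size for any fixed $k$.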
  • Item
    Bias in the Eyes of the Beholder: Development of a Bias-Aware Facial Expression Recognition Algorithm for Autonomous Agents
    (Georgia Institute of Technology, 2024-04-28) Bryant, De'aira Gladys
    The field of human-robot interaction (HRI) is advancing rapidly, with autonomous agents becoming more pervasive across various sectors including education, healthcare, hospitality, and more. However, the susceptibility of data-driven perception algorithms to bias raises ethical concerns around fairness, inclusivity, and responsibility. Measuring and mitigating bias in AI systems have emerged as vital challenges facing the computing community, yet little attention has been given to the impact of bias on autonomous agents and HRI. This thesis investigates the effects of human, data, and algorithmic bias on HRI through the lens of facial expression recognition (FER). Beginning with an analysis of human bias, we explore normative expectations for expressive autonomous agents when considering factors like robot race, gender, and embodiment. These expectations help inform the design processes needed to develop effective agents. We then delve into data and algorithmic bias in FER algorithms, addressing the challenges that arise when data for some populations is scarce. We contribute improved techniques for modeling facial expression perception and benchmarking FER algorithms. Central to our contributions is a bias-aware approach to FER algorithm development that couples self-training semi-supervised learning with random class rebalancing. This novel approach enhances both the performance and the equity of FER algorithms. Validation experiments, conducted online and in person, assess the impact of our bias-aware FER algorithm on human perceptions of fairness during interactions with autonomous agents. Our results demonstrate that users interacting with bias-free or bias-aware agents report higher perceptions of fairness and exhibit enhanced responsiveness. These findings collectively highlight the importance of addressing bias in AI systems to promote positive interactions between humans and embodied agents, facilitated by bias-aware technology, algorithmic transparency, and AI literacy.
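    For illustration, a minimal sketch of self-training with random class rebalancing, the general technique the bias-aware approach builds on; the function names, thresholds, and scikit-learn-style classifier interface are assumptions, not the dissertation's exact procedure.

        import numpy as np

        def self_train(model, X_lab, y_lab, X_unlab, n_rounds=5, threshold=0.9):
            """Assumes NumPy arrays and a classifier with fit/predict_proba."""
            X, y = X_lab.copy(), y_lab.copy()
            for _ in range(n_rounds):
                model.fit(X, y)
                probs = model.predict_proba(X_unlab)
                conf, pseudo = probs.max(axis=1), probs.argmax(axis=1)
                keep = conf >= threshold                  # keep confident pseudo-labels only
                X_new, y_new = X_unlab[keep], pseudo[keep]
                if y_new.size == 0:
                    continue
                # Random rebalancing: subsample every class to the rarest class's size,
                # so well-represented groups cannot dominate the pseudo-labeled set.
                counts = np.bincount(y_new, minlength=probs.shape[1])
                n_min = counts[counts > 0].min()
                idx = np.concatenate([
                    np.random.choice(np.flatnonzero(y_new == c), n_min, replace=False)
                    for c in np.flatnonzero(counts)])
                X, y = np.vstack([X, X_new[idx]]), np.concatenate([y, y_new[idx]])
            return model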
  • Item
    Leveraging machine learning for enhancing code performance and programming productivity
    (Georgia Institute of Technology, 2024-04-25) Ye, Fangke
    As hardware performance continues to improve with increasing hardware complexity and diversification, software struggles to keep up and fully realize these performance gains. Only a handful of expert programmers can harness the full potential of modern hardware using hardware-exposed low-level programming primitives. Meanwhile, the widespread adoption of high-level, dynamically typed programming languages like Python and JavaScript provides high productivity but suffers from low performance due to the lack of the static type information necessary for compiler optimizations. It is therefore increasingly difficult to develop high-performance programs that exploit the potential performance of evolving hardware while maintaining high programming productivity for mainstream developers. This thesis proposes the use of machine learning to enhance both programming productivity and code performance. First, we present a neural-network-based system that computes code-semantics similarity in C/C++ code, with the goal of identifying semantically equivalent high-performance code for a given low-performance input; the approach incorporates a context-aware semantics structure and an extensible neural code similarity scoring algorithm. Then, we show how a graph-based deep learning type inference method can infer types in JavaScript to improve productivity; our approach employs multiple graph neural network models and a novel type flow graph representation to infer types in dynamically typed languages without manual annotations. Finally, we demonstrate a new approach to concrete type inference for Python programs, enabling ahead-of-time code optimization for dynamically typed languages by combining machine learning and SMT solving without requiring programmers to provide any type annotations.
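    For illustration, a toy fixed-point propagation over a type flow graph, the classic dataflow skeleton that both the learned and the SMT-based inference refine; the graph encoding here is hypothetical and far simpler than the dissertation's representation.

        def propagate(edges, seeds):
            """edges: var -> set of vars it flows into; seeds: var -> set of known types."""
            types = {v: set(ts) for v, ts in seeds.items()}
            work = list(seeds)
            while work:
                v = work.pop()
                for w in edges.get(v, ()):
                    before = len(types.setdefault(w, set()))
                    types[w] |= types.get(v, set())
                    if len(types[w]) != before:  # revisit only if something changed
                        work.append(w)
            return types

        # x = 1; y = x; z = y + x  =>  'int' flows to y and z.
        print(propagate({"x": {"y", "z"}, "y": {"z"}}, {"x": {"int"}}))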
  • Item
    Human-Aware Artificial Intelligence Procedural Content Generation
    (Georgia Institute of Technology, 2024-04-25) Lin, Zhiyu
    Although recent advances in Machine Learning (ML)-based Artificial Intelligence (AI) generative models have enabled a new generation of computational creativity capabilities, many of these models are AI-centric: human creators without an in-depth understanding of the models struggle to communicate effectively with the systems and to bring both their own expertise and the models' capabilities to bear. My research focuses on Human-Aware Artificial Intelligence Procedural Content Generation (PCG), which centers on creator-aware ways of carrying out PCG tasks: enabling richer information exchange between a human creator and the AI, and allowing the AI agent to adapt to a specific human creator while collaborating on the fly. In this dissertation, I begin with a discussion of what computational creativity means for the human-AI collaborative partnership, illustrating the diversity of co-creative systems and sketching out the fundamentals of my work. I then present case studies of AI PCG systems that use both high-level and fine-grained control knobs designed with awareness of the human creative process. Building on these studies, I turn the spotlight to Creative-Wand, the toolbox I developed to explore the design space of interactions for Mixed-Initiative Co-Creative (MI-CC) systems, and to the benefits of MI-CC systems that cover larger portions of that design space. In light of these findings, I demonstrate that human-in-the-loop Reinforcement Learning (RL) can make MI-CC collaborative systems human-aware, going beyond controlled generation to learning collaborative delegation and improving the overall experience.
  • Item
    Adaptive and Intelligent Battery-free Computing Systems: Platforms, Runtime Systems, and Tools
    (Georgia Institute of Technology, 2024-04-17) Bakar, Abu
    Energy-harvesting, battery-free devices enable deployment in new applications with their promise of zero maintenance and long lifetimes. A core challenge for these devices is maintaining usefulness despite varying energy availability, which causes power failures, loss of progress, and inconsistent execution. While prior research has enabled basic operability through power interruptions, performance optimization has often been overlooked, especially for emerging data-intensive Machine Learning applications that require on-device processing. This dissertation fills this gap by reimagining different components of the battery-free system stack. It introduces: i) runtime systems that equip batteryless applications with adaptability, ensuring they can operate effectively with higher throughput under varying energy harvesting conditions; ii) reconfigurable and heterogeneous hardware platforms that enable the use of battery-free systems for modern compute-intensive, inference-based applications; iii) novel ML algorithms for energy-efficient on-device inference; and iv) user-facing tools and interfaces that streamline the development of battery-free applications. In particular, this dissertation presents three systems created to imbue battery-free applications with adaptability and intelligence through energy-efficient software and novel hardware designs. The first, REHASH, is a hardware-independent runtime system that uses lightweight signals and heuristics to capture changes in energy harvesting conditions and trigger application-level adaptation, achieving higher throughput in low-energy scenarios. The second, Protean, is an energy-efficient, heterogeneous platform for adaptive and hardware-accelerated battery-free computing; it is an end-to-end system that leverages recent advances in heterogeneous computing architectures to enable the development of modern, data-intensive, inference-based applications. The last, Lite-TM, is a novel framework that enables the deployment of the Tsetlin Machine, a logic-based inference engine that serves as an alternative to Deep Neural Networks, on intermittently powered systems. Lite-TM employs advanced encoding schemes to compress TM models and incorporates three core techniques aimed at reducing the memory footprint of trained TM models, accelerating model execution, and dynamically adjusting model complexity based on available energy. These systems and their companion tools unlock many new applications and empower developers to quickly design, debug, and deploy sustainable, energy-efficient, adaptive, and intelligent battery-free applications.
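    For illustration, a minimal sketch of Tsetlin Machine inference, the logic-based engine that Lite-TM deploys on intermittently powered hardware; the toy clauses are hypothetical and training is omitted.

        def clause_output(clause, x):
            """clause: list of (feature_index, negated) literals; x: list of 0/1 inputs."""
            return int(all((1 - x[i]) if neg else x[i] for i, neg in clause))

        def class_score(pos_clauses, neg_clauses, x):
            # Clauses voting for the class minus clauses voting against it.
            return (sum(clause_output(c, x) for c in pos_clauses)
                    - sum(clause_output(c, x) for c in neg_clauses))

        # Toy model: "x0 AND NOT x1" votes for the class, "x1" votes against.
        print(class_score([[(0, False), (1, True)]], [[(1, False)]], [1, 0]))  # -> 1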
  • Item
    High-Level Compiler Optimizations for Python Programs
    (Georgia Institute of Technology, 2024-04-15) Zhou, Tong
    As Python becomes the de facto high-level programming language for many data analytics and scientific computing domains, it becomes increasingly critical to build optimizing compilers that can generate efficient sequential and parallel code from Python programs to keep up with the insatiable demand for performance in these domains. Programs written in high-level languages like Python often make extensive use of arrays as a core data type and of mathematical functions applied to those arrays, in conjunction with general loops and element-level array accesses. Such a programming style poses both challenges and opportunities for optimizing compilers. We recognize that current compilers are limited in their ability to make effective use of high-level operator and loop semantics to generate efficient code for modern parallel architectures. This dissertation presents three pieces of work demonstrating that compilers that leverage high-level operator and loop semantics can deliver improved performance for Python programs on CPUs and GPUs relative to past work. On the CPU front, we present Intrepydd, a Python-to-C++ compiler that compiles a broad class of Python language constructs and NumPy array operators to sequential and parallel C++ code on CPUs. On the GPU front, we present APPy (Annotated Parallelism for Python), which enables users to parallelize generic Python loops and tensor expressions for execution on GPUs simply by adding compiler directives (annotations) to Python code. Finally, for programs consisting of sparse tensor operators, we introduce ReACT, a set of code generation techniques that achieve greater redundancy elimination than the state of the art.
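    For illustration, the two programming styles these compilers must handle: the element-level loop and the whole-array operator expression below compute the same result, and recognizing that equivalence is what lets a compiler emit efficient parallel code for either form. (Plain NumPy is shown; the Intrepydd and APPy input syntaxes are documented in the dissertation, not reproduced here.)

        import numpy as np

        a = np.random.rand(100_000)
        b = np.random.rand(100_000)

        # Loop style: general control flow with element-level array accesses.
        out = np.empty_like(a)
        for i in range(len(a)):
            out[i] = a[i] * 2.0 + b[i]

        # Operator style: high-level array semantics a compiler can map to parallel code.
        out2 = a * 2.0 + b
        assert np.allclose(out, out2)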
  • Item
    High-Precision Ranging Matters: Uncovering the Potential of Ultra-Wideband Radios in Real-World Applications
    (Georgia Institute of Technology, 2024-04-15) Cao, Yifeng
    Ultra-wideband (UWB) radio is an emerging wireless technology characterized by an extremely large bandwidth (> 500 MHz). Such a wide band enables UWB to perform ranging with decimeter-level accuracy, making it a superior option for accurate localization. The recent incorporation of UWB in mobile devices like iPhones, Samsung smartphones, and AirTags has demonstrated its feasibility for medium-range positioning. However, we believe the potential of UWB remains under-explored even in today's market, for three reasons. First, accurate range measurement is an important modality in many sensing applications beyond localization, including physical distancing, human action recognition, and autonomous parking. Merely using UWB for object positioning ignores the vast possibilities of applying UWB in the general mobile computing field. Second, most current uses of UWB focus on the baseband; the potential of UWB's carrier wave, and particularly its phase, has not been exploited. Third, as a radio technology operating in an independent band, UWB does not interfere with other widely used wireless technologies such as Wi-Fi and Bluetooth. The objective of this dissertation is to explore and extend the possibilities of UWB across various applications. Our exploration demonstrates that introducing UWB can both improve performance on problems traditionally tackled by other technologies and open the door to new applications. This dissertation proposes UWB applications in four areas. In the first work, 6Fit-a-Part, we use UWB to achieve accurate, real-time physical distancing with a custom wearable device. Specifically, we design a one-to-all ranging protocol that accurately estimates the distance to neighboring devices and promptly warns the user if the distance falls below an established threshold. This work still employs UWB's fundamental ranging capabilities, but targets a more challenging dynamic, multi-user ranging scenario. Our second work, ITrackU, goes one step further to answer the question of whether UWB can be used for high-precision tracking. In this work, we present a system that enables millimeter-level tracking of a pen-like instrument across a large surface by fusing UWB with inertial sensors (IMUs). The core idea that permits mm-level tracking is to use the UWB carrier phase captured from multiple vantage points. Fusing the phase measurements with the IMU enables continuous tracking despite wireless occlusions. Continuing the UWB-IMU fusion approach, the third work, ViSig, extends the use of UWB to human action recognition with wearable devices. Here, UWB primarily provides inter-appendage distance measurements while the IMU captures the angles (or orientation) of different body segments. The results show that by deploying only a small number of sensors (six) on the body, we can achieve >90% accuracy in interpreting various body signals, including cricket umpire signals, baseball umpire signals, crane signals, flag semaphore, and football official signals. Finally, in the fourth work, we show that UWB can even be applied to online authentication, where the proximity of a token to the login device is an important consideration. We present a UWB-based two-factor authentication (2FA) platform, UWB-Auth, designed as a carriable or wearable device, which defeats various social engineering attacks, including phishing, 2FA-fatigue, and co-located attacks, while keeping authentication fast. The evaluation with our custom prototype shows that UWB-Auth completes the whole authentication process in 4 seconds and completely rejects malicious requests when the adversary is 20 cm and 10 degrees outside a small valid physical area near the login device. Overall, we have significantly expanded the application space of UWB beyond the traditional indoor localization and lost-and-found use cases. In doing so, we have made algorithmic and architectural innovations that we expect to become cornerstones of future research on UWB.
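    For reference, a minimal sketch of the single-sided two-way ranging relation that underlies UWB's decimeter-level accuracy (the protocols above refine it in system-specific ways):
        \[ \hat{d} = \frac{c}{2}\,\bigl(t_{\mathrm{round}} - t_{\mathrm{reply}}\bigr), \]
    where $t_{\mathrm{round}}$ is the initiator's transmit-to-receive interval, $t_{\mathrm{reply}}$ is the responder's turnaround time, and $c$ is the speed of light. Since $c \approx 0.3$ m/ns, a 1 ns timing error maps to roughly 15 cm of range error, which is why UWB's nanosecond-scale pulses make decimeter accuracy attainable.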
  • Item
    User-centered Programmatic Data Labeling
    (Georgia Institute of Technology, 2024-04-02) Wu, Renzhi
    The lack of labeled training data is a major challenge impeding the practical application of machine learning (ML) techniques. Therefore, ML practitioners have increasingly turned to programmatic supervision methods, in which a larger volume of programmatically generated, but often noisier, labeled examples is used in lieu of hand-labeled examples. In this paradigm, supervision sources are expressed as labeling functions (LFs), and a label model aggregates the output of multiple LFs to produce training labels. However, the current process of developing LF relies on the expertise of the user and can be inaccessible for non-experts, particularly when dealing with video data. In addition, existing label models require hyperparameters and dataset-specific training for each dataset and can yield non-deterministic results, further complicating the process for non-expert users. This dissertation aims to improve the usability of programmatic data labeling through a three-part research approach. First, I explore how to improve usability by specializing programmatic data labeling to the task at hand. I examine a specific task (entity matching) as a case study to develop a specialized Integrated Development Environment, facilitating the development, debugging, aggregation, and management of LFs. Second, I extend the labeling function interface by introducing a visual interface, allowing users to create LFs for video data intuitively without any coding. Specifically, I propose a visual query language for retrieving video clips across datasets, enabling non-expert users to easily develop LFs with mouse drag-and-drop. Third, to obviate user involvement in the label model, I present a hyper label model that requires neither hyperparameters nor dataset-specific training, while producing deterministic results with superior accuracy and efficiency. The proposed method also offers the first analytical optimal solution to the problem.