Series
Doctor of Philosophy with a Major in Computer Science

Series Type
Degree Series
Description
Associated Organization(s)

Publication Search Results

Now showing 1 - 10 of 68
  • Item
    System Support for Fine-grained Resource Management in Mobile Edge Computing
    (Georgia Institute of Technology, 2024-05-16) Hsu, Ke-Jou
    Multi-access edge computing (MEC) systems, similar to cloud systems, offer advantages such as multi-tenancy, fast delivery, and pay-as-you-go models. However, the limited capacity at each edge site, the collocated workloads’ stringent latency-centric performance requirements, and the heterogeneous nature of the edge present limitations for cloud-native resource management solutions. This thesis demonstrates these limitations and addresses them via new systems support for faster and more cost-effective resource management for MEC. One limitation is the mismatch between the resource requirements of certain edge applications and the resources available at an edge site. To address this, we first develop Couper – systems support for decomposing resource-intensive video analytics applications based on Deep Neural Networks (DNNs) into finer-grained components, allowing resource management to balance the DNN inference load between the edge and the cloud and to improve end-to-end performance. In addition, we demonstrate the importance of careful placement of components across the edge-cloud continuum. Using a Content Delivery Network (CDN) as a concrete example, we show that managing the placement and collocation of components in MEC-CDN can reduce average latency by 75% compared to existing solutions. We generalize the methodology used to establish this observation and develop Anitya – lightweight systems support for capturing cross-component dependencies that enables effective management of componentized microservice-based MEC application deployments. A second limitation is the mismatch between the resource allocation granularity of current MEC platforms and what is needed for emerging MEC workloads. We show that this gap can completely eliminate any expected edge benefits in multi-tenant settings.
    To address this, we develop ShapeShifter – systems support for fine-grained software-level traffic controls that augment the underlying platform capabilities to specialize resource allocations at workload granularity. This prevents hidden congestion problems and provides 4× improvements in application performance. A third limitation is the mismatch between the time granularities at which cloud-native resource management operates and what is needed in MEC. Naive adjustments of cloud-native systems lead to prohibitive resource overheads for resource-limited MEC environments. As part of Colibri – a new observability tool for MEC – we develop new systems support for dynamically controlling and specializing the execution of the control plane functionality needed for resource management, focusing on resource monitoring in this case. The evaluations of the different systems developed as part of this thesis, performed using new experimental testbeds and MEC benchmarks, demonstrate that the new systems support improves the effectiveness of different resource management tasks that span the entire lifecycle of MEC application and service deployments, and results in improvements in end-to-end application performance and infrastructure efficiency.
  • Item
    Audio-Visual Scene-Aware Dialog: A Step Towards Multimodal Conversational Agents
    (Georgia Institute of Technology, 2024-05-02) Alamri, Huda Abdulhadi D Abdulhadi
    Developing conversational agents has long been a goal in artificial intelligence, with extensive applications in human-computer interaction and virtual assistance. In an effort to advance natural conversation with these agents, this dissertation introduces the Audio Visual Scene-Aware Dialog task. This task involves generating coherent responses to questions about a dynamic scene, leveraging video and audio inputs alongside the dialog history. To facilitate benchmarking of this task, the Audio Visual Scene-Aware Dialog (AVSD) dataset is introduced. AVSD is a challenging task in vision-and-language joint representation learning. It has gained significant attention in recent years, alongside other tasks such as text-based video retrieval and video summarization, with notable progress driven by transformer-based language encoders. However, existing approaches often underutilize the visual features, leading to a bias towards textual information. To address this limitation, this dissertation introduces a novel framework for training multimodal networks that integrates a 3D-CNN with transformer-based networks to enhance the extraction of semantic representations from videos and improve performance in both generative and retrieval tasks. Finally, this dissertation leverages recent advancements in ego-centric pretraining and the introduction of ego-centric datasets to enhance multimodal representation learning. It introduces a novel framework, the Hierarchical Multimodal Attention Network (HMAM). HMAM adopts a hierarchical approach to jointly learn audiovisual and textual representations, effectively capturing the sequence of events within a video and improving understanding across modalities.
  • Item
    Robust Learning in High Dimension
    (Georgia Institute of Technology, 2024-04-30) Jia, He
    This thesis focuses on designing efficient and robust algorithms for learning high-dimensional distributions, particularly in the presence of adversarial corruptions. We address existing challenges in robust learning in high dimensions and establish new connections between robust statistics and algorithmic solutions. In particular, we give polynomial-time algorithms for the following two open problems: robust learning of Gaussian Mixture Models (GMMs) and robust learning of affine transformations of hypercubes. For GMMs, we give a polynomial-time algorithm for robustly learning a mixture of $k$ arbitrary Gaussians in $\mathbb{R}^d$, for any fixed $k$, in the presence of a constant fraction of arbitrary corruptions. This resolves the main open problem in several previous works on algorithmic robust statistics, which addressed the special cases of robustly estimating (a) a single Gaussian, (b) a mixture of TV-distance separated Gaussians, and (c) a uniform mixture of two Gaussians. Our main tools are an efficient partial clustering algorithm that relies on the sum-of-squares method (building on previous work), and a novel tensor decomposition algorithm that allows errors in both Frobenius norm and low-rank terms. The second contribution is a polynomial-time algorithm for robustly learning an unknown affine transformation of the standard hypercube from samples, an important and well-studied setting for independent component analysis (ICA). Our algorithm, based on a new method that goes beyond the limitations of the method of moments, achieves asymptotically optimal error in total variation distance. Lastly, we introduce a faster algorithm for the isotropic transformation of convex bodies, whose complexity is directly tied to the KLS (Kannan-Lovász-Simonovits) constant. This serves as a useful tool for high-dimensional learning and volume computation.
  • Item
    Sim2Robot: Training Robots for the Real-World with Imperfect Simulators
    (Georgia Institute of Technology, 2024-04-29) Truong, Joanne
    The goal of Artificial Intelligence is to “construct useful intelligent systems”, such as mobile robots, to assist in our day-to-day lives. For these mobile robotic assistants to be useful in the real world, they must skillfully navigate complex environments (e.g., delivering packages from one building to another). However, training robots in the real world can be slow, dangerous, expensive, and difficult to reproduce. Thus, one paradigm in robot learning is to train robots in simulation (where gathering experience is scalable, safe, cheap, and reproducible) before they are deployed in the real world. However, no simulator is perfect; AI systems learn to “cheat” by exploiting imperfections. How, then, can we train robots in imperfect simulators while ensuring that the learned skills generalize to reality? In this thesis, we argue that simulators need not be perfect to be useful; they don’t need to model everything about the world, only what’s necessary for generalization. We present 1) the Sim2Real Correlation Coefficient for measuring and optimizing performance correlation between simulation and reality, enabling confident evaluation. 2) Bi-directional Domain Adaptation (BDA) and Kinematic-to-Dynamic Transfer (Kin2Dyn), sample-efficient methods for reducing the sim2real gap. BDA and Kin2Dyn improve robot learning and generalization to the real world by utilizing abstracted physics and simple adaptation models learned from small amounts of real-world data. 3) IndoorSim-to-OutdoorReal, an end-to-end learned approach that enables zero-shot visual navigation in out-of-distribution environments. We show that simulators can be used for real-world transfer without having to design and model the deployment scenario a priori. 4) Implicit Map Cross Modal Attention, a vision-and-language navigation model that utilizes structured implicit maps for navigating in an environment over time.
    Structured memory representation and training paradigms enable navigation for robots that occupy the same environment for long periods of time.
  • Item
    Bias in the Eyes of the Beholder: Development of a Bias-Aware Facial Expression Recognition Algorithm for Autonomous Agents
    (Georgia Institute of Technology, 2024-04-28) Bryant, De'aira Gladys
    The field of human-robot interaction (HRI) is advancing rapidly, with autonomous agents becoming more pervasive across various sectors including education, healthcare, hospitality, and more. However, the susceptibility of data-driven perception algorithms to bias raises ethical concerns around fairness, inclusivity, and responsibility. Measuring and mitigating bias in AI systems have emerged as vital challenges facing the computing community. Yet, little attention has been given to its impact on autonomous agents and HRI. This thesis investigates the effects of human, data, and algorithmic bias on HRI through the lens of facial expression recognition (FER). Beginning with an analysis of human bias, we explore normative expectations for expressive autonomous agents when considering factors like robot race, gender, and embodiment. These expectations help inform the design processes needed to develop effective agents. We then delve into data and algorithmic bias in FER algorithms, addressing challenges associated with some populations having scarce data. We contribute improved techniques for modeling facial expression perception and benchmarking FER algorithms. Central to our contributions is the introduction of a bias-aware approach for FER algorithmic development, leveraging self-training semi-supervised learning coupled with random class rebalancing. This novel approach enhances the performance and equity of FER algorithms. Validation experiments, conducted online and in-person, assess the impact of our bias-aware FER algorithm on human perceptions of fairness during interactions with autonomous agents. Our results demonstrate that users interacting with bias-free or bias-aware agents exhibit higher perceptions of fairness and enhanced responsiveness. 
    These findings collectively highlight the importance of addressing bias in AI systems to promote positive interactions between humans and embodied agents, facilitated by bias-aware technology, algorithmic transparency, and AI literacy.
  • Item
    Towards Fine-grained Multi-Attribute Control using Language Models
    (Georgia Institute of Technology, 2024-04-27) Baheti, Ashutosh
    As we increasingly rely on powerful language models, ensuring their safe and effective operation necessitates extensive research in controllable text generation. Existing state-of-the-art language models struggle to generate the most accurate or desired output on the first attempt. Inspired by recent developments in self-correction in large language models and new reinforcement learning methods, we aim to train smaller language models as fine-grained editors that iteratively edit outputs to satisfy threshold constraints over multiple classifier-based attributes. In this thesis, I present a study of the contextual offensive behavior of pretrained large language models and curate a high-quality dataset for toxicity detection. Next, I introduce a novel offline RL algorithm that can utilize arbitrary numeric scores as rewards during training to optimize any user-desired LM behavior by filtering out suboptimal data. Finally, using this offline RL framework, I propose a fine-grained multi-attribute controllability task, where the goal is to guide the language model to generate output sequences that satisfy user-defined threshold-based attribute constraints. The LM can take multiple edits to reach the desired attributes. Experiments on both languages and proteins demonstrate the versatility and effectiveness of our approach.
  • Item
    Designing Collective Action Systems for User Privacy
    (Georgia Institute of Technology, 2024-04-27) Wu, Yuxi
    People feel concerned, angry, and frustrated when subjected to data breaches, surveillance, and other privacy-violating experiences with large institutions. However, they also feel helpless to effect change. Collective action may empower groups of people affected by such experiences to jointly voice their stories of lived harm and demand redress. In this thesis, I show that considering users’ privacy concerns and lived harms on a collective level can empower users by allowing them to (1) understand they are not alone in their experiences; (2) recognize that their harms are significant and measurable; and (3) be equipped with the appropriate tools to regularly speak out about these harms. I do this through a series of studies in which I create a unified collective voice of privacy concerns, interpret the unified voice through existing legal lenses of harm, and imagine formal ways to measure and respond to privacy harms. Reflecting upon my findings from this work, I discuss how the current lack of a collective action framing within the usable privacy and security field has led to the community not addressing multiple long-standing problems, and how my work can inform future directions of research in the field.
  • Item
    “I Can Help with That” - Designing Flexible, Personalized and Proactive Conversational Agents for Older Adults
    (Georgia Institute of Technology, 2024-04-27) Zubatiy, Tamara
    Observing how older adults interact with commercial conversational agents reveals limitations in existing CA systems and contributes to design guidelines for future systems to overcome these limitations. Despite widespread adoption and study of pre-large-language-model conversational agents (CAs), they are still error-prone and frustrating due to a lack of proactivity and personalization over time and to inflexibility of input. Importantly, while they stand to greatly benefit older adults through on-demand support that doesn’t require interacting with a touch screen, limited research has been done to explore how older adults use CAs over time at home. To contribute to this opportunity at the intersection of technology and health, my research combines user insights gleaned from multiple longitudinal deployments of commercial CAs in the homes of older adults, a battery of qualitative and quantitative statistical analyses, and an understanding of optimal training approaches designed specifically for older adults experiencing age-related decline. My research contextualizes the usage patterns and frustrations of older adults within existing literature on CA usage by other populations, reveals clear impacts of cognitive status on CA usage by older adults, and points to the power of training to increase CA interaction rigor. It also highlights three specific limitations in existing commercial CAs and points to tradeoffs that future systems, likely powered in part by large language models, will need to address in order to overcome them. This work is interwoven with a privacy-focused component that includes privacy impact assessments (PIAs) of two existing CA systems and one in development. Each PIA weighs the impact of these three tradeoffs on future CAs and provides actionable recommendations for privacy-preserving future systems that still deliver on the personalized and proactive promise of future CAs.
  • Item
    Investigation of Wellbeing and Situation Awareness in Virtual Reality
    (Georgia Institute of Technology, 2024-04-27) Fereydooni, Nadia
    As our relationship with technology continues to deepen, wellbeing considerations need to be front and center. Failing to prioritize them has real consequences, on both societal and regulatory fronts. While the relationship between wellbeing and common digital technologies is extensively studied, there is a notable gap in research concerning Virtual Reality (VR), despite its known effects on user wellbeing. Although VR devices are designed with safety, productivity, and enjoyment in mind, assuming that these aspects naturally lead to higher wellbeing and happier lives is inadequate. This becomes more urgent as people spend more time in virtual environments, especially with the growing adoption of VR devices in various contexts like home and office settings. Potential harms, intentional or unintentional, cannot be overlooked. These harms include users being distressed because they are not aware of the real world while immersed in the virtual one. Providing users with real-world information in the virtual environment has been proposed as a solution, but the impact of this information on user experience remains unknown. This thesis investigates how providing real-world information affects users' virtual experience and behavior, addressing a significant gap in our understanding within the VR field.
  • Item
    Designing for the in-between: balancing organizational goals with worker wants and needs
    (Georgia Institute of Technology, 2024-04-26) Sheehan, Alyssa
    Rapid advances in computing are transforming the social and human dimensions of frontline work, from back-office process-driven systems to specialized hardware for monitoring and sensing. In blue-collar domains, these types of data-driven systems are augmenting labor in entirely new ways, making it imperative to understand how technology is transforming workplace practices far afield from the office environments historically at the center of HCI and design. For my dissertation, I conducted a cross-site comparative analysis of three completed case studies in blue-collar domains that examined how emerging data-driven technologies are changing the nature of work. In each case study, I applied participatory and human-centered design approaches to develop and deploy worker interventions in the field. Outcomes of my comparative analysis highlight what it means to be a modern blue-collar worker, upending traditional stereotypes of technological resistance and tracing how workers and organizations respond to the production of new types of data resulting from emerging data-driven technologies. The assumption that data is free and available for organizations to leverage is fundamentally redefining an entire class of labor, removing the legitimacy of performing manual work. Frontline workers are being positioned as data producers, upending notions of expertise and shifting task jurisdictions. This aligns with valorized principles of computing that hold that manual labor is no longer necessary or desirable, yet there are whole categories of society where personal and professional meaning are derived from manual labor. It is these assumptions that have led to mismatched technology interventions that continue to perpetuate social injustices, influencing our perception of who and what actions are deemed valuable.
    By articulating the tensions between worker wants and needs and organizational goals, I seek to advocate for workers and create space for new methods and approaches to worker-centered design and data activism in blue-collar domains.