Doctor of Philosophy with a Major in Computer Science

Permanent Link

https://hdl.handle.net/1853/72276

Series Type

Degree Series

Associated Organization(s)

Organizational Unit

School of Computer Science

Organizational Unit

School of Interactive Computing

Organizational Unit

School of Cybersecurity and Privacy

Organizational Unit

School of Computational Science and Engineering

Publication Search Results

Sim2Robot: Training Robots for the Real-World with Imperfect Simulators

(Georgia Institute of Technology, 2024-04-29) Truong, Joanne

The goal of Artificial Intelligence is to “construct useful intelligent systems”, such as mobile robots, to assist in our day-to-day lives. For these mobile robotic assistants to be useful in the real world, they must skillfully navigate complex environments (e.g., delivering packages from one building to another). However, training robots in the real world can be slow, dangerous, expensive, and difficult to reproduce. Thus, one paradigm in robot learning is to leverage simulation for training robots (where gathering experience is scalable, safe, cheap, and reproducible) before deploying them in the real world. However, no simulator is perfect; AI systems learn to “cheat” by exploiting imperfections. How, then, can we train robots in imperfect simulators while ensuring that the learned skills generalize to reality? In this thesis, we argue that simulators need not be perfect to be useful; they don’t need to model everything about the world, only what’s necessary for generalization. We present 1) the Sim2Real Correlation Coefficient for measuring and optimizing performance correlation between simulation and reality, enabling confident evaluation; 2) Bi-directional Domain Adaptation (BDA) and Kinematic-to-Dynamic Transfer (Kin2Dyn), sample-efficient methods for reducing the sim2real gap, which improve robot learning and generalization to the real world by utilizing abstracted physics and simple adaptation models learned from small amounts of real-world data; 3) IndoorSim-to-OutdoorReal, an end-to-end learned approach that enables zero-shot visual navigation in out-of-distribution environments, showing that simulators can be used for real-world transfer without having to design and model the deployment scenario a priori; and 4) Implicit Map Cross Modal Attention, a vision and language navigation model that utilizes structured implicit maps for navigating in an environment over time. Structured memory representation and training paradigms enable navigation for robots that occupy the same environment for long periods of time.

Towards Fine-grained Multi-Attribute Control using Language Models

(Georgia Institute of Technology, 2024-04-27) Baheti, Ashutosh

As we increasingly rely on powerful language models, ensuring their safe and effective operation necessitates extensive research in controllable text generation. Existing state-of-the-art language models struggle to generate the most accurate or desired output on the first attempt. Inspired by recent developments in self-correction in large language models and new reinforcement learning methods, we aim to train smaller language models as fine-grained editors that iteratively edit outputs to satisfy threshold constraints over multiple classifier-based attributes. In this thesis, I first present a study of the contextual offensive behavior of pretrained large language models and curate a high-quality dataset for toxicity detection. Next, I introduce a novel offline RL algorithm that can utilize arbitrary numeric scores as rewards during training to optimize any user-desired LM behavior by filtering out suboptimal data. Finally, using this offline RL framework, I propose a fine-grained multi-attribute controllability task, where the goal is to guide the language model to generate output sequences that satisfy user-defined threshold-based attribute constraints. The LM can take multiple edits to reach the desired attributes. Experiments on both languages and proteins demonstrate the versatility and effectiveness of our approach.

Designing Collective Action Systems for User Privacy

(Georgia Institute of Technology, 2024-04-27) Wu, Yuxi

People feel concerned, angry, and frustrated when subjected to data breaches, surveillance, and other privacy-violating experiences with large institutions. However, they also feel helpless to effect change. Collective action may empower groups of people affected by such experiences to jointly voice their stories of lived harm and demand redress. In this thesis, I show that considering users’ privacy concerns and lived harms on a collective level can empower users by allowing them to (1) understand they are not alone in their experiences; (2) recognize that their harms are significant and measurable; and (3) be equipped with the appropriate tools to regularly speak out about these harms. I do this through a body of work in which I create a unified collective voice of privacy concerns, interpret the unified voice through existing legal lenses of harm, and imagine formal ways to measure and respond to privacy harms. Reflecting upon my findings from this work, I discuss how the current lack of a collective action framing within the usable privacy and security field has led to the community not addressing multiple long-standing problems, and how my work can inform future directions of research in the field.

“I Can Help with That” - Designing Flexible, Personalized and Proactive Conversational Agents for Older Adults

(Georgia Institute of Technology, 2024-04-27) Zubatiy, Tamara

Observing how older adults interact with commercial conversational agents (CAs) reveals limitations in existing CA systems and contributes to design guidelines for future systems to overcome these limitations. Despite widespread adoption and study of pre-large-language-model CAs, they are still error-prone and frustrating due to a lack of proactivity, limited personalization over time, and inflexibility of input. Importantly, while they stand to greatly benefit older adults through on-demand support that doesn’t require interacting with a touch screen, limited research has explored how older adults use CAs over time at home. To contribute to this opportunity at the intersection of technology and health, my research combines user insights gleaned from multiple longitudinal deployments of commercial CAs into the homes of older adults, a battery of qualitative and quantitative statistical analyses, and an understanding of optimal training approaches designed specifically for older adults experiencing age-related decline. My research contextualizes the usage patterns and frustrations of older adults within existing literature on CA usage by other populations, reveals clear impacts of cognitive status on CA usage by older adults, and points to the power of training to increase CA interaction rigor. It also highlights three specific limitations in existing commercial CAs and points to tradeoffs that future systems, likely powered in part by large language models, will need to address in order to overcome them. This work is interwoven with a privacy-focused component including privacy impact assessments (PIAs) of two existing and one developing CA system. Each PIA weighs the impact of these three tradeoffs on future CAs and provides actionable recommendations for privacy-preserving future systems that still deliver on the personalized and proactive promise of future CAs.

Investigation of Wellbeing and Situation Awareness in Virtual Reality

(Georgia Institute of Technology, 2024-04-27) Fereydooni, Nadia

As our relationship with technology continues to deepen, wellbeing considerations need to be front and center. Failure to do so has real consequences on both societal and regulatory fronts. While the relationship between wellbeing and common digital technologies is extensively studied, there's a notable gap in research concerning Virtual Reality (VR), despite its known effects on user wellbeing. Although VR devices are designed with safety, productivity, and enjoyment in mind, assuming that these aspects naturally lead to higher wellbeing and happier lives is inadequate. This becomes more urgent as people spend more time in virtual environments, especially with the growing adoption of VR devices in various contexts like home and office settings. However, potential harms, intentional or unintentional, cannot be overlooked. These harms include users being distressed due to not being aware of the real world while immersed in the virtual. Providing users with real-world information in the virtual environment is proposed as a solution, but the impact of this information on user experience remains unknown. This thesis investigates how providing real-world information affects users' virtual experience and behavior, addressing a significant gap in our understanding within the VR field.

Designing for the in-between: balancing organizational goals with worker wants and needs

(Georgia Institute of Technology, 2024-04-26) Sheehan, Alyssa

Rapid advances in computing are transforming the social and human dimensions of frontline work, from back-office process-driven systems to specialized hardware for monitoring and sensing. In blue-collar domains, these types of data-driven systems are augmenting labor in entirely new ways, making it imperative to understand how technology is transforming workplace practices far afield from the office environments historically at the center of HCI and design. For my dissertation, I conducted a cross-site comparative analysis of three completed case studies in blue-collar domains that examined how emerging data-driven technologies are changing the nature of work. In each case study, I applied participatory and human-centered design approaches to develop and deploy worker interventions in the field. Outcomes of my comparative analysis highlight what it means to be a modern blue-collar worker, upending traditional stereotypes of technological resistance and tracing how workers and organizations respond to the production of new types of data resulting from emerging data-driven technologies. The assumption that data is free and available for organizations to leverage is fundamentally redefining an entire class of labor, removing the legitimacy of performing manual work. Frontline workers are being positioned as data producers, upending notions of expertise and shifting task jurisdictions. This aligns with valorized principles of computing that hold manual labor to be no longer necessary or desirable, yet there are whole categories of society where personal and professional meaning are derived from manual labor. It is these assumptions that have led to mismatched technology interventions that continue to perpetuate social injustices, influencing our perception of who and what actions are deemed valuable. By articulating the tensions between worker wants and needs and organizational goals, I seek to advocate for workers and create space for new methods and approaches to worker-centered design and data activism in blue-collar domains.

Less is More: Accelerating Vision by Eliminating Redundancy

(Georgia Institute of Technology, 2024-04-24) Bolya, Daniel

The key to modern machine learning is scale. With more data, bigger models, and more compute, we as a community have found that problems once deemed impossible for a computer to solve have rapidly become attainable, with many even becoming easy with today's techniques. But as the scale of modern machine learning has ballooned, so too has its cost. Large transformer models, for instance, can require multiple hundreds of GPUs to train effectively and can be similarly unwieldy to deploy. In this dissertation, I aim to reduce those costs. Specifically, this work focuses on Vision Transformers (ViTs), which have been the dominant driving force in scaling machine learning for computer vision. Over the course of this dissertation, I show that these ViTs perform redundant computation, and that by exploiting these redundancies, we can greatly increase the efficiency of these systems, both during training and inference. In Part 1, I show that we can reduce the amount of spatial computation required by these transformers without losing performance. In Part 2, I show that certain architectural components are redundant and can be removed or greatly simplified. In Part 3, I show how we can exploit redundant features within models to speed them up and to circumvent training. Finally, in Part 4, I show that these speed-ups can compound on each other, resulting in a much faster model.

Towards Reliable Computer Vision Systems

(Georgia Institute of Technology, 2024-02-28) Prabhu, Viraj Uday

The real world has infinite visual variation – across viewpoints, time, space, and curation. As deep visual models become ubiquitous in high-stakes applications, their ability to generalize across such variation becomes increasingly important. Such generalization will alleviate the need to label a large corpus for every new deployment, which may be infeasible due to data volume (e.g. autonomous driving) or labeling cost (e.g. medical diagnosis). Further, it is necessary to overcome the natural spatiotemporal distribution shifts that a deployed model will invariably face (e.g. changing geographies and seasons). Finally, such generalization will unlock the possibility of knowledge transfer from inexpensive sources of data (e.g. transferring models trained in simulation to reality). In this thesis, I will present opportunities to improve such generalization at different stages of the ML lifecycle. First, I will discuss proactive strategies for training robust models by leveraging simulation to augment the long tail of real training data. Next, I will present reactive strategies to recover from unforeseen distribution shifts via self-supervised domain adaptation. Finally, I will present a framework to stress-test the robustness of vision models by leveraging foundation models for text and image synthesis to generate challenging counterfactual test cases.

Centering Care in the Design of AI in Global Health

(Georgia Institute of Technology, 2023-05-02) Ismail, Azra

There has been growing interest in the application of Artificial Intelligence (AI) in healthcare, motivated by scarce and unequal resources globally. These technologies promise improved health outcomes, but they also risk increasing health inequities, particularly for communities and care workers on the margins. This dissertation engages in the study and design of AI systems in resource-constrained health settings. I focus on the context of maternal and child care delivery in India, which on the one hand has become synonymous with the global development goals of health equity and gender equality, while on the other reflects a history of violating reproductive rights and relying on exploited care workers in the Global South. I propose that a focus on care can bring attention to and help resolve some of these tensions, and support the ethical application of AI in health settings. Synthesizing prior AI efforts in global health and my own ethnographic research on frontline health in an underserved setting in India, my dissertation first outlines the gaps and opportunities in current AI efforts. I then examine AI integration from a feminist lens of care in three areas: existing data flows and practices, multi-stakeholder care ecologies, and everyday care work. First, I trace the movement of data from the site of collection from communities to use in machine learning (ML) systems, highlighting the caring labor that must go into making data “good” for ML, particularly by already overworked and underpaid health workers. Second, I consider the implementation of ML in a real-world setting, highlighting how care shapes the configuration of a human-AI system and the alignment of program goals for design and implementation, through continual dialogue across multiple diverse stakeholders. Third, I engage in the co-design of a conversational agent that aims to support care work while centering worker agency, drawing on a deep understanding of existing digital practices and their increasing work burden. Across these three areas, I pay attention to the gendered nature of work and technology use, and the broader care ecology within which these technologies are embedded. Finally, I reflect on my own position and orientation to care as a researcher engaging with workers and communities on the margins, using various methods and points of entry. Through a reflective and participatory approach, my research contributes an understanding of how AI-based technologies may (and where they may not) enable healthier and more caring futures for communities globally.

Kitchen science investigators: promoting identity development as scientific reasoners and thinkers

(Georgia Institute of Technology, 2010-08-30) Clegg, Tamara Lynnette

My research centers upon designing transformative learning environments and supporting technologies. Kitchen Science Investigators (KSI) is an out-of-school transformative learning environment we designed to help young people learn science through cooking. My dissertation considers the question, 'How can we design a learning environment in which children discover the utility of science in their lives and their own scientific capabilities?' I have explored this question in the context of designing and enacting KSI. We designed the environment (i.e., activities, facilitation, and technology support) so that in the midst of cooking, participants generate personal goals that they need science to achieve. Our design integrates software to promote scientific practices in a real-world context. In my thesis research, I analyze how learners are developing identity as scientific reasoners in this environment. I also make recommendations about the design of learning environments and technologies to help with scientific development. My dissertation study is a longitudinal study of individuals in our most recent implementation of KSI. My current analysis of KSI shows significant disposition and identity development among focal learners, as well as a set of causal factors. I found that as learners connected cooking and science, and as they participated in science socially with their friends, they began to increase their scientific participation in and outside of KSI. My findings suggest guidelines for software support, facilitation, and activities for getting learners engaged in scientific inquiry in ways that promote the development of scientific identities.