Organizational Unit: School of Interactive Computing
Description: School established in 2007
Publication Search Results: now showing 1 - 10 of 367
-
Augmenting Visualizations with Statistical and User-Defined Data Facts (Georgia Institute of Technology, 2024-07-28) Guo, Grace
When designing visualizations and visualization systems, we often augment charts and graphs with visual elements in order to convey richer and more nuanced information about relationships in the data. However, we do not fully understand user considerations when creating these augmentations, nor do we have toolkits to support augmentation authoring. This thesis first outlines a design space of user-created augmentations, then introduces Auteur, a front-end JavaScript toolkit designed to help developers add augmentations to web-based D3 visualizations and systems to convey statistical and custom data relationships. The library is then customized and extended for the domains of online learning and causal inference, where users may be interested in domain-specific data relationships or work with unique chart types and data sets. Collectively, these contributions aim to help us better incorporate user-defined augmentations into visualizations for analysis and storytelling, thus conveying human context, user preferences, and domain knowledge through our charts and graphs.
-
Large-Scale Offline Pre-Training Bootstraps Embodied Intelligence (Georgia Institute of Technology, 2024-07-27) Majumdar, Arjun
A central goal in Artificial Intelligence (AI) is to develop embodied intelligence -- i.e., embodied agents such as mobile robots that can accomplish a wide variety of tasks in real-world, physical environments. In this dissertation, we will argue that offline pre-training of foundation models on web-scale data can bootstrap embodied intelligence. In part 1, we present VC-1, a visual foundation model pre-trained (primarily) on video data collected from an egocentric perspective. We systematically demonstrate that such models substantially benefit from pre-training dataset diversity by introducing CortexBench, an embodied AI (EAI) benchmark curated from a diverse collection of existing EAI tasks spanning locomotion, navigation, and dexterous or mobile manipulation. In part 2, we first demonstrate that visual grounding learned from internet data (i.e., image-caption pairs from the web) can be transferred to an instruction-following visual navigation agent (VLN-BERT). Then, we present ZSON, a highly scalable approach for learning to visually navigate to objects specified in open-vocabulary, natural language instructions such as “find the kitchen sink.” In part 3, we study spatial understanding in real-world indoor environments. First, we introduce an evaluation benchmark (OpenEQA) to measure progress on answering open-ended questions about 3D scenes. Then, we present a modular agent that leverages pre-trained components such as vision-language models (VLMs) to address the question-answering task.
-
The Algorithm Keeps The Score: Identity, Marginalization, and Power in the Technology-Mediated Search for Care (Georgia Institute of Technology, 2024-07-26) Pendse, Sachin R.
Severe psychological distress and mental illness are widespread. Globally, one in every two people will experience a mental health disorder at some point over the course of their lifetime. Identity has long played a core role in how each of those individuals understands their distress, expresses it to others, and searches for care. Directly tied to identity, societal marginalization and power similarly play a core role in whether individuals in distress can successfully access the resources and care that could deliver them relief. Alongside identity, power, and marginalization, technologies increasingly play a role in how people engage with care --- people in distress may turn to mental health helplines, online support communities, large language model chatbots, and other accessible technologies as they make meaning from their distress and search for care. In turn, the design of those technologies also has an influence on people's illness experiences, just as identity, power, and marginalization do. I understand these support technologies to be technology-mediated mental health support (TMMHS) systems, in which technology mediates support provided to people in distress. This dissertation studies how identity, power, and marginalization intersect with the design of TMMHS systems to influence people's experiences of distress, as well as their subsequent engagements with care. Marginalized populations often have diverse and unmet mental health needs --- I thus investigate how the design of TMMHS systems may be helpful in validating and meeting these marginalized needs, as well as how it may further compound offline inequities and make it more difficult for people to access acceptable and effective care.
Through use of both quantitative and qualitative methods, I highlight the limitations of psychiatric and computational approaches that often decontextualize and quantify experiences of distress. I propose that one means to mitigate these limitations is to incorporate considerations of identity, power, and marginalization into the design of TMMHS systems, and argue that doing so could ensure that diverse people are able to leverage technology for their mental health needs. I describe what these considerations may look like, including a focus on designing technologies that strengthen support relationships, an awareness of differences across people of diverse identities, and a constant eye to patterns of historical marginalization. My dissertation begins by outlining the history of mental health support in both traditional and technology-mediated contexts, discussing the role of colonial power relations in how both identity and mental illness have been understood, including in my areas of study, the United States and India. This provides important context for my empirical investigation of people's experiences with TMMHS systems. I then examine four areas in which identity, marginalization, and power have a direct and salient impact on how people engage with diverse TMMHS systems. First, I investigate the use of Indian mental health helplines, analyzing how volunteers provide care to individuals in distress, and where (and for whom) technical and structural gaps together prevent that care from actually being accessed by callers in distress. I next shift to resource-constrained areas of the United States, investigating how individuals in mental health professional shortage areas use TMMHS systems to fill structural gaps and create new identities from their experiences of distress, and the role of technical design and marginalization in what I find to often be deeply polarized environments within TMMHS systems.
Building on this finding, I then examine a dimension of social identity that is particularly polarized in the U.S. today: partisan identity. I quantitatively examine differentiated engagements among partisan users of online support communities, and investigate where there may be differences in potential avenues to care by analyzing personalized search engine results for U.S. Republican and Democrat partisan groups. I end by investigating identity-based biases within a new and emergent form of TMMHS support, LLM-based chatbots, including a quantitative analysis of biases and a qualitative analysis of lived experiences with this new tool. Across all of these studies, I use the language people use around their distress as a tool to analyze how identity, power, and marginalization interact with technical (and algorithmic) design to influence people's lived experiences with their mental health. My research contributes a deeper understanding of the harms that are created when technology-mediated support is not considerate of histories of marginalization, and of where support technologies can be sensitive to identity-based marginalization. In summary, with this dissertation, I ask: what do we gain and lose when technology (and the algorithms that underlie it) keeps the score around our experiences with mental health?
-
Scaling Online Reinforcement Learning In Embodied AI To 64K Steps (Georgia Institute of Technology, 2024-07-25) Elawady, Ahmad Ibrahem
Intelligent embodied agents need to quickly adapt to new scenarios by integrating long histories of experience into decision-making. For instance, a robot in an unfamiliar house initially wouldn't know the locations of objects needed for tasks and might perform inefficiently. However, as it gathers more experience, it should learn the layout of its environment and remember where objects are, allowing it to complete new tasks more efficiently. Current methods struggle to maintain and utilize long histories consisting of thousands of observations. To enable such rapid adaptation to new tasks, we present ReLIC, a new approach for in-context reinforcement learning (RL) for embodied agents. With ReLIC, agents are capable of adapting to new environments using 64,000 steps of in-context experience with a full attention mechanism while being trained through self-generated experience via RL. We achieve this by proposing a novel policy update scheme for on-policy RL called "partial updates" as well as a Sink-KV mechanism that enables effective utilization of a long observation history for embodied agents. Our method outperforms a variety of meta-RL baselines in adapting to unseen houses in an embodied multi-object navigation task in a photorealistic simulation. In addition, we find that ReLIC is capable of few-shot imitation learning despite never being trained with expert demonstrations. We also provide a comprehensive analysis of ReLIC, highlighting that the combination of large-scale RL training, the proposed partial updates scheme, and Sink-KV are essential for effective in-context learning.
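The abstract names the Sink-KV mechanism without defining it; the dissertation itself specifies the details. As a rough, hypothetical sketch of the general attention-sink idea it gestures at (a learned key/value slot prepended to the observation history so the softmax always has a default target), one might write:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sink_kv_attention(q, k, v, sink_k, sink_v):
    """Scaled dot-product attention over a long history, with a learned
    "sink" key/value prepended. All names and shapes here are illustrative
    assumptions, not ReLIC's actual formulation."""
    k_all = np.concatenate([sink_k, k], axis=0)   # (1 + T, d)
    v_all = np.concatenate([sink_v, v], axis=0)   # (1 + T, d)
    scores = q @ k_all.T / np.sqrt(q.shape[-1])   # (Tq, 1 + T)
    weights = softmax(scores, axis=-1)            # rows sum to 1
    return weights @ v_all                        # (Tq, d)

rng = np.random.default_rng(0)
d, T = 8, 16
q = rng.standard_normal((4, d))
k = rng.standard_normal((T, d))
v = rng.standard_normal((T, d))
sink_k = rng.standard_normal((1, d))
sink_v = np.zeros((1, d))  # a zero-valued sink contributes nothing when attended to
out = sink_kv_attention(q, k, v, sink_k, sink_v)
print(out.shape)  # (4, 8)
```

One motivation commonly given for attention sinks is that a zero-valued sink lets a query effectively attend to nothing when no history entry is relevant; whether ReLIC uses this exact construction is not stated in the abstract.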
-
Exploring Computing Tools by Modality and Materiality (Georgia Institute of Technology, 2024-07-25) Johnson, Michael J.
As computer science education achieves wider recognition as a field central to success in this ever-growing technological world, the tools we have in place for teaching and learning CS deserve more scrutiny. CS educators are tasked with designing innovative curricula, establishing their classroom environments, and gathering materials that provide an engaging and meaningful learning experience. One important consideration is the choice of computing tools students will interact with. Computing tools are materials designed to support learners in exploring computer science and developing CS expertise. These tools range from online code-learning platforms to maker programs to tangible devices, and can even include computer-independent materials. When an educator selects computing tools for students to work with, such as a video, a game, crafting materials, a computer, or even pencil and paper, they influence how students learn, retain, and are evaluated on computational principles. How those influences occur depends upon a tool's modality (how the user interacts with the tool) and materiality (the material properties of the tool). Computing tools have the potential for many diverse interactions brought by their modalities and materialities, yet CS education research has given little consideration to these differences when assessing whether a tool is useful in developing learners' CS expertise. The work presented in this defense explores the use of computing tools in two informal learning environments for high school students: CWP 2.0 and BridgeUP STEM. I theorize that isolating and comparing these properties will yield key information on how each tool mediates relationships between learners, their objectives, and other actors in the learning environment. A deeper understanding of these relationships will contribute to more effective uses of computing tools in CS education.
-
Harnessing Synthetic Data for Robust and Reliable Vision (Georgia Institute of Technology, 2024-07-16) Chattopadhyay, Prithvijit
Progress in computer vision has been driven by models trained on large amounts of exemplar data for different tasks. These exemplar data sources intend to capture task-specific information and instance-level variations that a trained model will likely encounter in the wild. However, for conditions where curating lots of labeled real-world data is prohibitively expensive, synthetic data can serve as a cost-effective alternative. Synthetic data sources offer a few key benefits: fast access to labeled task-specific data at scale, labels across varying task complexities, and curation of labeled data across diverse conditions in a controlled manner. This thesis demonstrates how “controlled variations” in synthetic data can be used to develop robust and reliable vision models. Controlled variations refer to intentional, systematic modifications to synthetic data, designed to either explore specific aspects of model behavior or improve model transfer across distributions. In Chapters 3 and 4, we discuss applying controlled variations internally at the data-engine (simulator) stage to create diverse instances to systematically investigate the robustness of trained vision models. In Chapters 5 and 6, we discuss how applying controlled variations externally as perturbations or data augmentations (to intermediate features or input images) can enable model transfer across changing visual distributions. Finally, in Chapter 7, we discuss how controlled variations applied externally on synthetic data can ensure reliability of predictions made on real data distributions. We conclude by summarizing takeaways and outlining potential future research directions.
-
Circular Interactive Material: Making Ubiquitous Computing More Scalable and Sustainable (Georgia Institute of Technology, 2024-07-08) Cheng, Tingyu
Weiser predicted that the third generation of computing would see individuals interacting with many computing devices that ultimately “weave themselves into the fabric of everyday life until they are indistinguishable from it”. However, how to achieve this seamlessness and what associated interactions should be developed are still under investigation. Moreover, achieving a fully immersive intelligent environment might require producing trillions of smart devices, and their current configuration (e.g., plastic housings, PCB boards) will inevitably increase the environmental burden. In my research, I work on creating computational materials with different encoded material properties, such as conductivity, transparency, or water-solubility, that can be seamlessly integrated into our living environment to enrich different modalities of information communication. Meanwhile, this material intelligence will also affect devices' usefulness and life expectancy from a sustainability perspective. This thesis contains five works that scope the future pervasiveness of IoT devices while paying attention to their entire device life cycle. They emphasize different aspects that are crucial to constructing a circular interactive-material-embedded environment by balancing the tension between scalability and sustainability. Silver Tape is a simple fabrication technique that leverages inkjet-printed circuits to transfer silver traces onto everyday surfaces without any post-treatment. This method allows users to quickly fabricate versatile sensors by leveraging intrinsic material properties, and the transferred sensors can be repaired when damaged. Duco is the second project, which negates the need for human intervention by leveraging a hanging robotic system that automatically sketches large-scale circuitry. We have explored not only how to incorporate these computational abilities into our living structures such as walls, but have also created erasable ink that allows users to erase the circuitry and embed the surface with new capabilities, making the walls reusable. PITAS is a thin-sheet robotic material composed of a reversible phase-transition actuating layer and a heating/sensing layer, used to create shape-changing devices that can locally or remotely convey physical information such as shape, color, texture, and temperature changes. This project achieved a distinctive renewal process by immersing the material actuator in ethanol, giving the devices new life. The next project, Functional Destruction, aims to further promote sustainability by designing devices that self-destruct once they have fulfilled their purpose. The last project, Recy-ctronics, extends the idea of Functional Destruction by developing fully recyclable circuits, treating physical disintegration not as the end of a device's life but as the beginning of its new lifespan. This work also extends beyond traditional thin-sheet electronics, introducing three distinct form factors: sheets, foam, and tubes.
-
Designing Responsive Environments to Support Speech Perception for Individuals with Mild Cognitive Impairment (Georgia Institute of Technology, 2024-07-01) Feustel, Clayton E.
Older adults with mild cognitive impairment (MCI) face significant challenges perceiving the speech of others. The factors that cause individuals with MCI to have greater difficulty perceiving speech are multi-faceted and include the loss of hearing due to aging as well as changes to cognitive systems that impact how speech is processed. These cognitive changes make it more difficult to hear and understand others in certain environments, particularly locations with significant levels of background noise. While modern wearable solutions (i.e., hearing aids) can improve the communication experience, they fail to perform adequately in adverse acoustic environments. By looking to the built environment, instead of the individual, we can extend the classical ubiquitous computing approach to improving speech perception into a new domain by designing a responsive environment that changes the acoustic properties of a space according to the physical properties of the room and the activities of the occupants. Using the body of knowledge, methods, and research approaches from architectural acoustics and ubiquitous computing, I strengthen the emerging research on the unique difficulties that older adults with MCI have in complex acoustic environments with competing speech, and prototype a novel responsive environment that changes the acoustic properties of a space to support speech perception. Through interviews with members of the program, surveys with program staff, and audiological evaluations, I show that while individuals with MCI may be able to achieve similar speech perception thresholds to their cognitively healthy peers, they must dedicate significantly more cognitive resources to reach similar performance in environments with competing speech, where high levels of informational masking exist. Using the knowledge from this work, I design and deploy a novel sensor that can continuously and accurately take live measurements of the relevant acoustic properties. I continue my mixed-methods approach by conducting an in-situ evaluation of a responsive acoustic environment in a real-world therapeutic facility with individuals with MCI to validate the effectiveness of dynamically changing the level of sound masking based on the presence of competing speech. The results highlight the opportunity for environmentally targeted approaches to improve speech perception for older adults with MCI, as well as the drawbacks that future work in this space will need to address when designing for individuals with unique, differentiated reactions to informational masking.
-
Generating Legible Robot Motion From Classifier Guided Diffusion Policies (Georgia Institute of Technology, 2024-05-15) Bronars, Matthew
In human-robot collaboration, legible motion that conveys a robot’s intentions and goals is known to improve safety, task efficiency, and user experience. Legible robot motion is typically generated using hand-designed cost functions and classical motion planners. However, with the rise of deep learning and data-driven robot policies, we need methods for training end-to-end on offline demonstration data. In this paper, we propose Legibility Diffuser, a diffusion-based policy that learns intent-expressive motion directly from human demonstrations. By variably combining the noise predictions from a goal-conditioned diffusion model, we guide the robot’s motion toward the most legible trajectory in the training dataset. We find that decaying the guidance weight over the course of the trajectory is critical for maintaining a high success rate while maximizing legibility.
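The abstract describes combining noise predictions from a goal-conditioned diffusion model under a guidance weight that decays over the trajectory. A minimal sketch in the style of classifier-free guidance, assuming a linear decay schedule and an initial weight `w0` (both illustrative choices, not the paper's reported ones):

```python
import numpy as np

def guided_noise(eps_uncond, eps_goal, step, horizon, w0=2.0):
    """Combine unconditional and goal-conditioned noise predictions,
    classifier-free-guidance style, with the guidance weight decayed
    linearly over the trajectory. w0 and the linear schedule are
    illustrative assumptions."""
    w = w0 * max(0.0, 1.0 - step / horizon)  # decays from w0 at step 0 to 0 at the horizon
    return eps_uncond + w * (eps_goal - eps_uncond)

eps_u, eps_g = np.zeros(3), np.ones(3)
print(guided_noise(eps_u, eps_g, step=0, horizon=10))   # full guidance: [2. 2. 2.]
print(guided_noise(eps_u, eps_g, step=10, horizon=10))  # guidance fully decayed: [0. 0. 0.]
```

Early in the trajectory the combined prediction is pulled strongly toward the goal-conditioned (legible) direction; by the end it falls back to the unconditional policy, which is one plausible way to trade legibility for task success as the abstract suggests.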
-
Sim2Robot: Training Robots for the Real-World with Imperfect Simulators (Georgia Institute of Technology, 2024-04-29) Truong, Joanne
The goal of Artificial Intelligence is to “construct useful intelligent systems”, such as mobile robots, to assist in our day-to-day lives. For these mobile robotic assistants to be useful in the real world, they must skillfully navigate complex environments (e.g., delivering packages from one building to another). However, training robots in the real world can be slow, dangerous, expensive, and difficult to reproduce. Thus, one paradigm in robot learning is to leverage simulation for training robots (where gathering experience is scalable, safe, cheap, and reproducible) before they are deployed in the real world. However, no simulator is perfect; AI systems learn to “cheat” by exploiting imperfections. Thus, how can we train robots in imperfect simulators while ensuring that the learned skills generalize to reality? In this thesis, we will argue that simulators need not be perfect to be useful; they don’t need to model everything about the world, only what’s necessary for generalization. We present 1) the Sim2Real Correlation Coefficient for measuring and optimizing performance correlation between simulation and reality, enabling confident evaluation. 2) Bi-directional Domain Adaptation (BDA) and Kinematic-to-Dynamic Transfer (Kin2Dyn), sample-efficient methods for reducing the sim2real gap. BDA and Kin2Dyn improve robot learning and generalization to the real world by utilizing abstracted physics and simple adaptation models learned from small amounts of real-world data. 3) IndoorSim-to-OutdoorReal, an end-to-end learned approach that enables visual navigation in out-of-distribution environments zero-shot. We show that simulators can be used for real-world transfer without having to a priori design and model the deployment scenario. 4) Implicit Map Cross Modal Attention, a vision-and-language navigation model that utilizes structured implicit maps for navigating in an environment over time. Structured memory representations and training paradigms enable navigation for robots that occupy the same environment for long periods of time.
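The Sim2Real Correlation Coefficient is described only as a measure of performance correlation between simulation and reality. One plausible minimal reading, assuming it reduces to a Pearson correlation over per-policy scores (the thesis defines the actual metric), is:

```python
import numpy as np

def sim2real_correlation(sim_scores, real_scores):
    """Pearson correlation between simulation and real-world performance
    of the same set of policies. A coefficient near 1 suggests that a
    policy's ranking in simulation predicts its real-world ranking.
    (Illustrative reading; the thesis specifies the actual metric.)"""
    sim = np.asarray(sim_scores, dtype=float)
    real = np.asarray(real_scores, dtype=float)
    return float(np.corrcoef(sim, real)[0, 1])

# Hypothetical success rates for five policies evaluated in sim and on a robot.
sim = [0.40, 0.55, 0.60, 0.75, 0.90]
real = [0.35, 0.50, 0.58, 0.70, 0.85]
print(round(sim2real_correlation(sim, real), 3))
```

Under this reading, a low coefficient would warn that improving a policy in simulation says little about its real-world performance, which matches the abstract's framing of "enabling confident evaluation."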