Organizational Unit:

School of Interactive Computing

Permanent Link

https://hdl.handle.net/1853/70783

Parent Organization

Organizational Unit

College of Computing

ArchiveSpace Name Record

https://finding-aids.library.gatech.edu/agents/corporate_entities/1113

Publication Search Results

An Introduction to Healthcare AI

(Georgia Institute of Technology, 2024-02-22) Braunstein, Mark

Healthcare and AI have an intertwined history dating back at least to the 1960s, when the first 'cognitive chatbot' acting as a psychotherapist was introduced at MIT. Today, of course, there is enormous interest in and excitement about the potential roles of the latest AI technologies in patient care. There is a parallel concern about the risks. Will human physicians be replaced by intelligent agents? How might such agents benefit patient care short of that? What role will they play for patients? We'll explore these questions in a far-ranging talk that includes a number of real-world examples of how AI technologies are already being deployed to hopefully benefit those physicians and their patients.
Realistic Mobile Manipulation Tasks for Evaluating Home-Assistant Robots

(Georgia Institute of Technology, 2023-12-14) Yenamandra, Sriram Venkata

By assisting in household chores, robotic home assistants hold the potential to significantly enhance the quality of human lives. Mobile manipulation tasks can serve as test beds for evaluating the capabilities essential to the development of robotic home assistants: perception, language understanding, navigation, manipulation, and common-sense reasoning. However, it is imperative to use settings that closely resemble real-world deployment to ensure that progress made on these tasks is practically relevant. The thesis introduces three tasks, namely HomeRobot: Open Vocabulary Mobile Manipulation (OVMM), "GO To Any Thing" (GOAT) and Housekeep, to realize the different dimensions of realism critical for evaluating embodied agents: 1) autonomy, the ability to operate without very specific instructions (e.g. the precise locations of goal objects), 2) exposure to realistic novel multi-room environments, 3) working with previously unseen objects, and 4) extended durations of deployment. Further, the thesis proposes baselines for each task, which succeed in solving the tasks to varying degrees. The shortcomings of these baselines underscore the open challenges of open-vocabulary object detection and common-sense reasoning. By using test scenarios closer to real-world deployment, this work attempts to advance research in the development of robotic assistants.
Information Extraction on Scientific Literature under Limited Supervision

(Georgia Institute of Technology, 2023-12-12) Bai, Fan

The exponential growth of scientific literature presents both challenges and opportunities for researchers across various disciplines. Effectively extracting pertinent information from this extensive corpus is crucial for advancing knowledge, enhancing collaboration, and driving innovation. However, manual extraction is a laborious and time-consuming process, underscoring the demand for automated solutions. Information extraction (IE), a sub-field of natural language processing (NLP) focused on automatically extracting structured information from unstructured data sources, plays a crucial role in addressing this challenge. Despite their success, many IE methods require substantial human-annotated data, which might not be easily accessible, particularly in specialized scientific domains. This highlights the need for adaptable and robust techniques capable of functioning with limited supervision. In this thesis, we study the task of information extraction on scientific literature, particularly addressing the challenge of limited (human) supervision. Specifically, our work delves into four key dimensions of this problem. First, we explore the potential of harnessing easily accessible resources, like knowledge bases, to develop IE systems without direct human supervision. Second, we examine the use of pre-trained language models to create effective and efficient scientific IE systems, experimenting with various fine-tuning architectures and learning strategies. Next, we investigate the balance between the labor expenditure of human annotation and the computational cost linked with domain-specific pre-training, to achieve optimal performance under budget constraints. Lastly, we capitalize on the emerging capabilities of large pre-trained language models by showcasing how information extraction can be achieved solely based on a human-crafted data schema. Through these explorations, this thesis aims to lay a solid foundation for the continued advancement of scientific IE under limited supervision.
Lifelong Machine Learning without Lifelong Data Retention

(Georgia Institute of Technology, 2023-12-10) Smith, James Seale

Machine learning models suffer from a phenomenon known as catastrophic forgetting when learning novel concepts from continuously shifting training data. Typical solutions for this continual learning problem require extensive replay of previously seen data, which increases memory costs and may violate data privacy. To address these challenges, we first explore replacing this replay data with alternatives: (i) unlabeled data “from the wild” and (ii) synthetic data generated via model inversion. Our work using this alternative replay data boasts strong performance on replay-free continual learning for image classification. Next, we consider an alternative solution to entirely replace replay data: pre-training. Specifically, we leverage strongly pre-trained models and continuously edit them with prompts and low-rank adapters for both (i) image classification and (ii) natural-language visual reasoning. Finally, we extend the idea of continual learning using pre-trained models to the proposed setting of continual customization of text-to-image diffusion models. We hope that our work on enabling models to learn from evolving data distributions and adapt to new tasks will help unlock the full potential of machine learning in addressing emerging real-world challenges.
Machine Learning for Agile Robotic Control

(Georgia Institute of Technology, 2023-12-06) Wagener, Nolan C.

Roboticists typically exploit structure in a problem, such as by modeling the mechanics of a system, to generate solutions for a given task. However, this structure can limit flexibility and require practitioners to reason about challenging phenomena, such as contacts in mechanics. Data, conversely, provides much more flexibility and, when combined with deep neural networks, has given rise to powerful models in vision and language, all with little hand-engineered structure. While it is tempting to fully forgo structure in favor of learning-based methods for robotics, we show how data and learning can be gracefully incorporated in a structured way. In particular, we focus on the control setting, and we demonstrate that robotic control offers a variety of ways in which data can be utilized. First, we show that data can be used in a model-based fashion to train a neural network that approximates complex dynamics and can be used within a model predictive controller (MPC). Then, we show that the MPC process is itself an instance of online learning and demonstrate how to synthesize MPC algorithms from a common online learning algorithm. We apply both of the aforementioned approaches to a real-world aggressive driving task and show that they can accomplish the task. Next, we consider the safe reinforcement learning problem and show that safety interventions can be used as a learning signal to have an agent learn to become safe without needing to execute unsafe actions in the environment. Finally, we consider the simulated humanoid domain and show that pre-collected human motions can act as a strong inductive bias to ground motions learned by the humanoid agent.
Leveraging Low-Dimensional Geometry for Search and Ranking

(Georgia Institute of Technology, 2023-12-06) Fenu, Stefano

There is a substantial body of work on search and ranking in computer science, but less attention has been paid to the question of how to learn geometric data representations that are amenable to search and ranking tasks. Index-based data structures for search are commonplace, but these discard structural features of the data, often have large memory profiles, and scale poorly with data dimension. Geometric search techniques do exist, but few analogous search data structures or preprocessing algorithms exist that leverage spatial structure in data to increase search performance. The aim of the research detailed here is to show that leveraging low-dimensional geometry can improve the performance of search and ranking over index-only methods, and that there are dimensionality reduction techniques that can make spatial search algorithms more effective without any additional memory overhead. This work accomplishes these aims by developing methods for learning low-dimensional coordinate embeddings explicitly for the purpose of search and ranking, and for actively querying and constructing searchable embeddings to minimize user-labeling costs. This dissertation will further provide scalable versions of these algorithms and demonstrate their effectiveness across a broad range of problem domains including visual, text, and educational data. These performance improvements will allow human-in-the-loop search of larger datasets and enable new applications in preference search and ranking.
Controllability and Uncertainty in Generative Models

(Georgia Institute of Technology, 2023-12-06) Ham, Cusuh

This dissertation describes methods for enhancing generative models with either added controllability or expressiveness of uncertainty, demonstrating how a strong prior enables both features. One general approach is to introduce new architectures or training objectives. However, current trends towards massive upscaling of model size, training data, and computational resources can make retraining or fine-tuning difficult and expensive. Thus, another approach is to build upon existing pre-trained models. We consider both types of approaches with an emphasis on the latter. We first tackle the tasks of controllable image synthesis and uncertainty estimation through training-based methods and then switch focus towards computationally efficient methods that do not require direct updates to the base model's parameters. We conclude by discussing future directions based on the insights from our findings.
Robotics in the Era of Vision-Language Foundation Models

(Georgia Institute of Technology, 2023-11-29) Kira, Zsolt
Foundation Models for Robotics

(Georgia Institute of Technology, 2023-11-29) Garg, Animesh
Robotics Days for Industry 2023 Welcome and Overview

(Georgia Institute of Technology, 2023-11-29) Hutchinson, Seth