Organizational Unit:
School of Interactive Computing

Publication Search Results

  • Item
    Realistic Mobile Manipulation Tasks for Evaluating Home-Assistant Robots
    (Georgia Institute of Technology, 2023-12-14) Yenamandra, Sriram Venkata
    By assisting in household chores, robotic home assistants hold the potential to significantly enhance the quality of human lives. Mobile manipulation tasks can serve as test beds for evaluating the capabilities essential to the development of robotic home assistants: perception, language understanding, navigation, manipulation, and common-sense reasoning. However, it is imperative to use settings that closely resemble real-world deployment to ensure that progress made on these tasks is practically relevant. The thesis introduces three tasks, namely HomeRobot: Open Vocabulary Mobile Manipulation (OVMM), "GO To Any Thing" (GOAT), and Housekeep, to realize the different dimensions of realism critical for evaluating embodied agents: 1) autonomy, the ability to operate without very specific instructions (e.g. the precise locations of goal objects), 2) exposure to realistic novel multi-room environments, 3) working with previously unseen objects, and 4) extended durations of deployment. Further, the thesis proposes baselines for each task, which solve it to varying degrees. The shortcomings of these baselines underscore the open challenges of open-vocabulary object detection and common-sense reasoning. By using test scenarios closer to real-world deployment, this work attempts to advance research in the development of robotic assistants.
  • Item
    Computable Phenotype and Active Learning for Acute Respiratory Distress Syndrome Classification
    (Georgia Institute of Technology, 2023-07-31) Pathak, Ashwin
    The accurate detection of acute respiratory distress syndrome (ARDS) is crucial in the Intensive Care Unit (ICU) due to its severe impact on organ function and high mortality and morbidity rates among critically ill patients. ARDS has various causes, with infection and trauma being the most common. It is characterized by poor oxygenation despite mechanical ventilation. The Berlin criteria are currently used as the gold standard for identifying ARDS, but manual adjudication of chest radiographs limits automation. Since Electronic Medical Records (EMRs) do not typically provide bilateral infiltrate information, an automated approach to detect radiological evidence would facilitate comprehensive study of the syndrome, eliminating the need for costly individual image inspections by physicians. Natural Language Processing (NLP) offers an opportunity to analyze radiology notes and determine lung status from the text. In this study, an NLP pipeline was developed to analyze radiology notes of 362 patients from the EMR who fulfilled the Sepsis-3 criteria, aiming to diagnose possible ARDS. After denoising and preprocessing, the notes were vectorized using BERT word embeddings and fed into a classification layer via transfer learning. The resulting classification models achieved F1-scores of 74.5% and 64.22% for the Emory and Grady datasets, respectively. While large language models demonstrate excellent performance in ARDS detection, they typically require a substantial amount of training data. Active learning methods have the potential to minimize data requirements but may not consistently achieve the desired performance level. Therefore, this study thoroughly evaluates different active learning query strategies within a human-in-the-loop scenario to reduce the burden of manual adjudication. Additionally, active learning methods do not indicate when the performance target has been reached, and evaluation is challenging without a separate held-out validation dataset. Thus, the study explores the benefits of employing stopping criteria to recommend when to terminate the active learning process and assesses their effectiveness (a minimal sketch of such a query-and-stop loop appears after this list). The proposed active learning pipeline aims to continuously enhance the model's performance, resulting in an improved F1-score of 61.26% compared to a random sampling baseline (59.96%), demonstrating the effectiveness of active learning methods in an imbalanced data setting.
  • Item
    Navigating to Objects: Simulation, Data, and Models
    (Georgia Institute of Technology, 2023-05-03) Ramrakhya, Ram
    General-purpose robots that can perform a diverse set of embodied tasks in a diverse set of environments have to be good at visual exploration. Consider the canonical example of asking a household robot, ‘Where are my keys?’. To answer this (assuming the robot does not remember the answer from memory), the robot would have to search the house, often guided by intelligent priors – e.g. peeking into the washroom or kitchen might be sufficient to be reasonably sure the keys are not there, while exhaustively searching the living room might be much more important since keys are more likely to be there. While doing so, the robot has to internally keep track of where it has been to avoid redundant search, and it might also have to interact with objects, e.g. check drawers and cabinets in the living room (but not those in the washroom or kitchen!). This example illustrates fairly sophisticated exploration, involving a careful interplay of various implicit objectives (semantic priors, exhaustive search, efficient navigation, interaction, etc.) which are hard to learn using Reinforcement Learning (RL). In this thesis, we focus on learning such embodied object-search strategies from human demonstrations, which implicitly capture the intelligent behavior we wish to impart to our agents. In Part I, we present a large-scale study of imitating human demonstrations on tasks that require a virtual robot to search for objects in new environments – (1) ObjectGoal Navigation (e.g. ‘find & go to a chair’) and (2) PICK&PLACE (e.g. ‘find mug, pick mug, find counter, place mug on counter’). In Part II, we extend our focus to improving agents trained using human demonstrations in a tractable way. Towards this, we present PIRLNav, a two-stage learning scheme of behavior cloning (BC) pretraining on human demonstrations followed by RL finetuning (a toy sketch of this BC→RL recipe appears after this list). Finally, using this BC→RL training recipe, we present a rigorous empirical analysis in which we investigate whether human demonstrations can be replaced with ‘free’ (automatically generated) sources of demonstrations, e.g. shortest paths (SP) or task-agnostic frontier exploration (FE) trajectories.
  • Item
    Mitigating Racial Biases in Toxic Language Detection
    (Georgia Institute of Technology, 2022-05-05) Halevy, Matan
    Recent research has demonstrated how racial biases against users who write African American English exist in popular toxic language datasets. While previous work has focused on a single fairness criterion, we propose using additional descriptive fairness metrics to better understand the source of these biases. We demonstrate that different benchmark classifiers, as well as two in-process bias-remediation techniques, propagate racial biases even in a larger corpus. We then propose a novel ensemble framework that uses a specialized classifier fine-tuned to the African American English dialect (a toy sketch of such an ensemble appears after this list). We show that our proposed framework substantially reduces the racial biases that the model learns from these datasets. We demonstrate how the ensemble framework improves fairness metrics across all sample datasets with minimal impact on classification performance, and provide empirical evidence of its ability to unlearn annotation biases towards authors who use African American English. Note: this work may contain examples of offensive words and phrases.
  • Item
    Virtual Reality as a Stepping Stone to Real-World Robotic Caregiving
    (Georgia Institute of Technology, 2021-05-04) Gu, Yijun
    Versatile robotic caregivers could benefit millions of people worldwide, including older adults and people with disabilities. Recent work has explored how robotic caregivers can learn to interact with people through physics simulations, yet transferring what has been learned to real robots remains challenging. By bringing real people into the robot's virtual world, virtual reality (VR) has the potential to help bridge the gap between simulations and the real world. In this thesis, we present Assistive VR Gym (AVR Gym), which enables real people to interact with virtual assistive robots. We also provide evidence that AVR Gym can help researchers improve the performance of simulation-trained assistive robots with real people. Prior to AVR Gym, we trained robot control policies (Original Policies) solely in simulation for four robotic caregiving tasks (robot-assisted feeding, drinking, itch scratching, and bed bathing) with two simulated robots (PR2 from Willow Garage and Jaco from Kinova). With AVR Gym, we developed Revised Policies based on insights gained from testing the Original Policies with real people. Through a formal study with eight participants in AVR Gym, we found that the Original Policies performed poorly, the Revised Policies performed significantly better, and that improvements to the biomechanical models used to train the Revised Policies resulted in simulated people that better match real participants. Notably, participants significantly disagreed that the Original Policies were successful at assistance, but significantly agreed that the Revised Policies were successful at assistance. Overall, our results suggest that VR can be used to improve the performance of simulation-trained control policies with real people without putting people at risk, thereby serving as a valuable stepping stone to real robotic assistance.
  • Item
    Code-Upload AI Challenges on EvalAI
    (Georgia Institute of Technology, 2021-05-04) Jain, Rishabh
    Artificial intelligence develops techniques and systems whose performance must be evaluated on a regular basis in order to certify and foster progress in the discipline. We have developed tools such as EvalAI that help us evaluate the performance of these systems and push the frontiers of machine learning and artificial intelligence. Initially, the AI community focused on simple, traditional methods of evaluating these systems in the form of prediction-upload challenges, but with the advent of deep learning, larger datasets, and complex AI agents, these methods are no longer sufficient for evaluation. One technique for evaluating such AI agents is to upload their code, run it on a sequestered test dataset, and report the results on a leaderboard. In this work, we introduce code-upload evaluation of AI agents on EvalAI for all kinds of AI tasks, i.e. reinforcement learning, supervised learning, and unsupervised learning. We offer features such as a scalable backend, prioritized submission evaluation, a secure test environment, and running AI agent code in an isolated, sanitized environment (a minimal sketch of such isolated evaluation appears after this list). The end-to-end pipeline is extremely flexible, modular, and portable, and can later be extended to multi-agent setups and evaluation on dynamic datasets. We also propose a procedure using GitHub for AI challenge creation to version, maintain, and reduce friction in this multi-step process. Finally, we focus on providing analytics to all users of the platform and on easing the hosting of EvalAI on private servers as an internal evaluation platform.
  • Item
    Search-based collision-free motion planning for robotic sculpting
    (Georgia Institute of Technology, 2020-07-28) Jain, Abhinav
    In this work, I explore the task of robotic sculpting. I propose a search-based planning algorithm to solve the problem of sculpting by material removal with a multi-axis manipulator. I generate collision-free trajectories for a manipulator using best-first search in two different material representations – a voxel representation and a subdivision surface representation (a toy sketch of best-first voxel removal appears after this list). I also show a significant speedup of the algorithm in the voxel representation by using octrees to decompose the voxel space. I demonstrate the algorithm on a multi-axis manipulator in simulation and on a physical robot by sculpting Michelangelo's David.
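
Illustrative Code Sketches

The ARDS abstract above describes a pool-based active learning loop with query strategies and stopping criteria. The minimal Python sketch below illustrates one such loop, uncertainty sampling with a simple stopping rule; the synthetic features, the logistic-regression classifier, and the 0.2 threshold are all stand-in assumptions, not the thesis's BERT-based pipeline.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic, imbalanced binary data standing in for radiology-note embeddings.
X = rng.normal(size=(2000, 32))
y = (X[:, 0] + 0.5 * rng.normal(size=2000) > 1.2).astype(int)

# Seed the labeled set with examples of both classes.
seed = list(np.flatnonzero(y == 1)[:10]) + list(np.flatnonzero(y == 0)[:10])
labeled = list(seed)
pool = [i for i in range(len(X)) if i not in set(seed)]

for round_ in range(30):
    clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
    proba = clf.predict_proba(X[pool])[:, 1]
    uncertainty = 1.0 - 2.0 * np.abs(proba - 0.5)  # 1 = undecided, 0 = confident

    # Stopping criterion: quit once even the most uncertain pool example is
    # confidently classified, so no held-out validation set is needed.
    if uncertainty.max() < 0.2:
        print(f"stopping after round {round_}")
        break

    # Query the single most uncertain example (the manual adjudication step).
    query = pool[int(np.argmax(uncertainty))]
    labeled.append(query)
    pool.remove(query)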
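
The "Navigating to Objects" abstract describes a two-stage BC→RL recipe: behavior-cloning pretraining on demonstrations, then RL finetuning. The PyTorch sketch below applies that recipe to a deliberately trivial 1-D reach-the-goal task; the environment, the tiny network, the oracle "always move right" demonstrations, and all hyperparameters are invented stand-ins, not PIRLNav's actual setup.

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
N = 10                                    # grid cells; the goal is cell N-1

def step(s, a):                           # a: 0 = move left, 1 = move right
    s = max(0, min(N - 1, s + (1 if a == 1 else -1)))
    return s, float(s == N - 1), s == N - 1   # next state, reward, done

policy = nn.Sequential(nn.Linear(N, 32), nn.ReLU(), nn.Linear(32, 2))
onehot = lambda s: F.one_hot(torch.tensor(s), N).float()

# Stage 1: behavior cloning on oracle demonstrations (always move right).
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)
for _ in range(200):
    s = torch.randint(0, N - 1, (1,)).item()
    loss = F.cross_entropy(policy(onehot(s))[None], torch.tensor([1]))
    opt.zero_grad()
    loss.backward()
    opt.step()

# Stage 2: REINFORCE finetuning of the BC-pretrained policy.
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)  # lower LR after BC
for _ in range(100):
    s, logps, R = 0, [], 0.0
    for _ in range(2 * N):
        dist = torch.distributions.Categorical(logits=policy(onehot(s)))
        a = dist.sample()
        logps.append(dist.log_prob(a))
        s, r, done = step(s, a.item())
        R += r
        if done:
            break
    loss = -R * torch.stack(logps).sum()  # maximize episode return
    opt.zero_grad()
    loss.backward()
    opt.step()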
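
The toxic-language abstract proposes an ensemble built around a classifier fine-tuned to African American English (AAE). One plausible reading of such an ensemble is soft routing by an estimated dialect probability, sketched below; the blending rule and the toy stand-in models are hypothetical, not the thesis's trained components.

def ensemble_toxicity(text, general_model, aae_model, dialect_model):
    """Blend two toxicity classifiers by the estimated probability of AAE.

    Each *_model is assumed to map text -> a probability in [0, 1].
    """
    p_aae = dialect_model(text)          # P(text is written in AAE)
    p_general = general_model(text)      # general classifier's toxicity score
    p_specialized = aae_model(text)      # AAE-fine-tuned classifier's score
    # Soft routing: the more likely the text is AAE, the more the
    # dialect-specialized classifier dominates the ensemble prediction.
    return (1 - p_aae) * p_general + p_aae * p_specialized


# Toy stand-ins that only demonstrate the call shape.
def general(text):
    return 0.8 if "hate" in text else 0.3

def specialized(text):
    return 0.2

def dialect(text):
    return 0.9

print(ensemble_toxicity("example text", general, specialized, dialect))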
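
The EvalAI abstract mentions running submitted agent code in an isolated, sanitized environment against a sequestered test set. The sketch below shows one generic way such isolation could look, driving the Docker CLI from Python; it is not EvalAI's actual implementation, and the image name, mount path, and resource limits are hypothetical.

import subprocess

def evaluate_submission(image: str, test_data_dir: str, timeout_s: int = 600):
    """Run a submitted agent image against a sequestered test set."""
    cmd = [
        "docker", "run", "--rm",
        "--network", "none",                # no internet: test data stays sequestered
        "--memory", "4g", "--cpus", "2",    # cap resources per submission
        "-v", f"{test_data_dir}:/data:ro",  # mount the test set read-only
        image,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    return result.returncode, result.stdout  # stdout would carry the metrics

# Hypothetical usage:
# code, metrics = evaluate_submission("participant/agent:latest", "/srv/testset")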
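
Finally, the sculpting abstract describes best-first search over a voxel representation for material removal. The toy sketch below orders voxel removal so that only "exposed" voxels (those adjacent to empty space) may be cut, a crude stand-in for the thesis's collision-checked reachability; the 8x8x8 block, the inner-cube target, and the insertion-order priority are all invented for illustration.

import heapq
import numpy as np

SIZE = 8
stock = np.ones((SIZE, SIZE, SIZE), dtype=bool)   # solid block of material
target = np.zeros_like(stock)
target[2:6, 2:6, 2:6] = True                      # sculpture: inner cube to keep

def neighbors(p):
    x, y, z = p
    for dx, dy, dz in ((1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)):
        yield (x + dx, y + dy, z + dz)

def exposed(p):
    # A voxel can be cut if any neighbor lies outside the block or was cut.
    return any(not all(0 <= c < SIZE for c in q) or not stock[q]
               for q in neighbors(p))

to_remove = {tuple(p) for p in np.argwhere(stock & ~target)}
heap = [(0, p) for p in to_remove if exposed(p)]
heapq.heapify(heap)
order = []

while heap:
    _, p = heapq.heappop(heap)
    if not stock[p]:
        continue                                  # already removed
    stock[p] = False
    order.append(p)
    # Newly exposed waste voxels become candidates. A real planner would
    # score candidate tool poses here; this sketch just uses insertion order.
    for q in neighbors(p):
        if q in to_remove and stock[q] and exposed(q):
            heapq.heappush(heap, (len(order), q))

print(f"removed {len(order)} of {len(to_remove)} waste voxels")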