Organizational Unit:
School of Interactive Computing


Publication Search Results

Now showing 1 - 3 of 3
  • Realistic Mobile Manipulation Tasks for Evaluating Home-Assistant Robots
    (Georgia Institute of Technology, 2023-12-14) Yenamandra, Sriram Venkata
By assisting in household chores, robotic home assistants hold the potential to significantly enhance the quality of human lives. Mobile manipulation tasks can serve as test beds for evaluating the capabilities essential to the development of robotic home assistants: perception, language understanding, navigation, manipulation, and common-sense reasoning. However, it is imperative to use settings that closely resemble real-world deployment to ensure that progress made on these tasks is practically relevant. The thesis introduces three tasks, namely HomeRobot: Open Vocabulary Mobile Manipulation (OVMM), "GO To Any Thing" (GOAT), and Housekeep, to realize the different dimensions of realism critical for evaluating embodied agents: 1) autonomy, the ability to operate without very specific instructions (e.g., the precise locations of goal objects), 2) exposure to realistic novel multi-room environments, 3) working with previously unseen objects, and 4) extended durations of deployment. Further, the thesis proposes baselines for each task, which succeed in solving the task to varying degrees. The shortcomings of these baselines underscore the open challenges of open-vocabulary object detection and common-sense reasoning. By using test scenarios closer to real-world deployment, this work attempts to advance research in the development of robotic assistants.
  • Navigating to Objects: Simulation, Data, and Models
    (Georgia Institute of Technology, 2023-05-03) Ramrakhya, Ram
General-purpose robots that can perform a diverse set of embodied tasks in a diverse set of environments have to be good at visual exploration. Consider the canonical example of asking a household robot, ‘Where are my keys?’. To answer this (assuming the robot does not remember the answer from memory), the robot would have to search the house, often guided by intelligent priors – e.g., peeking into the washroom or kitchen might be sufficient to be reasonably sure the keys are not there, while exhaustively searching the living room might be much more important, since keys are more likely to be there. While doing so, the robot has to internally keep track of where it has been to avoid redundant search, and it might also have to interact with objects, e.g., check drawers and cabinets in the living room (but not those in the washroom or kitchen!). This example illustrates fairly sophisticated exploration, involving a careful interplay of various implicit objectives (semantic priors, exhaustive search, efficient navigation, interaction, etc.) that are hard to learn using Reinforcement Learning (RL). In this thesis, we focus on learning such embodied object-search strategies from human demonstrations, which implicitly capture the intelligent behavior we wish to impart to our agents. In Part I, we present a large-scale study of imitating human demonstrations on tasks that require a virtual robot to search for objects in new environments – (1) ObjectGoal Navigation (e.g., ‘find & go to a chair’) and (2) PICK&PLACE (e.g., ‘find mug, pick mug, find counter, place mug on counter’). In Part II, we extend our focus to improving agents trained using human demonstrations in a tractable way. Towards this, we present PIRLNav, a two-stage learning scheme consisting of behavior cloning (BC) pretraining on human demonstrations followed by RL finetuning. Finally, using this BC→RL training recipe, we present a rigorous empirical analysis where we investigate whether human demonstrations can be replaced with ‘free’ (automatically generated) sources of demonstrations, e.g., shortest paths (SP) or task-agnostic frontier exploration (FE) trajectories.
  • Code-Upload AI Challenges on EvalAI
    (Georgia Institute of Technology, 2021-05-04) Jain, Rishabh
Artificial intelligence research develops techniques and systems whose performance must be evaluated on a regular basis in order to certify and foster progress in the discipline. We have developed tools such as EvalAI that help us evaluate the performance of these systems and push the frontiers of machine learning and artificial intelligence. Initially, the AI community focused on simple and traditional methods of evaluating these systems in the form of prediction-upload challenges, but with the advent of deep learning, larger datasets, and complex AI agents, these methods are no longer sufficient for evaluation. One technique for evaluating such AI agents is to upload their code, run it on a sequestered test dataset, and report the results on a leaderboard. In this work, we introduced code-upload evaluation of AI agents on EvalAI for all kinds of AI tasks, i.e., reinforcement learning, supervised learning, and unsupervised learning. We offer features such as a scalable backend, prioritized submission evaluation, a secure test environment, and running AI agent code in an isolated, sanitized environment. The end-to-end pipeline is extremely flexible, modular, and portable, and can later be extended to multi-agent setups and evaluation on dynamic datasets. We also proposed a procedure using GitHub for AI challenge creation to version, maintain, and reduce the friction in this involved process. Finally, we focused on providing analytics to all users of the platform, along with easing the hosting of EvalAI on private servers as an internal evaluation platform.