Towards Safe and Efficient Learning for Dexterous Manipulation

dc.contributor.advisor Ravichandar, Harish
dc.contributor.author Jain, Abhineet
dc.contributor.committeeMember Garg, Animesh
dc.contributor.committeeMember Xu, Danfei
dc.contributor.department Computer Science
dc.date.accessioned 2023-05-18T17:57:35Z
dc.date.available 2023-05-18T17:57:35Z
dc.date.created 2023-05
dc.date.issued 2023-05-02
dc.date.submitted May 2023
dc.date.updated 2023-05-18T17:57:36Z
dc.description.abstract Imitation learning (IL) is a promising approach to help robots acquire dexterous manipulation capabilities without the need for a carefully designed reward or significant computational effort. However, existing IL approaches require sophisticated data collection infrastructure and struggle to generalize beyond the training distribution. Other approaches like reinforcement learning interact with the environment to train black-box neural networks, providing little control over how the robot learns and performs the skills. Such approaches are especially challenging on physical platforms: a robot’s erratic behavior can harm not only itself but also the entities around it. Since training uses a large number of interactions with the environment, there is also room for improving sample efficiency. In this work, we address these challenges. First, we demonstrate that we can learn an object relocation task using demonstrations from LeapMotion, an inexpensive vision-based sensor. Policies trained on these demonstrations achieve success similar to policies trained on demonstrations from wearable sensors, at the cost of sample efficiency. Using LeapMotion reduces the setup cost of collecting demonstrations by approximately 140x. Second, we investigate the importance of collecting additional data that better represents the full operating conditions. We compare the performance of corrective additional demonstrations and randomly-sampled additional demonstrations for an object relocation task. When the additional demonstrations from the full task distribution outnumber the original demonstrations from a restrictive training distribution, the corrective demonstrations considerably outperform the randomly-sampled ones. Otherwise, there are no significant differences between the two. Third, we introduce a simple geometric constraint to guide the robot when learning object relocation. Using Constrained Policy Optimization, the robot quickly learns to move towards the object and uses a similar number of samples to learn the skill as the unconstrained approach. We show how simple constraints can help robots achieve sensible and safe behavior quickly and ease concerns surrounding hardware deployment. We also provide insights into how different degrees of strictness of these constraints affect learning. Finally, we curate a library of constraints generalizable across multiple dexterous manipulation tasks, and introduce a hierarchical approach that prioritizes these constraints across different phases of the task. We train two policies to learn an object relocation task: a low-level policy that learns how to perform the task given a set of constraints, and a high-level policy that decides what constraints to activate during different stages of training the low-level policy. With this hierarchical approach, the agent learns to perform the task while reducing the duration of unsafe behaviors during training. Our findings indicate that prioritizing the right constraints addresses physical and environmental safety concerns, as the hierarchical policy can both train and perform the tasks safely.
dc.description.degree M.S.
dc.format.mimetype application/pdf
dc.identifier.uri https://hdl.handle.net/1853/72084
dc.language.iso en_US
dc.publisher Georgia Institute of Technology
dc.subject Dexterous manipulation
dc.subject Reinforcement learning
dc.subject Learning from demonstration
dc.subject Safe reinforcement learning
dc.subject Learning with constraints
dc.subject Constraints scheduling
dc.title Towards Safe and Efficient Learning for Dexterous Manipulation
dc.type Text
dc.type.genre Thesis
dspace.entity.type Publication
local.contributor.advisor Ravichandar, Harish
local.contributor.corporatename College of Computing
local.contributor.corporatename School of Computer Science
relation.isAdvisorOfPublication 07277d29-f32f-4435-b617-f76d10fb8f6a
relation.isOrgUnitOfPublication c8892b3c-8db6-4b7b-a33a-1b67f7db2021
relation.isOrgUnitOfPublication 6b42174a-e0e1-40e3-a581-47bed0470a1e
thesis.degree.level Masters