Robot Learning from Heterogeneous Demonstration

dc.contributor.advisor Gombolay, Matthew
dc.contributor.author Chen, Letian Zac
dc.contributor.committeeMember Chernova, Sonia
dc.contributor.committeeMember Ravichandar, Harish
dc.contributor.department Computer Science
dc.date.accessioned 2021-06-10T16:48:44Z
dc.date.available 2021-06-10T16:48:44Z
dc.date.created 2020-05
dc.date.issued 2020-04-28
dc.date.submitted May 2020
dc.date.updated 2021-06-10T16:48:44Z
dc.description.abstract Learning from Demonstration (LfD) has become a ubiquitous and user-friendly technique for teaching a robot to perform a task (e.g., playing ping-pong) without the need for a traditional programming language (e.g., C++). As these systems are increasingly placed in the hands of everyday users, researchers face the reality that end-users are a heterogeneous population with varying levels of skill and experience. This heterogeneity violates the nearly universal assumption in LfD algorithms that demonstrations are near-optimal and uniform in how the task is accomplished. In this thesis, I present algorithms that tackle two specific types of heterogeneity: heterogeneous strategy and heterogeneous performance. First, I present Multi-Strategy Reward Distillation (MSRD), which tackles the problem of learning from users who have adopted heterogeneous strategies. MSRD extracts a shared task reward and per-demonstrator strategy rewards, which represent the task specification and each demonstrator's strategic preference, respectively. The extracted task reward achieves correlations of 0.998 and 0.943 with the ground-truth reward on two simulated robotic tasks, and we successfully deploy it on a real-robot table-tennis task. Second, I develop two algorithms that address the problem of learning from suboptimal demonstrations: SSRR and OP-AIRL. SSRR is a novel mechanism that regresses over noisy demonstrations to infer an idealized reward function, and OP-AIRL is a mechanism that learns a policy that more effectively teases out ambiguity in suboptimal demonstrations. Combining SSRR with OP-AIRL yields 688% and 254% improvements over the state-of-the-art on two simulated robot tasks.
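The MSRD decomposition described in the abstract — a shared task reward plus a per-demonstrator strategy reward, with regularization that pushes common structure into the task component — can be illustrated in a toy linear setting. This is a minimal sketch under assumptions of my own (linear reward models, synthetic labelled states, gradient descent); the thesis itself uses neural reward networks trained via inverse reinforcement learning, and names like `w_task` are illustrative, not from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic ground truth: a shared task reward plus small strategy offsets
# for two demonstrators (all linear in a 5-dimensional state feature).
d = 5
w_task_true = rng.normal(size=d)
w_strat_true = [0.3 * rng.normal(size=d) for _ in range(2)]

# Each demonstrator supplies states labelled with their combined reward.
X = [rng.normal(size=(200, d)) for _ in range(2)]
y = [X[i] @ (w_task_true + w_strat_true[i]) for i in range(2)]

# MSRD-style fit: one shared task-reward vector plus per-demonstrator
# strategy vectors. The L2 penalty acts only on the strategy vectors,
# so structure common to all demonstrators flows into the task reward.
w_task = np.zeros(d)
w_strat = [np.zeros(d) for _ in range(2)]
lam, lr = 0.1, 0.01
for _ in range(2000):
    for i in range(2):
        err = X[i] @ (w_task + w_strat[i]) - y[i]
        g = X[i].T @ err / len(y[i])          # gradient of the fit error
        w_task -= lr * g                       # shared component
        w_strat[i] -= lr * (g + lam * w_strat[i])  # penalized component

# The recovered task reward should correlate strongly with ground truth.
corr = np.corrcoef(w_task, w_task_true)[0, 1]
```

The design choice mirrored here is the asymmetry of the penalty: because only the strategy vectors are shrunk toward zero, the optimizer prefers to explain shared reward signal through `w_task`, which is what lets the task reward be reused across demonstrators with different strategies.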
dc.description.degree M.S.
dc.format.mimetype application/pdf
dc.identifier.uri http://hdl.handle.net/1853/64653
dc.language.iso en_US
dc.publisher Georgia Institute of Technology
dc.subject Learning from demonstration
dc.subject Robot learning
dc.subject Heterogeneous learning
dc.title Robot Learning from Heterogeneous Demonstration
dc.type Text
dc.type.genre Thesis
dspace.entity.type Publication
local.contributor.corporatename College of Computing
relation.isOrgUnitOfPublication c8892b3c-8db6-4b7b-a33a-1b67f7db2021
thesis.degree.level Masters
Original bundle: 1 file, Adobe Portable Document Format, 4.21 MB
License bundle: 1 file, Plain Text, 3.86 KB