Robots learning actions and goals from everyday people

Thumbnail Image
Akgun, Baris
Thomaz, Andrea L.
Associated Organization(s)
Supplementary to
Robots are destined to move beyond the caged factory floors towards domains where they will be interacting closely with humans. They will encounter highly varied environments, scenarios and user demands. As a result, programming robots after deployment will be an important requirement. To address this challenge, the field of Learning from Demonstration (LfD) emerged with the vision of programming robots through demonstrations of the desired behavior instead of explicit programming. The field of LfD within robotics has been around for more than 30 years and is still an actively researched field. However, very little research is done on the implications of having a non-robotics expert as a teacher. This thesis aims to bridge this gap by developing learning from demonstration algorithms and interaction paradigms that allow non-expert people to teach robots new skills. The first step of the thesis was to evaluate how non-expert teachers provide demonstrations to robots. Keyframe demonstrations are introduced to the field of LfD to help people teach skills to robots and compared with the traditional trajectory demonstrations. The utility of keyframes are validated by a series of experiments with more than 80 participants. Based on the experiments, a hybrid of trajectory and keyframe demonstrations are proposed to take advantage of both and a method was developed to learn from trajectories, keyframes and hybrid demonstrations in a unified way. A key insight from these user experiments was that teachers are goal oriented. They concentrated on achieving the goal of the demonstrated skills rather than providing good quality demonstrations. Based on this observation, this thesis introduces a method that can learn actions and goals from the same set of demonstrations. The action models are used to execute the skill and goal models to monitor this execution. A user study with eight participants and two skills showed that successful goal models can be learned from non- expert teacher data even if the resulting action models are not as successful. Following these results, this thesis further develops a self-improvement algorithm that uses the goal monitoring output to improve the action models, without further user input. This approach is validated with an expert user and two skills. Finally, this thesis builds an interactive LfD system that incorporates both goal learning and self-improvement and evaluates it with 12 naive users and three skills. The results suggests that teacher feedback during experiments increases skill execution and monitoring success. Moreover, non-expert data can be used as a seed to self-improvement to fix unsuccessful action models.
Date Issued
Resource Type
Resource Subtype
Rights Statement
Rights URI