Title
Policy Shaping: Integrating Human Feedback with Reinforcement Learning

Author(s)
Griffith, Shane
Subramanian, Kaushik
Scholz, Jonathan
Isbell, Charles L.
Thomaz, Andrea L.
Abstract
A long-term goal of Interactive Reinforcement Learning is to incorporate non-expert human feedback to solve complex tasks. Some state-of-the-art methods have approached this problem by mapping human information to rewards and values and iterating over them to compute better control policies. In this paper we argue for an alternate, more effective characterization of human feedback: Policy Shaping. We introduce Advise, a Bayesian approach that attempts to maximize the information gained from human feedback by utilizing it as direct policy labels. We compare Advise to state-of-the-art approaches and show that it can outperform them and is robust to infrequent and inconsistent human feedback.
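The abstract's idea of treating feedback as direct policy labels can be made concrete with a short sketch. The Python below is an illustrative, hedged reconstruction, not the authors' implementation: the class name AdviseSketch, the tabular softmax base policy, the feedback-consistency parameter C, and the net-label count delta are assumptions layered on the abstract's description of Bayesian policy shaping.

    import math
    import random
    from collections import defaultdict

    class AdviseSketch:
        """Sketch of an Advise-style Bayesian policy-shaping agent.

        Assumptions (not verbatim from the paper): a tabular Q-learner,
        a constant feedback-consistency estimate C, and a softmax over
        Q-values as the agent's base policy.
        """

        def __init__(self, actions, consistency=0.8, temperature=1.0):
            self.actions = actions
            self.C = consistency          # assumed prob. that feedback is correct
            self.temp = temperature
            self.q = defaultdict(float)   # Q-values keyed by (state, action)
            self.delta = defaultdict(int) # net (#positive - #negative) labels

        def give_feedback(self, state, action, positive):
            # Feedback is kept as a direct label on the policy,
            # not converted into a reward or value.
            self.delta[(state, action)] += 1 if positive else -1

        def feedback_prob(self, state, action):
            # Assumed estimate of P(action is optimal | labels):
            # C^d / (C^d + (1-C)^d), with d the net label count.
            # d = 0 gives 0.5, i.e. no opinion from the human.
            d = self.delta[(state, action)]
            return self.C ** d / (self.C ** d + (1.0 - self.C) ** d)

        def policy(self, state):
            # Combine the RL policy and the feedback policy by
            # multiplying the two probabilities and renormalizing.
            exps = [math.exp(self.q[(state, a)] / self.temp) for a in self.actions]
            z = sum(exps)
            combined = [(p / z) * self.feedback_prob(state, a)
                        for p, a in zip(exps, self.actions)]
            total = sum(combined)
            return [c / total for c in combined]

        def act(self, state):
            return random.choices(self.actions, weights=self.policy(state), k=1)[0]

Multiplying the two distributions and renormalizing lets abundant, consistent labels dominate action selection in states the human has commented on, while state-action pairs with no feedback (delta = 0, a uniform 0.5 multiplier) fall back to the learner's own policy, which is one plausible reading of why the approach tolerates infrequent and inconsistent feedback.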
Date Issued
2013
Resource Type
Text
Resource Subtype
Proceedings