Title:
Co-evolution of shaping rewards and meta-parameters in reinforcement learning

dc.contributor.author Elfwing, Stefan
dc.contributor.author Uchibe, Eiji
dc.contributor.author Doya, Kenji
dc.contributor.author Christensen, Henrik I.
dc.contributor.corporatename Georgia Institute of Technology. College of Computing
dc.contributor.corporatename Okinawa Institute of Science and Technology. Neural Computation Unit
dc.contributor.corporatename Georgia Institute of Technology. Center for Robotics and Intelligent Machines
dc.contributor.corporatename Kungl. Tekniska Högskolan. Centrum för Autonoma System
dc.date.accessioned 2011-03-22T19:44:24Z
dc.date.available 2011-03-22T19:44:24Z
dc.date.issued 2008-12
dc.description Digital Object Identifier: 10.1177/1059712308092835 en_US
dc.description.abstract In this article, we explore an evolutionary approach to the optimization of potential-based shaping rewards and meta-parameters in reinforcement learning. Shaping rewards are a frequently used approach to increasing the learning performance of reinforcement learning, with regard to both initial performance and convergence speed. Shaping rewards provide additional knowledge to the agent in the form of richer reward signals, which guide learning to high-rewarding states. Reinforcement learning depends critically on a few meta-parameters that modulate the learning updates or the exploration of the environment, such as the learning rate α, the discount factor of future rewards γ, and the temperature τ that controls the trade-off between exploration and exploitation in softmax action selection. We validate the proposed approach in simulation using the mountain-car task. We also transfer shaping rewards and meta-parameters, evolutionarily obtained in simulation, to hardware, using a robotic foraging task. en_US
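The two ingredients named in the abstract can be sketched briefly. A minimal illustration, not taken from the article itself: the potential-based shaping term of Ng et al., F(s, s') = γΦ(s') − Φ(s) for a potential function Φ, and softmax (Boltzmann) action selection with temperature τ. The function names and the example potential are hypothetical, chosen only for illustration.

```python
import math
import random

def shaping_reward(phi, s, s_next, gamma):
    """Potential-based shaping term F(s, s') = gamma * phi(s_next) - phi(s).

    Added to the environment reward, this guides learning without
    changing the optimal policy (Ng et al.'s potential-based form).
    """
    return gamma * phi(s_next) - phi(s)

def softmax_action(q_values, tau):
    """Softmax action selection with temperature tau.

    High tau -> near-uniform exploration; low tau -> near-greedy
    exploitation. Returns (chosen action index, action probabilities).
    """
    m = max(q_values)  # subtract max for numerical stability
    exps = [math.exp((q - m) / tau) for q in q_values]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = random.random()
    acc = 0.0
    for a, p in enumerate(probs):
        acc += p
        if r < acc:
            return a, probs
    return len(probs) - 1, probs
```

In the article's setting, evolution searches over the parameters of Φ together with the meta-parameters (α, γ, τ), while the inner loop remains ordinary reinforcement learning.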
dc.identifier.citation Elfwing, S., Uchibe, E., Doya, K., and Christensen, H. I. Co-evolution of shaping rewards and meta-parameters in reinforcement learning. Adaptive Behavior 16, 8 (Dec 2008), 400-412. en_US
dc.identifier.doi 10.1177/1059712308092835
dc.identifier.issn 1059-7123
dc.identifier.uri http://hdl.handle.net/1853/38251
dc.language.iso en_US en_US
dc.publisher Georgia Institute of Technology en_US
dc.publisher.original Sage
dc.publisher.original International Society for Adaptive Behavior
dc.subject Shaping rewards en_US
dc.subject Reinforcement learning en_US
dc.title Co-evolution of shaping rewards and meta-parameters in reinforcement learning en_US
dc.type Text
dc.type.genre Article
dspace.entity.type Publication
local.contributor.author Christensen, Henrik I.
local.contributor.corporatename Institute for Robotics and Intelligent Machines (IRIM)
relation.isAuthorOfPublication afdc727f-2705-4744-945f-e7d414f2212b
relation.isOrgUnitOfPublication 66259949-abfd-45c2-9dcc-5a6f2c013bcf
Files
Original bundle
Name:
parham_fuchs_OR11.pdf
Size:
358.92 KB
Format:
Adobe Portable Document Format
Description:
License bundle
Name:
license.txt
Size:
1.76 KB
Format:
Description:
Item-specific license agreed upon to submission