Evolutionary Development of Hierarchical Learning Structures
Abstract
Hierarchical reinforcement learning (RL) algorithms
can learn a policy faster than standard RL algorithms. However,
the applicability of hierarchical RL algorithms is limited by the
fact that the task decomposition has to be performed in advance
by the human designer. We propose a Lamarckian evolutionary
approach for automatic development of the learning structure in
hierarchical RL. The proposed method combines the MAXQ hierarchical
RL method and genetic programming (GP). In the MAXQ
framework, a subtask can optimize the policy independently of its
parent task's policy, which makes it possible to reuse learned policies
of the subtasks. In the proposed method, the MAXQ method
learns the policy based on the task hierarchies obtained by GP,
while the GP explores the appropriate hierarchies using the result
of the MAXQ method. To show the validity of the proposed method,
we have performed simulation experiments for a foraging task in
three different environmental settings. The results show a strong interconnection
between the obtained learning structures and the
given task environments. The main conclusion of the experiments
is that the GP can find a minimal strategy, i.e., a hierarchy that
minimizes the number of primitive subtasks that can be executed
for each type of situation. The experimental results for the most
challenging environment also show that the policies of the subtasks
can continue to improve, even after the structure of the hierarchy
has been evolutionarily stabilized, as an effect of the Lamarckian mechanism.
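The loop described in the abstract, GP proposing task hierarchies while a MAXQ-style learner refines the subtask policies and passes the learned values back to offspring, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation; all names (Hierarchy, evaluate_with_maxq, the foraging skill list, the fitness function) are hypothetical stand-ins.

```python
# Minimal sketch of a Lamarckian GP loop over task hierarchies, with a
# stand-in for MAXQ learning. Illustrative only; not the paper's code.
import random

class Hierarchy:
    """A candidate task decomposition: a root task with a set of subtasks.

    Each subtask keeps its own learned policy (here a toy value table), so
    a Lamarckian step can pass learned values on to offspring."""
    def __init__(self, subtasks):
        self.subtasks = list(subtasks)              # e.g. ["goto", "pickup"]
        self.policies = {s: {} for s in self.subtasks}

def evaluate_with_maxq(h, episodes=20):
    """Stand-in for MAXQ learning: 'refine' each subtask policy and return a
    fitness that rewards covering the needed skills with few subtasks,
    mimicking the 'minimal strategy' finding."""
    needed = {"goto", "pickup", "drop"}
    covered = needed & set(h.subtasks)
    for s in h.subtasks:                            # fake learning updates
        for ep in range(episodes):
            h.policies[s][ep] = h.policies[s].get(ep, 0.0) + random.random()
    return len(covered) - 0.1 * len(h.subtasks)

def crossover(a, b):
    """Exchange subtasks and, Lamarckian-style, copy the parents' learned
    policies for every subtask the child inherits."""
    cut = max(1, len(a.subtasks) // 2)
    child = Hierarchy(dict.fromkeys(a.subtasks[:cut] + b.subtasks[cut:]))
    for s in child.subtasks:
        src = a if s in a.policies else b
        child.policies[s] = dict(src.policies.get(s, {}))
    return child

def evolve(generations=10, pop_size=8):
    skills = ["goto", "pickup", "drop", "wander", "wait"]
    pop = [Hierarchy(random.sample(skills, k=random.randint(2, 5)))
           for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=evaluate_with_maxq, reverse=True)
        elite = scored[: pop_size // 2]
        pop = elite + [crossover(random.choice(elite), random.choice(elite))
                       for _ in range(pop_size - len(elite))]
    return max(pop, key=evaluate_with_maxq)

if __name__ == "__main__":
    print("best hierarchy:", evolve().subtasks)
```

The key Lamarckian detail is in crossover: offspring inherit not only the structure of the parents' subtasks but also their learned value tables, so policy improvement can continue even after the hierarchy itself stops changing, as reported for the most challenging environment.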
Date
2007-04
Resource Type
Text
Resource Subtype
Article