Improving Model-Predictive Control with Value Function Approximation

Author(s)
Chintalapudi, Sahit
Associated Organization(s)
Organizational Unit
School of Computer Science
School established in 2007
Abstract
Existing Model Predictive Control (MPC) methods rely on finite-horizon trajectories sampled from the environment. Such methods are limited by the length of these samples, because the robot cannot plan for scenarios beyond the time horizon. Simply extending the time horizon of the sampled trajectories is not feasible, because a longer horizon requires more sampled trajectories in order to maintain controller performance. On robots such as the AutoRally platform, which operate in real time with limited computational power, increasing the number of sampled trajectories is computationally intractable. This work improves the long-term planning capabilities of autonomous systems by augmenting the cost estimate of each trajectory with a learned value of its terminal state. This learned value approximates the expected cost incurred from the terminal state onward under the car's current control policy, over an arbitrary time horizon, without requiring an increase in the number of samples. We show that this improves the lap times of the AutoRally platform.
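The idea of augmenting a finite-horizon trajectory cost with a value at the terminal state can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names (`rollout`, `augmented_cost`, `best_first_control`), the uniform control sampling, and the pick-the-cheapest-sample rule are all assumptions made for illustration, and `terminal_value` stands in for whatever learned approximator is actually used.

```python
import random
from typing import Callable, List, Sequence

State = float    # hypothetical 1-D state, for illustration only
Control = float  # hypothetical 1-D control


def rollout(x0: State, controls: Sequence[Control],
            step: Callable[[State, Control], State]) -> List[State]:
    """Simulate a finite-horizon trajectory; returns len(controls) + 1 states."""
    states = [x0]
    for u in controls:
        states.append(step(states[-1], u))
    return states


def augmented_cost(states: Sequence[State],
                   running_cost: Callable[[State], float],
                   terminal_value: Callable[[State], float]) -> float:
    """Running cost over the horizon plus a value estimate at the terminal state.

    terminal_value is assumed to approximate the expected cost incurred
    beyond the horizon, so the horizon itself does not need to grow.
    """
    horizon_cost = sum(running_cost(x) for x in states[:-1])
    return horizon_cost + terminal_value(states[-1])


def best_first_control(x0: State, num_samples: int, horizon: int,
                       step: Callable[[State, Control], State],
                       running_cost: Callable[[State], float],
                       terminal_value: Callable[[State], float],
                       rng: random.Random) -> Control:
    """Sample control sequences and return the first control of the cheapest one.

    Assumption: controls are drawn uniformly from [-1, 1]; a real controller
    would use its own sampling distribution and update rule.
    """
    best_cost, best_u = float("inf"), 0.0
    for _ in range(num_samples):
        controls = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        states = rollout(x0, controls, step)
        cost = augmented_cost(states, running_cost, terminal_value)
        if cost < best_cost:
            best_cost, best_u = cost, controls[0]
    return best_u
```

In this sketch the number of samples and the horizon stay fixed; only the terminal term changes what the planner can account for beyond the sampled horizon.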
Date
2019-12
Resource Type
Text
Resource Subtype
Undergraduate Thesis