Maximum Entropy Sequential Quadratic Programming
Author(s)
Lehmann, Peter
Abstract
Trajectory optimization (TO) plays a critical role across a broad spectrum of scientific and engineering fields, including robotics, energy and power systems, economics, and biomechanics. Through the lens of optimal control, TO often involves solving constrained optimization problems with a non-convex objective function, nonlinear state and actuation constraints, and nonlinear dynamics. A widely used technique for solving such problems is Sequential Quadratic Programming (SQP). This method iteratively solves a series of quadratic subproblems, each built from a quadratic approximation of the cost and linear approximations of the constraints around a nominal trajectory. However, since each subproblem captures only local information, SQP is prone to converging to poor local minima. To overcome this limitation, this thesis proposes Maximum Entropy Sequential Quadratic Programming (MESQP). The proposed method leverages a stochastic policy and introduces an entropy-based regularization term into the objective function. This regularization encourages exploration during optimization and allows MESQP to be formulated under both unimodal and multimodal policy representations. Embedding this stochastic policy into the SQP framework yields an algorithm that alternates optimization and sampling steps, enabling it to escape local minima. To evaluate the method's efficacy, the framework is compared experimentally with standard SQP and Maximum Entropy Differential Dynamic Programming (MEDDP) across a range of TO tasks.
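For orientation, the entropy-regularized objective the abstract refers to can be sketched in a generic form (an illustrative formulation only; the stage costs \ell_t, dynamics f, constraints g_t, policy \pi, and temperature \lambda are notational assumptions for exposition and are not taken from the thesis):

\min_{\pi}\;\mathbb{E}_{u_{0:T-1}\sim\pi}\Big[\sum_{t=0}^{T-1}\ell_t(x_t,u_t)+\ell_T(x_T)\Big]\;-\;\lambda\,\mathcal{H}(\pi)
\quad\text{s.t.}\quad x_{t+1}=f(x_t,u_t),\qquad g_t(x_t,u_t)\le 0,

where \mathcal{H}(\pi) denotes the entropy of the stochastic policy and \lambda>0 trades off cost minimization against exploration. Under such a formulation, each MESQP iteration would, as the abstract describes, alternate between solving an SQP-style subproblem for the regularized objective and resampling trajectories from the updated policy.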
Date
2025-04-30
Resource Type
Text
Resource Subtype
Thesis