Theory and applications of first-order methods for convex optimization with function constraints

Zhou, Zhiqiang
Romeijn, Edwin
This dissertation focuses on the development of efficient first-order methods for function constrained convex optimization and their applications in several areas, including healthcare, finance and machine learning. The thesis consists of three major studies. The first part of the thesis considers the problem of minimizing an expectation function over a closed convex set, coupled with a functional or expectation constraint on either decision variables or problem parameters. We first present a new stochastic approximation (SA) type algorithm, namely the cooperative SA (CSA), to handle problems with the constraint on decision variables. We show that this algorithm exhibits the optimal ${\cal O}(1/\epsilon^2)$ rate of convergence, in terms of both optimality gap and constraint violation, when the objective and constraint functions are generally convex, where $\epsilon$ denotes the optimality gap and infeasibility. Moreover, we show that this rate of convergence can be improved to ${\cal O}(1/\epsilon)$ if the objective and constraint functions are strongly convex. We then present a variant of CSA, namely the cooperative stochastic parameter approximation (CSPA) algorithm, to deal with the situation when the constraint is defined over problem parameters, and show that it exhibits an optimal rate of convergence similar to that of CSA. It is worth noting that CSA and CSPA are primal methods that require neither iterations in the dual space nor an estimate of the size of the dual variables. To the best of our knowledge, this is the first time that such optimal SA methods for solving functional or expectation constrained stochastic optimization have been presented in the literature. In addition, we apply the CSA and CSPA methods to an asset allocation problem and a combined classification and metric learning problem, respectively.
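The abstract does not spell out the CSA iteration, so the following is only a minimal sketch of a CSA-style primal scheme under stated assumptions: at each iteration the method checks the constraint value against a tolerance `eta`, takes a projected stochastic gradient step on the objective when the constraint is nearly satisfied and a step on the constraint otherwise, and returns the average of the iterates at which an objective step was taken. The toy problem, the constant step size `gamma`, the tolerance `eta`, and all function names are hypothetical choices for illustration, not the dissertation's implementation.

```python
import numpy as np


def csa_sketch(grad_f, g, grad_g, project, x0, n_iters, gamma, eta):
    """CSA-style sketch: switch between objective and constraint steps.

    grad_f  : returns a (possibly noisy) subgradient of the objective
    g       : returns the constraint value; feasibility means g(x) <= 0
    grad_g  : returns a subgradient of the constraint
    project : Euclidean projection onto the closed convex set X
    gamma, eta : constant step size and tolerance (illustrative values)
    """
    x = np.asarray(x0, dtype=float)
    x_sum = np.zeros_like(x)
    n_good = 0
    for _ in range(n_iters):
        if g(x) <= eta:
            # Constraint nearly satisfied: record the iterate and
            # move along the objective's stochastic subgradient.
            x_sum += x
            n_good += 1
            direction = grad_f(x)
        else:
            # Constraint violated beyond tolerance: reduce the violation.
            direction = grad_g(x)
        x = project(x - gamma * direction)
    if n_good == 0:
        raise ValueError("no iterate satisfied the relaxed constraint")
    # Output the average over iterates where the objective step was taken.
    return x_sum / n_good


# Hypothetical toy problem: minimize E[0.5 * ||x - c - noise||^2]
# over the box [-2, 2]^2 subject to x_1 + x_2 - 1 <= 0.
rng = np.random.default_rng(0)
c = np.array([1.0, 1.0])
a = np.array([1.0, 1.0])


def noisy_grad_f(x):
    return x - c + 0.1 * rng.standard_normal(2)


def g(x):
    return float(a @ x - 1.0)


def grad_g(x):
    return a


def project_box(x):
    return np.clip(x, -2.0, 2.0)


x_bar = csa_sketch(noisy_grad_f, g, grad_g, project_box,
                   x0=np.zeros(2), n_iters=5000, gamma=0.01, eta=0.02)
print(x_bar)
```

For this toy problem the constrained minimizer is the projection of `c` onto the half-space, namely `(0.5, 0.5)`, and the returned average should land close to it. No dual variable is maintained anywhere in the loop, which is the sense in which such a scheme is purely primal.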
The second part of the thesis is devoted to conditional gradient methods, which have recently attracted much attention in both the machine learning and optimization communities. These simple methods can guarantee the generation of sparse solutions. In addition, because they do not require the computation of full gradients, they can handle huge-scale problems, sometimes even with an exponentially increasing number of decision variables. This study aims to significantly expand the application areas of these methods by presenting new conditional gradient methods for solving convex optimization problems with general affine and nonlinear constraints. More specifically, we first present a new constraint extrapolated conditional gradient (CoexCG) method that can achieve an ${\cal O}(1/\epsilon^2)$ iteration complexity for both smooth and structured nonsmooth function constrained convex optimization. We further develop novel variants of CoexCG, namely constraint extrapolated and dual regularized conditional gradient (CoexDurCG) methods, that can achieve similar iteration complexity to CoexCG but allow adaptive selection of algorithmic parameters. We illustrate the effectiveness of these methods for solving an important class of radiation therapy treatment planning problems arising in the healthcare industry.
In the third part of the thesis, we extend the convex function constrained optimization to the multi-stage setting, i.e., multi-stage stochastic optimization problems with convex objectives and conic constraints at each stage. We present a new stochastic first-order method, namely the dynamic stochastic approximation (DSA) algorithm, for solving these types of stochastic optimization problems. We show that DSA can achieve an optimal ${\cal O}(1/\epsilon^4)$ rate of convergence in terms of the total number of required scenarios when applied to a three-stage stochastic optimization problem. We further show that this rate of convergence can be improved to ${\cal O}(1/\epsilon^2)$ when the objective function is strongly convex. We also discuss variants of DSA for solving more general multi-stage stochastic optimization problems with the number of stages $T > 3$. The developed DSA algorithms only need to go through the scenario tree once in order to compute an $\epsilon$-solution of the multi-stage stochastic optimization problem. As a result, the memory required by DSA only grows linearly with respect to the number of stages. To the best of our knowledge, this is the first time that stochastic approximation type methods have been generalized for multi-stage stochastic optimization with $T \geq 3$. We apply the DSA method to a class of multi-stage asset allocation problems and demonstrate its potential advantages over existing methods, especially when the planning horizon $T$ is relatively short but the number of assets is large.
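As background for the second part, the sparsity property of conditional gradient methods can be illustrated with the classical Frank-Wolfe iteration. The sketch below is not CoexCG or CoexDurCG and handles no function constraints; it assumes a least-squares objective over the probability simplex, computes the full gradient for simplicity, and uses the standard step size $2/(k+2)$. The problem data and function names are hypothetical. Each iteration adds at most one vertex of the simplex to the iterate, so the number of nonzero entries grows by at most one per step.

```python
import numpy as np


def frank_wolfe_simplex(A, b, n_iters):
    """Classical conditional gradient for min 0.5 * ||A x - b||^2 over the simplex."""
    n = A.shape[1]
    x = np.zeros(n)
    x[0] = 1.0  # start at a vertex of the probability simplex
    for k in range(n_iters):
        grad = A.T @ (A @ x - b)
        # Linear minimization oracle over the simplex: the vertex e_i
        # with the smallest gradient coordinate.
        i = int(np.argmin(grad))
        s = np.zeros(n)
        s[i] = 1.0
        step = 2.0 / (k + 2.0)
        # Convex combination keeps the iterate inside the simplex.
        x = (1.0 - step) * x + step * s
    return x


# Hypothetical random instance with 50 variables and 10 iterations.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
b = rng.standard_normal(20)
n_steps = 10
x_fw = frank_wolfe_simplex(A, b, n_steps)
print(np.count_nonzero(x_fw), "nonzero entries out of", x_fw.size)
```

After `n_steps` iterations the iterate is a convex combination of at most `n_steps + 1` vertices, regardless of the dimension, which is the mechanism behind the sparse solutions mentioned in the abstract.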