Scalable, Efficient, and Fair Algorithms for Structured Convex Optimization Problems

Thumbnail Image
Ghadiri, Mehrdad
Vempala, Santosh S.
Associated Organization(s)
Organizational Unit
Organizational Unit
Supplementary to
The growth of machine learning and data science has necessitated the development of provably fast and scalable algorithms that incorporate ethical requirements. In this thesis, we present algorithms for fundamental optimization algorithms with theoretical guarantees on approximation quality and running time. We analyze the bit complexity and stability of efficient algorithms for problems including linear regression, $p$-norm regression, and linear programming by showing that a common subroutine, inverse maintenance, is backward stable and that iterative approaches for solving constrained weighted regression problems can be carried out with bounded-error pre-conditioners. We also present conjectures regarding the running time of computing symmetric factorizations for Hankel matrices that imply faster-than-matrix-multiplication time algorithms for solving sparse poly-conditioned linear programs. We present the first subquadratic algorithm for solving the Kronecker regression problem, which improves the running time of all steps of the alternating least squares algorithm for the Tucker decomposition of tensors. In addition, we introduce the Tucker packing problem for computing an approximately optimal core shape for the Tucker decomposition problem. We prove this problem is NP-hard and provide polynomial-time approximation schemes for it. Finally, we show that the popular $k$-means clustering algorithm (Lloyd's heuristic) can result in outcomes that are unfavorable to subgroups of data. We introduce the socially fair $k$-means problem for which we provide a very efficient and practical heuristic. For the more general problem of $(\ell_p,k)$-clustering problem, we provide bicriteria constant-factor approximation algorithms. Many of our algorithms improve the state-of-the-art in practice.
Date Issued
Resource Type
Resource Subtype
Rights Statement
Rights URI