Structured Statistical Estimation via Optimization

Thumbnail Image
Mcrae, Andrew Duncan
Davenport, Mark A.
Associated Organizations
Supplementary to
This thesis shows how we can exploit low-dimensional structure in high-dimensional statistics and machine learning problems via optimization. We show several settings where, with an appropriate choice of optimization algorithm, we can perform useful estimation with a complexity that scales not with the original problem dimension but with a much smaller intrinsic dimension. In the low-rank matrix completion and denoising problems, we can exploit low-rank structure to recover a large matrix from noisy observations of some or all of its entries. We prove state-of-the-art results for this problem in the case of Poisson noise and show that these results are minimax-optimal. Next, we study the problem of recovering a sparse vector from nonlinear measurements. We present a lifted matrix framework for the sparse phase retrieval and sparse PCA problems that includes a novel atomic norm regularizer. We prove that solving certain convex optimization problems in this framework yields estimators with near-optimal performance. Although we do not know how to compute these estimators efficiently and exactly, we derive a principled heuristic algorithm for sparse phase retrieval that matches existing state-of-the-art algorithms. Third, we show how we can exploit low-dimensional manifold structure in supervised learning. In a reproducing kernel Hilbert space framework, we show that smooth functions on a manifold can be estimated with a complexity scaling with the manifold dimension rather than a larger embedding space dimension. Finally, we study the interaction between high ambient dimension and a lower intrinsic dimension in the harmless interpolation phenomenon (where learned functions generalize well despite interpolating noisy data). We present a general framework for this phenomenon in linear and reproducing kernel Hilbert space settings, proving that it occurs in many situations that previous work has not covered.
Date Issued
Resource Type
Resource Subtype
Rights Statement
Rights URI