Organizational Unit:
Transdisciplinary Research Institute for Advancing Data Science
Publication Search Results
Now showing 1-10 of 11

Item: Lecture 5: Inference and Uncertainty Quantification for Noisy Matrix Completion (2019-09-05) Chen, Yuxin
Noisy matrix completion aims at estimating a low-rank matrix given only partial and corrupted entries. Despite substantial progress in designing efficient estimation algorithms, it remains largely unclear how to assess the uncertainty of the obtained estimates and how to perform statistical inference on the unknown matrix (e.g. constructing a valid and short confidence interval for an unseen entry). This talk takes a step towards inference and uncertainty quantification for noisy matrix completion. We develop a simple procedure to compensate for the bias of the widely used convex and nonconvex estimators. The resulting debiased estimators admit nearly precise non-asymptotic distributional characterizations, which in turn enable optimal construction of confidence intervals / regions for, say, the missing entries and the low-rank factors. Our inferential procedures do not rely on sample splitting, thus avoiding unnecessary loss of data efficiency. As a byproduct, we obtain a sharp characterization of the estimation accuracy of our debiased estimators, which, to the best of our knowledge, are the first tractable algorithms that provably achieve full statistical efficiency (including the pre-constant). The analysis herein is built upon the intimate link between convex and nonconvex optimization. This is joint work with Cong Ma, Yuling Yan, Yuejie Chi, and Jianqing Fan.
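The observation model in the abstract's first sentence is easy to simulate. The sketch below is not the debiasing procedure from the talk; it only sets up partial, noisy observations of a low-rank matrix and applies the standard spectral baseline, a rank-r truncated SVD of the inverse-propensity-weighted observations.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, p, sigma = 200, 1, 0.5, 0.01  # size, rank, sampling rate, noise level

# Ground-truth low-rank matrix M = U V^T
U = rng.standard_normal((n, r))
V = rng.standard_normal((n, r))
M = U @ V.T

# Observe each entry independently with probability p, plus Gaussian noise
mask = rng.random((n, n)) < p
Y = mask * (M + sigma * rng.standard_normal((n, n)))

# Spectral baseline: rank-r truncated SVD of the rescaled observations Y / p
u, s, vt = np.linalg.svd(Y / p)
M_hat = u[:, :r] * s[:r] @ vt[:r]

rel_err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
print(f"relative Frobenius error: {rel_err:.3f}")
```

Rescaling by 1/p makes the observed matrix an unbiased estimate of M entrywise; the truncated SVD then suppresses most of the sampling noise.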

Item: Lecture 4: Spectral Methods Meet Asymmetry: Two Recent Stories (2019-09-04) Chen, Yuxin
This talk is concerned with the interplay between asymmetry and spectral methods. Imagine that we have access to an asymmetrically perturbed low-rank data matrix. We attempt estimation of the low-rank matrix via eigendecomposition, an uncommon approach when dealing with non-symmetric matrices. We provide two recent stories to demonstrate the advantages and effectiveness of this approach. The first story is concerned with top-K ranking from pairwise comparisons, for which the spectral method enables unimprovable ranking accuracy. The second story is concerned with matrix denoising and spectral estimation, for which the eigendecomposition method significantly outperforms the (unadjusted) SVD-based approach and is fully adaptive to heteroscedasticity without the need for careful bias correction. The first part of this talk is based on joint work with Cong Ma, Kaizheng Wang, and Jianqing Fan; the second part is based on joint work with Chen Cheng and Jianqing Fan.
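The phenomenon behind the second story can be reproduced in a toy experiment (a hedged illustration, not the estimator from the talk): perturb a symmetric rank-1 signal with non-symmetrized i.i.d. Gaussian noise, then estimate the signal strength from the leading eigenvalue versus the leading singular value. The eigenvalue is nearly unbiased; the singular value carries an upward bias of roughly n / lambda that would otherwise require explicit correction.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
lam = 3.0 * np.sqrt(n)          # signal strength, above the sqrt(n) threshold

u = rng.standard_normal(n)
u /= np.linalg.norm(u)
M = lam * np.outer(u, u)        # symmetric rank-1 signal
H = rng.standard_normal((n, n)) # asymmetric noise: H != H^T
Y = M + H

# Estimate lam via the leading eigenvalue of the asymmetric observation
# (complex-safe via real parts) versus the leading singular value.
eigvals = np.linalg.eigvals(Y)
eig_est = np.real(eigvals[np.argmax(np.real(eigvals))])
sv_est = np.linalg.svd(Y, compute_uv=False)[0]

print(f"true {lam:.2f}  eigenvalue {eig_est:.2f}  singular value {sv_est:.2f}")
```

The key design point: taking the SVD implicitly symmetrizes the noise, which is where the bias comes from; keeping the asymmetric matrix and using eigendecomposition lets the noise partially cancel.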

Item: Lecture 3: Projected Power Method: An Efficient Algorithm for Joint Discrete Assignment (2019-09-03) Chen, Yuxin
Various applications involve assigning discrete label values to a collection of objects based on some pairwise noisy data. Due to the discrete (and hence nonconvex) structure of the problem, computing the optimal assignment (e.g. the maximum likelihood assignment) becomes intractable at first sight. This talk makes progress towards efficient computation by focusing on a concrete joint discrete alignment problem, namely, the problem of recovering n discrete variables given noisy observations of their modulo differences. We propose a low-complexity and model-free procedure, which operates in a lifted space by representing distinct label values in orthogonal directions, and which attempts to optimize quadratic functions over hypercubes. Starting with a first guess computed via a spectral method, the algorithm successively refines the iterates via projected power iterations. We prove that for a broad class of statistical models, the proposed projected power method makes no error, and hence converges to the maximum likelihood estimate, in a suitable regime. Numerical experiments have been carried out on both synthetic and real data to demonstrate the practicality of our algorithm. We expect this algorithmic framework to be effective for a broad range of discrete assignment problems. This is joint work with Emmanuel Candès.
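As a concrete instance of the lifted representation, the hypothetical sketch below recovers n labels in Z_m from noisy modulo differences: each label is encoded as a one-hot vector, pairwise cyclic-shift matrices encode the data, and each projected power iteration multiplies by the data and snaps every block back to the nearest one-hot vector. Two simplifications relative to the talk: initialization here is random rather than spectral, and recovery is only up to an unavoidable global shift.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, eps = 20, 3, 0.1          # variables, alphabet size, corruption rate

z = rng.integers(0, m, size=n)  # ground-truth labels in Z_m

# Noisy modulo differences: y[i, j] = z_i - z_j mod m, corrupted w.p. eps
y = (z[:, None] - z[None, :]) % m
noise = rng.random((n, n)) < eps
y[noise] = rng.integers(0, m, size=noise.sum())

def shift_matrix(s):
    """m x m cyclic shift: maps the one-hot vector e_b to e_{(b + s) mod m}."""
    return np.eye(m)[:, [(b + s) % m for b in range(m)]]

P = {(i, j): shift_matrix(y[i, j]) for i in range(n) for j in range(n) if i != j}

# Lifted iterate: one one-hot block per variable, initialized at random labels
X = np.eye(m)[rng.integers(0, m, size=n)]

for _ in range(10):  # projected power iterations
    votes = np.stack([sum(P[i, j] @ X[j] for j in range(n) if j != i)
                      for i in range(n)])
    X = np.eye(m)[np.argmax(votes, axis=1)]  # project blocks to one-hot vectors

z_hat = np.argmax(X, axis=1)
offsets = (z_hat - z) % m
print("recovered up to a global shift:", len(set(offsets)) == 1)
```

Each block of the matrix-vector product tallies votes from all other variables; the projection onto one-hot vectors is what keeps the iterates inside the discrete (lifted) constraint set.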

Item: Lecture 2: Random initialization and implicit regularization in nonconvex statistical estimation (2019-08-29) Chen, Yuxin
Recent years have seen a flurry of activity in designing provably efficient nonconvex procedures for solving statistical estimation / learning problems. Due to the highly nonconvex nature of the empirical loss, state-of-the-art procedures often require suitable initialization and proper regularization (e.g. trimming, regularized cost, projection) in order to guarantee fast convergence. For vanilla procedures such as gradient descent, however, prior theory is often either far from optimal or completely lacking in theoretical guarantees. This talk is concerned with a striking phenomenon arising in two nonconvex problems (i.e. phase retrieval and matrix completion): even in the absence of careful initialization, proper saddle escaping, and/or explicit regularization, gradient descent converges to the optimal solution within a logarithmic number of iterations, thus achieving near-optimal statistical and computational guarantees at once. All of this is achieved by exploiting the statistical models in analyzing optimization algorithms, via a leave-one-out approach that enables the decoupling of certain statistical dependency between the gradient descent iterates and the data. As a byproduct, for noisy matrix completion, we demonstrate that gradient descent achieves near-optimal entrywise error control. This is joint work with Cong Ma, Kaizheng Wang, Yuejie Chi, and Jianqing Fan.
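A toy version of the matrix completion half of this claim (a hedged sketch under the simplest setting, a small noiseless instance, not the talk's analysis): run vanilla gradient descent on the factorized least-squares objective from a random initialization, with no projection, trimming, or explicit regularization, and observe that it still converges to the true low-rank matrix.

```python
import numpy as np

rng = np.random.default_rng(3)
n, r, p = 40, 2, 0.6  # matrix size, rank, sampling rate

M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank-r truth
mask = rng.random((n, n)) < p                                  # observed entries

# Vanilla gradient descent on f(X, Y) = ||P_Omega(X Y^T - M)||_F^2 / (2p),
# starting from random X, Y (no regularization, no careful initialization).
X = rng.standard_normal((n, r))
Y = rng.standard_normal((n, r))
eta = 0.25 / np.linalg.norm(mask * M / p, 2)  # step size via spectral norm

for _ in range(2000):
    R = mask * (X @ Y.T - M) / p              # masked, rescaled residual
    X, Y = X - eta * R @ Y, Y - eta * R.T @ X # simultaneous update

rel_err = np.linalg.norm(X @ Y.T - M) / np.linalg.norm(M)
print(f"relative error: {rel_err:.2e}")
```

The implicit-regularization message is that nothing in this loop forces the iterates to stay incoherent or balanced; empirically (and per the talk's theory, under suitable conditions) plain gradient descent does so on its own.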

Item: Lecture 1: The power of nonconvex optimization in solving random quadratic systems of equations (2019-08-28) Chen, Yuxin
We consider the fundamental problem of solving random quadratic systems of equations in n variables, which spans many applications ranging from the century-old phase retrieval problem to various latent-variable models in machine learning. A growing body of recent work has demonstrated the effectiveness of convex relaxation, in particular semidefinite programming, for solving problems of this kind. However, the computational cost of such convex paradigms is often unsatisfactory, which limits applicability to large-dimensional data. This talk follows another route: by formulating the problem as a nonconvex program, we attempt to optimize the nonconvex objective directly. We demonstrate that for certain unstructured models of quadratic systems, nonconvex optimization algorithms return the correct solution in linear time, as soon as the ratio between the number of equations and unknowns exceeds a fixed numerical constant. We extend the theory to deal with noisy systems, and prove that our algorithms achieve a minimax-optimal statistical accuracy. Numerical evidence suggests that the computational cost of our algorithm is about four times that of solving a least-squares problem of the same size. This is joint work with Emmanuel Candès.
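A minimal numerical illustration of solving such a quadratic system with a nonconvex method (assuming real Gaussian measurements; this is plain gradient descent on the least-squares loss, not necessarily the exact algorithm from the talk):

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 10, 200  # unknowns, equations (oversampling ratio m/n = 20)

x_true = rng.standard_normal(n)
A = rng.standard_normal((m, n))
y = (A @ x_true) ** 2          # quadratic measurements y_i = (a_i^T x)^2

# Gradient descent on f(x) = (1/4m) sum_i ((a_i^T x)^2 - y_i)^2,
# initialized at a random vector of roughly the right norm.
lam = np.sqrt(y.mean())        # E[(a^T x)^2] = ||x||^2 for Gaussian a
x = rng.standard_normal(n)
x *= lam / np.linalg.norm(x)
eta = 0.1 / lam**2             # step size scaled by the signal energy

for _ in range(3000):
    Ax = A @ x
    x -= eta * A.T @ ((Ax**2 - y) * Ax) / m

# Success is only defined up to the unavoidable global sign ambiguity
err = min(np.linalg.norm(x - x_true), np.linalg.norm(x + x_true))
print(f"relative error: {err / np.linalg.norm(x_true):.2e}")
```

Each iteration costs two matrix-vector products with A, the same order of work as an iteration of a least-squares solver, which is the comparison the abstract's final timing remark refers to.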

Item: Visual Data Analytics: A Short Tutorial (2019-08-08) Chau, Duen Horng

Item: Lecture 5: Mathematics for Deep Neural Networks: Energy landscape and open problems (2019-03-18) Schmidt-Hieber, Johannes
To derive a theory for gradient descent methods, it is important to have some understanding of the energy landscape. In this lecture, an overview of existing results is given. The second part of the lecture is devoted to open challenges in the field: we describe important steps needed for the future development of the statistical theory of deep networks.

Item: Lecture 4: Mathematics for Deep Neural Networks: Statistical theory for deep ReLU networks (2019-03-15) Schmidt-Hieber, Johannes
We outline the theory underlying the recent bounds on the estimation risk of deep ReLU networks. In the lecture, we discuss specific properties of the ReLU activation function that relate to skip connections and the efficient approximation of polynomials. Based on this, we show how risk bounds can be obtained for sparsely connected networks.
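The "efficient approximation of polynomials" refers to constructions in the spirit of the sketch below (a standard textbook construction, assumed here for illustration rather than taken verbatim from the lecture): composing the ReLU-expressible hat function m times and forming a weighted sum approximates x^2 on [0, 1] with error at most 2^(-2m-2), with depth growing only linearly in m.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def hat(x):
    # Hat function g: [0,1] -> [0,1], expressible with one ReLU layer:
    # g(x) = 2 relu(x) - 4 relu(x - 1/2) + 2 relu(x - 1)
    return 2 * relu(x) - 4 * relu(x - 0.5) + 2 * relu(x - 1.0)

def square_approx(x, m):
    """Deep ReLU approximation of x^2 on [0,1]: x - sum_s g^(s)(x) / 4^s."""
    out = x.copy()
    g = x.copy()
    for s in range(1, m + 1):
        g = hat(g)          # s-fold composition of the hat function
        out -= g / 4**s
    return out

m = 5
x = np.linspace(0.0, 1.0, 1001)
max_err = np.max(np.abs(square_approx(x, m) - x**2))
print(f"max error with m={m}: {max_err:.2e}  (bound {2.0**(-2 * m - 2):.2e})")
```

Since products can be written as x * y = ((x + y)^2 - x^2 - y^2) / 2, this squaring gadget is the building block for approximating general polynomials, and ultimately smooth functions, with deep ReLU networks.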

Item: Lecture 3: Mathematics for Deep Neural Networks: Advantages of Additional Layers (2019-03-13) Schmidt-Hieber, Johannes
Why are deep networks better than shallow networks? We provide a survey of the existing ideas in the literature. In particular, we discuss localization of deep networks, functions that can be easily approximated by deep networks, and finally the Kolmogorov-Arnold representation theorem.
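A classical example of a function that deep networks represent far more efficiently than shallow ones (again a standard construction assumed for illustration, not necessarily the lecture's): composing a one-layer ReLU hat function with itself m times produces a sawtooth with 2^(m-1) peaks, i.e. 2^m linear pieces, using O(m) layers, whereas a single-hidden-layer ReLU network needs on the order of 2^m units to produce that many pieces.

```python
import numpy as np

def hat(x):
    # ReLU-expressible hat function on [0, 1] (one hidden layer, 3 units)
    r = np.maximum
    return 2 * r(x, 0.0) - 4 * r(x - 0.5, 0.0) + 2 * r(x - 1.0, 0.0)

m = 4
x = np.linspace(0.0, 1.0, 1025)   # dyadic grid, so peak values are hit exactly
f = x.copy()
for _ in range(m):
    f = hat(f)                    # m-fold composition: depth grows linearly in m

peaks = int(np.sum(np.isclose(f, 1.0)))
print(f"{m}-fold composition: {peaks} peaks, {2 * peaks} linear pieces")
```

The number of linear pieces doubles with every additional composition, which is the mechanism behind many depth-separation arguments.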

Item: Lecture 2: Mathematics for Deep Neural Networks: Theory for shallow networks (2019-03-08) Schmidt-Hieber, Johannes
We start with the universal approximation theorem and discuss several proof strategies that provide some insight into the functions that can be easily approximated by shallow networks. Based on this, a survey of approximation rates for shallow networks is given, and it is shown how these lead to estimation rates. In the lecture, we also discuss methods that fit shallow networks to data.