Title:
Towards Understanding First Order Algorithms for Nonconvex Optimization in Machine Learning

Author(s)
Zhao, Tuo
Abstract
Stochastic Gradient Descent-type (SGD) algorithms have been widely applied to many non-convex optimization problems in machine learning, e.g., training deep neural networks, variational Bayesian inference, and collaborative filtering. Given current technical limitations, however, establishing convergence properties of SGD for these highly complicated practical non-convex problems is generally infeasible. Therefore, we propose to analyze the behavior of SGD-type algorithms through two simpler but non-trivial non-convex problems – (1) Streaming Principal Component Analysis and (2) Training Non-overlapping Two-layer Convolutional Neural Networks. Specifically, we prove that for both examples, SGD attains a sub-linear rate of convergence to the global optimum with high probability. Our theory not only helps us better understand SGD, but also provides new insights into more complicated non-convex optimization problems in machine learning.
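The first example problem named in the abstract, streaming PCA, illustrates what an SGD-type update looks like on a non-convex objective. The sketch below is a generic Oja-style stochastic gradient iteration for estimating the top principal component from a data stream; it is not the specific algorithm, step-size schedule, or analysis presented in the lecture, and the function name `streaming_pca_sgd` and the constant step size `eta` are illustrative assumptions.

```python
import numpy as np

def streaming_pca_sgd(sample_stream, dim, eta=0.01):
    """Oja-style SGD sketch for streaming PCA (top principal component).

    sample_stream: iterable of d-dimensional samples x_t
    dim: ambient dimension d
    eta: step size (held constant here for simplicity; the lecture's
         analysis may use a different schedule)
    """
    rng = np.random.default_rng(0)
    w = rng.normal(size=dim)
    w /= np.linalg.norm(w)            # random unit-norm initialization
    for x in sample_stream:
        w += eta * x * (x @ w)        # stochastic gradient step on the Rayleigh quotient
        w /= np.linalg.norm(w)        # project back onto the unit sphere
    return w

# Usage example: recover the leading eigenvector of a synthetic covariance.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    d = 20
    cov = np.diag(np.linspace(1.0, 0.1, d))   # true top eigenvector is e_1
    samples = rng.multivariate_normal(np.zeros(d), cov, size=5000)
    w_hat = streaming_pca_sgd(samples, d, eta=0.05)
    print("alignment with true top eigenvector:", abs(w_hat[0]))
```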
Date Issued
2019-02-11
Extent
25:25 minutes
Resource Type
Moving Image
Resource Subtype
Lecture