Title:
Towards Understanding First Order Algorithms for Nonconvex Optimization in Machine Learning
dc.contributor.author | Zhao, Tuo | |
dc.contributor.corporatename | Georgia Institute of Technology. Algorithms, Randomness and Complexity Center | en_US |
dc.contributor.corporatename | Georgia Institute of Technology. School of Industrial and Systems Engineering | en_US |
dc.date.accessioned | 2019-02-15T18:10:10Z | |
dc.date.available | 2019-02-15T18:10:10Z | |
dc.date.issued | 2019-02-11 | |
dc.description | Presented on February 11, 2019 at 11:00 a.m. as part of the ARC12 Distinguished Lecture in the Klaus Advanced Computing Building, Room 1116. | en_US |
dc.description | Tuo Zhao is an assistant professor in the H. Milton Stewart School of Industrial and Systems Engineering and the School of Computational Science and Engineering at Georgia Tech. His current research focuses on developing a new generation of optimization algorithms with statistical and computational guarantees, as well as user-friendly open-source software for machine learning and scientific computing. | en_US |
dc.description | Runtime: 25:25 minutes | en_US |
dc.description.abstract | Stochastic Gradient Descent-type (SGD) algorithms have been widely applied to many non-convex optimization problems in machine learning, e.g., training deep neural networks, variational Bayesian inference, and collaborative filtering. Due to current technical limitations, however, establishing convergence properties of SGD for these highly complicated practical non-convex problems is generally infeasible. Therefore, we propose to analyze the behavior of SGD-type algorithms through two simpler but non-trivial non-convex problems: (1) streaming Principal Component Analysis and (2) training non-overlapping two-layer convolutional neural networks. Specifically, we prove that for both examples, SGD attains a sub-linear rate of convergence to the global optimum with high probability. Our theory not only helps us better understand SGD, but also provides new insights into more complicated non-convex optimization problems in machine learning. | en_US |
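The abstract's first example, streaming PCA, is classically solved by Oja's rule, which is exactly an SGD-type update on the non-convex objective of maximizing the Rayleigh quotient over the unit sphere. The sketch below is not from the talk itself; it is a minimal illustration, with an assumed toy covariance matrix and step-size schedule, of how one SGD pass over streaming samples recovers the top eigenvector.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10

# Assumed toy setup: a covariance with one dominant eigenvector u
# (eigenvalue 5) against an isotropic background (eigenvalue 1).
u = rng.standard_normal(d)
u /= np.linalg.norm(u)
cov = 4.0 * np.outer(u, u) + np.eye(d)
L = np.linalg.cholesky(cov)  # to draw samples x with E[x x^T] = cov

# Oja's rule: an SGD update for streaming PCA.
w = rng.standard_normal(d)
w /= np.linalg.norm(w)
for t in range(1, 20001):
    x = L @ rng.standard_normal(d)  # one streaming sample
    eta = 1.0 / (100 + t)           # decaying step size (assumed schedule)
    w += eta * x * (x @ w)          # stochastic gradient step
    w /= np.linalg.norm(w)          # project back onto the unit sphere

# |<u, w>| approaches 1 as w aligns with the top eigenvector.
alignment = abs(u @ w)
```

Despite the non-convexity, the iterates escape the saddle points associated with non-leading eigenvectors and converge to the global optimum, matching the high-probability sub-linear convergence behavior the abstract describes.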
dc.format.extent | 25:25 minutes | |
dc.identifier.uri | http://hdl.handle.net/1853/60905 | |
dc.language.iso | en_US | en_US |
dc.relation.ispartofseries | Algorithms and Randomness Center (ARC) Distinguished Lecture | |
dc.subject | Non-convex optimization | en_US |
dc.subject | Stochastic gradient descent-type (SGD) | en_US |
dc.title | Towards Understanding First Order Algorithms for Nonconvex Optimization in Machine Learning | en_US |
dc.type | Moving Image | |
dc.type.genre | Lecture | |
dspace.entity.type | Publication | |
local.contributor.author | Zhao, Tuo | |
local.contributor.corporatename | Algorithms and Randomness Center | |
local.contributor.corporatename | College of Computing | |
local.relation.ispartofseries | ARC Colloquium | |
relation.isAuthorOfPublication | f1ca0ec4-da94-4fab-8a2f-3e6ae1f9b90b | |
relation.isOrgUnitOfPublication | b53238c2-abff-4a83-89ff-3e7b4e7cba3d | |
relation.isOrgUnitOfPublication | c8892b3c-8db6-4b7b-a33a-1b67f7db2021 | |
relation.isSeriesOfPublication | c933e0bc-0cb1-4791-abb4-ed23c5b3be7e |
Files
Original bundle
- Name:
- zhao.mp4
- Size:
- 204.25 MB
- Format:
- MP4 Video file
- Description:
- Download video
- Name:
- zhao_videostream.html
- Size:
- 1.01 KB
- Format:
- Hypertext Markup Language
- Description:
- Streaming video
- Name:
- transcript.txt
- Size:
- 19.71 KB
- Format:
- Plain Text
- Description:
- Transcription
- Name:
- thumbnail.jpg
- Size:
- 45.14 KB
- Format:
- Joint Photographic Experts Group/JPEG File Interchange Format (JFIF)
- Description:
- Thumbnail
License bundle
- Name:
- license.txt
- Size:
- 3.13 KB
- Description:
- Item-specific license agreed upon to submission