Series: ARC Colloquium
Series Type: Event Series
Publication Search Results
Now showing 1 - 10 of 88
-
Optimal Sensory Coding Theories for Neural Systems Under Biophysical Constraints (Georgia Institute of Technology, 2017-03-15)
Rozell, Christopher J. ; Georgia Institute of Technology. Algorithms & Randomness Center ; Georgia Institute of Technology. Neural Engineering Center
The natural stimuli that biological vision must use to understand the world are extremely complex. Recent advances in machine learning have shown that low-dimensional geometric models (e.g., sparsity, manifolds) can capture much of the structure in complex natural images. I will describe our work building efficient neural coding models that optimally exploit this structure. These results incorporate the constraints of biophysical systems and the physical world by drawing on mathematical tools such as dynamical systems, optimization, unsupervised learning, randomized dimensionality reduction, and manifold learning. They show that incorporating natural constraints can lead to theoretical models that account for a wide range of observed phenomena, including complex response properties of individual neurons, architectural features of the network (e.g., the makeup of different cell types), and reported perceptual results from human psychophysical experiments.
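To make the sparsity idea concrete, here is a minimal sketch of sparse coding for a signal in a learned dictionary, using generic ISTA iterations rather than the speaker's specific neural circuit models; all function names and parameters are illustrative.

```python
import numpy as np

def ista_sparse_code(D, x, lam=0.1, step=None, n_iter=200):
    """Sparse coding of a signal x in dictionary D via ISTA.

    Solves min_a 0.5*||x - D @ a||^2 + lam*||a||_1.
    """
    if step is None:
        # Step size from the spectral norm of D (Lipschitz constant of the gradient).
        step = 1.0 / np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)          # gradient of the quadratic term
        z = a - step * grad               # gradient step
        a = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return a

# Toy usage: random unit-norm dictionary, signal that is truly 3-sparse.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)
a_true = np.zeros(256)
a_true[[3, 80, 200]] = [1.0, -0.5, 2.0]
x = D @ a_true
a_hat = ista_sparse_code(D, x, lam=0.05)
print("nonzeros recovered:", np.count_nonzero(np.abs(a_hat) > 1e-3))
```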
-
Towards Understanding First Order Algorithms for Nonconvex Optimization in Machine Learning (2019-02-11)
Zhao, Tuo ; Georgia Institute of Technology. Algorithms, Randomness and Complexity Center ; Georgia Institute of Technology. School of Industrial and Systems Engineering
Stochastic Gradient Descent-type (SGD) algorithms have been widely applied to many non-convex optimization problems in machine learning, e.g., training deep neural networks, variational Bayesian inference, and collaborative filtering. Due to current technical limitations, however, establishing convergence properties of SGD for these highly complicated practical non-convex problems is generally infeasible. We therefore propose to analyze the behavior of SGD-type algorithms through two simpler but non-trivial non-convex problems: (1) streaming principal component analysis and (2) training non-overlapping two-layer convolutional neural networks. Specifically, we prove that for both examples, SGD attains a sub-linear rate of convergence to the global optimum with high probability. Our theory not only helps us better understand SGD, but also provides new insights into more complicated non-convex optimization problems in machine learning.
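A minimal sketch of the first example, streaming PCA via an SGD-type update (Oja's rule). This is a generic illustration of the problem setting, not the paper's exact algorithm or analysis; the step-size schedule is an arbitrary choice.

```python
import numpy as np

def oja_streaming_pca(sample_stream, dim, lr=0.01):
    """Streaming PCA via Oja's rule, an SGD-type update for the top eigenvector."""
    w = np.random.default_rng(0).standard_normal(dim)
    w /= np.linalg.norm(w)
    for t, x in enumerate(sample_stream, start=1):
        w += (lr / np.sqrt(t)) * x * (x @ w)   # stochastic gradient step
        w /= np.linalg.norm(w)                 # project back to the unit sphere
    return w

# Toy usage: samples whose covariance has leading eigenvector e_1.
rng = np.random.default_rng(1)
cov = np.diag([5.0, 1.0, 1.0, 1.0])
stream = (rng.multivariate_normal(np.zeros(4), cov) for _ in range(20000))
w = oja_streaming_pca(stream, dim=4)
print("alignment with e_1:", abs(w[0]))
```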
-
ItemSphere packings, codes, and kissing numbers via hard core models( 2018-05-17) Perkins, Will ; Georgia Institute of Technology. Algorithms, Randomness and Complexity Center ; University of Birmingham. School of MathematicsWe prove a lower bound on the expected size of a spherical code drawn from a "hard cap" model and the expected density of a packing drawn from the hard sphere model in high dimensions. These results allows us to improve the lower bound on the kissing number in high dimensions by a factor d and to prove new lower bounds on the entropy of sphere packings of density \Theta(d 2^{-d}) in R^d. Joint work with Matthew Jenssen and Felix Joos.
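A toy illustration of the objects involved: greedily building a spherical code with pairwise angle at least pi/3, the constraint underlying the kissing number. This greedy process is only a stand-in for the actual hard-cap Gibbs point process analyzed in the talk; the trial count and dimension are arbitrary.

```python
import numpy as np

def random_spherical_code(d, angle=np.pi / 3, n_trials=2000, rng=None):
    """Greedy random spherical code with pairwise angle >= `angle`.

    Propose uniform points on S^{d-1} and keep each one only if it is
    far (in angle) from every point kept so far.
    """
    rng = rng or np.random.default_rng(0)
    max_cos = np.cos(angle)             # kept pairs must have inner product <= cos(angle)
    code = []
    for _ in range(n_trials):
        x = rng.standard_normal(d)
        x /= np.linalg.norm(x)          # uniform on the unit sphere
        if all(np.dot(x, y) <= max_cos for y in code):
            code.append(x)
    return np.array(code)

# Angle pi/3 is the kissing-number constraint.
code = random_spherical_code(d=8, angle=np.pi / 3)
print(f"found a code of size {len(code)} in dimension 8")
```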
-
Robust Mean Estimation in Nearly-Linear Time (2019-12-02)
Hopkins, Samuel ; Georgia Institute of Technology. Algorithms, Randomness and Complexity Center ; University of California, Berkeley. Dept. of Electrical Engineering and Computer Sciences
Robust mean estimation is the following basic estimation question: given i.i.d. copies of a random vector X in d-dimensional Euclidean space, of which a small constant fraction are corrupted, how well can you estimate the mean of the distribution? This is a classical problem in statistics, going back to the 1960s and 70s, and has recently found application to many problems in reliable machine learning. However, in high dimensions, classical algorithms for this problem were either (1) computationally intractable or (2) lost poly(d) factors in their accuracy guarantees. Recently, polynomial-time algorithms have been demonstrated for this problem that still achieve (nearly) optimal error guarantees. However, the running times of these algorithms were at least quadratic in the dimension or in 1/(desired accuracy), an overhead that renders them ineffective in practice. In this talk we give the first truly nearly-linear-time algorithm for robust mean estimation which achieves nearly optimal statistical performance. Our algorithm is based on the matrix multiplicative weights method. Based on joint work with Yihe Dong and Jerry Li, to appear in NeurIPS 2019.
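To make the problem concrete, here is a simplified spectral-filtering heuristic for robust mean estimation. It is a toy cousin of filtering-style algorithms, not the nearly-linear-time matrix-multiplicative-weights method from the talk; the round count and quantile rule are arbitrary choices.

```python
import numpy as np

def filter_robust_mean(X, eps, n_rounds=10):
    """Toy spectral filter for robust mean estimation.

    Repeatedly finds the top eigenvector of the empirical covariance and
    discards the eps-fraction of points most extreme along that direction.
    """
    X = X.copy()
    for _ in range(n_rounds):
        mu = X.mean(axis=0)
        cov = np.cov(X.T)
        eigvals, eigvecs = np.linalg.eigh(cov)
        v = eigvecs[:, -1]                      # direction of largest variance
        scores = np.abs((X - mu) @ v)           # outlier score along v
        keep = scores <= np.quantile(scores, 1 - eps)
        X = X[keep]
    return X.mean(axis=0)

# Toy usage: 5% of points replaced by a far-away cluster; true mean is 0.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 20))
X[:100] += 10.0                                 # corruptions
print("naive mean error   :", np.linalg.norm(X.mean(axis=0)))
print("filtered mean error:", np.linalg.norm(filter_robust_mean(X, eps=0.05)))
```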
-
Statistical Query Lower Bounds for High-Dimensional Unsupervised Learning (2017-09-18)
Diakonikolas, Ilias ; Georgia Institute of Technology. Algorithms, Randomness and Complexity Center ; University of Southern California. Dept. of Computer Science
We describe a general technique that yields the first Statistical Query lower bounds for a range of fundamental high-dimensional learning problems. Our main results are for the problems of (1) learning Gaussian mixture models and (2) robust learning of a single Gaussian distribution. For these problems, we show a super-polynomial gap between the sample complexity and the computational complexity of any Statistical Query (SQ) algorithm. SQ algorithms are a class of algorithms that are only allowed to query expectations of functions of the distribution rather than directly access samples. This class is quite broad: a wide range of algorithmic techniques in machine learning are known to be implementable using SQs. Our SQ lower bounds are attained via a unified moment-matching technique that is useful in other contexts. Our method yields tight lower bounds for a number of related unsupervised estimation problems, including robust covariance estimation in spectral norm and robust sparse mean estimation. Finally, for the classical problem of robustly testing a Gaussian with unknown mean, we show a sample complexity lower bound that scales linearly in the dimension. This matches the sample complexity of the corresponding robust learning problem and separates the sample complexity of robust testing from standard testing. The separation is surprising because no such gap exists for the corresponding learning problem. (Based on joint work with Daniel Kane (UCSD) and Alistair Stewart (USC).)
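To illustrate what restricting an algorithm to statistical queries means, here is a minimal simulated SQ oracle: the algorithm submits a bounded function and receives its expectation up to a tolerance, never touching individual samples. The noise model and the clipping used to keep the query bounded are illustrative choices of this sketch.

```python
import numpy as np

def sq_oracle(samples, query_fn, tau, rng=None):
    """Simulate a statistical query (SQ) oracle from samples.

    An SQ algorithm never sees samples directly; it submits a bounded
    function q and receives E[q(X)] up to an adversarial tolerance tau,
    simulated here by uniform noise in [-tau, tau].
    """
    rng = rng or np.random.default_rng(0)
    true_expectation = np.mean([query_fn(x) for x in samples])
    return true_expectation + rng.uniform(-tau, tau)

# Example: estimate the first coordinate of a Gaussian's mean via one SQ.
rng = np.random.default_rng(1)
data = rng.normal(loc=[2.0, -1.0], scale=1.0, size=(10000, 2))
# Clip to keep the query function bounded, as the SQ model requires.
est = sq_oracle(data, lambda x: np.clip(x[0], -10, 10), tau=0.01)
print("SQ answer for E[X_1]:", est)
```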
-
Rapidly Mixing Random Walks via Log-Concave Polynomials (Part 1) (2019-11-05)
Anari, Nima ; Georgia Institute of Technology. Algorithms, Randomness and Complexity Center ; Stanford University. Dept. of Computer Science
A fundamental tool used in sampling, counting, and inference problems is the Markov Chain Monte Carlo method, which uses random walks to solve computational problems. The main parameter governing the efficiency of this method is how quickly the random walk mixes (converges to the stationary distribution). The goal of these talks is to introduce a new approach for analyzing the mixing time of random walks on high-dimensional discrete objects. This approach works by directly relating the mixing time to analytic properties of a certain multivariate generating polynomial. As our main application we will analyze basis-exchange random walks on the set of bases of a matroid. We will show that the corresponding multivariate polynomial is log-concave over the positive orthant, and use this property to show three progressively improving mixing time bounds for a matroid of rank r on a ground set of n elements:
- First, a mixing time of O(r^2 log n), by analyzing the spectral gap of the random walk (based on related work on high-dimensional expanders).
- Then, a mixing time of O(r log r + r log log n), based on the modified log-Sobolev inequality (MLSI), due to Cryan, Guo, and Mousa.
- Finally, we completely remove the dependence on n and show the tight mixing time of O(r log r), by appealing to variants of well-studied notions in discrete convexity.
Time permitting, I will discuss further recent developments, including relaxed notions of log-concavity of a polynomial and applications to further sampling/counting problems. Based on joint work with Kuikui Liu, Shayan Oveis Gharan, and Cynthia Vinzant.
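A concrete instance of the basis-exchange walk, sketched for the graphic matroid, where bases are spanning trees of a graph. The helper names and union-find check are illustrative, and the brute-force enumeration of exchange candidates is only sensible at toy sizes.

```python
import random

def is_spanning_tree(n, edges):
    """Check that `edges` forms a spanning tree on vertices 0..n-1 (union-find)."""
    if len(edges) != n - 1:
        return False
    parent = list(range(n))
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False          # adding this edge would create a cycle
        parent[ru] = rv
    return True

def basis_exchange_step(n, all_edges, tree, rng):
    """One step of the basis-exchange walk on spanning trees.

    Drop a uniformly random tree edge, then add a uniformly random edge
    that restores a spanning tree (the dropped edge itself is allowed).
    """
    e = rng.choice(sorted(tree))
    partial = tree - {e}
    candidates = [f for f in all_edges
                  if is_spanning_tree(n, list(partial) + [f])]
    return partial | {rng.choice(candidates)}

# Toy usage: walk over spanning trees of the complete graph K_5.
rng = random.Random(0)
n = 5
all_edges = [(i, j) for i in range(n) for j in range(i + 1, n)]
tree = {(i, i + 1) for i in range(n - 1)}       # a path as the initial tree
for _ in range(1000):
    tree = basis_exchange_step(n, all_edges, tree, rng)
print("sampled spanning tree:", sorted(tree))
```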
-
Solving Linear Programs in the Current Matrix Multiplication Time (2019-05-20)
Lee, Yin Tat ; Georgia Institute of Technology. Algorithms, Randomness and Complexity Center ; University of Washington. School of Computer Science and Engineering
We show how to solve linear programs to accuracy epsilon in time n^{omega+o(1)} log(1/epsilon), where omega ~ 2.3728639 is the current matrix multiplication constant. This hits a natural barrier for solving linear programs, since solving linear systems is a special case of linear programming and currently requires n^{omega} time. Joint work with Michael B. Cohen and Zhao Song.
-
The Paulsen problem, continuous operator scaling, and smoothed analysis (2018-10-15)
Lau, Lap Chi ; Georgia Institute of Technology. Algorithms, Randomness and Complexity Center ; University of Waterloo. School of Computer Science
The Paulsen problem is a basic open problem in operator theory. We define a continuous version of the operator scaling algorithm to solve this problem. A key step is to show that the continuous operator scaling algorithm converges faster on a perturbed input. To this end, we develop new techniques for lower bounding the operator capacity, a concept introduced by Gurvits to analyze the operator scaling algorithm. The talk will be self-contained. Joint work with Tsz Chiu Kwok, Yin Tat Lee, and Akshay Ramachandran.
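For orientation, the Paulsen problem concerns frames: n vectors in R^d that nearly satisfy the equal-norm Parseval conditions. Below is a rough discrete analogue of the scaling idea: alternately fix the norms and whiten the frame operator. This alternation is only a stand-in for the continuous operator scaling flow analyzed in the talk, and its convergence here is heuristic.

```python
import numpy as np

def alternating_frame_scaling(U, n_iter=100):
    """Toy alternating scaling toward an equal-norm Parseval frame.

    U has shape (n, d): n vectors in R^d. An equal-norm Parseval frame
    satisfies sum_i u_i u_i^T = I and ||u_i||^2 = d/n.
    """
    n, d = U.shape
    for _ in range(n_iter):
        # Enforce equal norms.
        U = U * (np.sqrt(d / n) / np.linalg.norm(U, axis=1, keepdims=True))
        # Enforce the Parseval condition by whitening: U <- U S^{-1/2}.
        S = U.T @ U                      # frame operator, = sum_i u_i u_i^T
        w, V = np.linalg.eigh(S)
        U = U @ V @ np.diag(w ** -0.5) @ V.T
    return U

# Toy usage: start from a random frame of 8 vectors in R^3.
rng = np.random.default_rng(0)
U = alternating_frame_scaling(rng.standard_normal((8, 3)))
print("frame operator ~ I:", np.allclose(U.T @ U, np.eye(3), atol=1e-6))
print("squared norms     :", np.round(np.sum(U * U, axis=1), 4))  # target 3/8
```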
-
Algorithmic Pirogov-Sinai theory (2018-11-05)
Perkins, Will ; Georgia Institute of Technology. Algorithms, Randomness and Complexity Center ; University of Illinois at Chicago. Dept. of Mathematics, Statistics, and Computer Science
We develop efficient algorithms to approximate the partition function and sample from the hard-core and Potts models on lattices at sufficiently low temperatures in the phase coexistence regime. In contrast, the Glauber dynamics are known to take exponential time to mix in this regime. Our algorithms are based on the cluster expansion and Pirogov-Sinai theory, classical tools from statistical physics for understanding phase transitions, as well as Barvinok's approach to polynomial approximation. Joint work with Tyler Helmuth and Guus Regts.
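For contrast, here is a minimal sketch of the Glauber dynamics for the hard-core model, the chain the abstract says mixes exponentially slowly in the coexistence regime (reached at high fugacity). The graph construction and parameters are illustrative.

```python
import random

def glauber_hardcore(adj, fugacity=1.0, steps=10000, rng=None):
    """Glauber dynamics for the hard-core model on a graph.

    `adj` maps each vertex to its neighbor list. The chain maintains an
    independent set: each step picks a uniform vertex and tries to occupy
    it with probability fugacity/(1+fugacity), succeeding only if no
    neighbor is occupied; otherwise the vertex is vacated.
    """
    rng = rng or random.Random(0)
    occupied = set()
    vertices = list(adj)
    for _ in range(steps):
        v = rng.choice(vertices)
        if rng.random() < fugacity / (1.0 + fugacity):
            if all(u not in occupied for u in adj[v]):
                occupied.add(v)
        else:
            occupied.discard(v)
    return occupied

# Toy usage: a 4x4 grid graph.
n = 4
adj = {(i, j): [(i + di, j + dj)
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= i + di < n and 0 <= j + dj < n]
       for i in range(n) for j in range(n)}
ind_set = glauber_hardcore(adj, fugacity=2.0)
print("independent set of size", len(ind_set))
```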
-
The Contextual Bandits Problem: Techniques for Learning to Make High-Reward Decisions (2017-10-30)
Schapire, Robert ; Georgia Institute of Technology. Algorithms, Randomness and Complexity Center ; Microsoft Research
We consider how to learn through experience to make intelligent decisions. In the generic setting, called the contextual bandits problem, the learner must repeatedly decide which action to take in response to an observed context, and is then permitted to observe the received reward, but only for the chosen action. The goal is to learn to behave nearly as well as the best policy (or decision rule) in some possibly very large and rich space of candidate policies. This talk will describe progress on developing general methods for this problem and some of its variants.
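A minimal baseline for this setting, assuming a linear reward model per action and epsilon-greedy exploration; both are assumptions of this sketch, not the general-policy-class methods the talk describes. Note that only the chosen action's reward is observed, which is exactly the bandit feedback constraint.

```python
import numpy as np

def eps_greedy_contextual_bandit(contexts, reward_fn, n_actions, eps=0.1, rng=None):
    """Epsilon-greedy contextual bandit with a ridge-regression model per action."""
    rng = rng or np.random.default_rng(0)
    d = contexts.shape[1]
    # Per-action ridge-regression statistics: A = X^T X + I, b = X^T r.
    A = [np.eye(d) for _ in range(n_actions)]
    b = [np.zeros(d) for _ in range(n_actions)]
    total = 0.0
    for x in contexts:
        theta = [np.linalg.solve(A[a], b[a]) for a in range(n_actions)]
        if rng.random() < eps:
            a = int(rng.integers(n_actions))              # explore
        else:
            a = int(np.argmax([x @ th for th in theta]))  # exploit
        r = reward_fn(x, a)            # reward observed for the chosen action only
        A[a] += np.outer(x, x)
        b[a] += r * x
        total += r
    return total

# Toy usage: action 1 pays off when the first feature is positive.
rng = np.random.default_rng(1)
ctx = rng.standard_normal((5000, 3))
reward = lambda x, a: float(x[0] > 0) if a == 1 else 0.5
print("average reward:", eps_greedy_contextual_bandit(ctx, reward, n_actions=2) / 5000)
```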