Series
IDEaS Seminar Series
Series Type
Event Series
Publication Search Results
-
The Science of Stories: Measuring and Exploring the Ecology of Human Stories with Lexical Instruments (2019-11-06) Dodds, Peter S.
I will survey our efforts at the Computational Story Lab to measure and study a wide array of social and cultural phenomena using “lexical meters”: online, interactive instruments that use social media and other texts to quantify population dynamics of human behavior. These include happiness, public health, obesity rates, and depression. I will explain how lexical meters work and how we have used them to uncover natural language encodings of positivity biases across cultures, universal emotional arcs of stories, links between social media posts and health, measures of fame and ultra-fame, and time compression for news. I will offer some thoughts on how fully developing a post-disciplinary, collaborative science of human stories is vital in our efforts to understand the evolution, stability, and fracturing of social systems.
-
Second Order Machine Learning (Georgia Institute of Technology, 2017-09-22) Mahoney, Michael
A major challenge for large-scale machine learning, and one that will only increase in importance as we develop models that are more and more domain-informed, involves going beyond high-variance first-order optimization methods to more robust second-order methods. Here, we consider the problem of minimizing the sum of a large number of functions over a convex constraint set, a problem that arises in many data analysis, machine learning, and more traditional scientific computing applications, as well as non-convex variants of these basic methods. While this is of interest in many situations, it has received attention recently due to challenges associated with training so-called deep neural networks. We establish improved bounds for algorithms that incorporate sub-sampling as a way to improve computational efficiency, while maintaining the original convergence properties of these algorithms. These methods exploit recent results from Randomized Linear Algebra on approximate matrix multiplication. Within the context of second-order optimization methods, they provide quantitative convergence results for variants of Newton's method, where the Hessian and/or the gradient is uniformly or non-uniformly sub-sampled, under much weaker assumptions than prior work.
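As a rough illustration of the idea named in this abstract, the sketch below runs a Newton-type iteration in which only the Hessian is estimated from a uniformly sampled subset of the data, while the gradient is computed exactly. It is not the speaker's algorithm: the choice of l2-regularized logistic regression as the sum of functions, the backtracking line search, and every name and default (`subsampled_newton`, `sample_size`, `lam`, `tol`) are assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def loss_and_grad(w, X, y, lam):
    # Average logistic loss (labels in {0, 1}) plus an l2 penalty, with its exact gradient.
    z = X @ w
    loss = np.mean(np.logaddexp(0.0, z) - y * z) + 0.5 * lam * (w @ w)
    grad = X.T @ (sigmoid(z) - y) / len(y) + lam * w
    return loss, grad

def subsampled_newton(X, y, lam=1e-2, sample_size=200, iters=50, tol=1e-8, seed=0):
    # Hypothetical helper: Newton-type steps with a uniformly sub-sampled Hessian.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    m = min(sample_size, n)
    w = np.zeros(d)
    for _ in range(iters):
        loss, grad = loss_and_grad(w, X, y, lam)
        if np.linalg.norm(grad) < tol:
            break
        # Estimate the Hessian from a uniform random subset of the rows.
        idx = rng.choice(n, size=m, replace=False)
        p = sigmoid(X[idx] @ w)
        curv = p * (1.0 - p)
        H = (X[idx] * curv[:, None]).T @ X[idx] / m + lam * np.eye(d)
        step = np.linalg.solve(H, grad)
        # Backtracking (Armijo) line search: accept only steps that decrease the loss.
        t = 1.0
        for _k in range(30):
            new_loss, _g = loss_and_grad(w - t * step, X, y, lam)
            if new_loss <= loss - 1e-4 * t * (grad @ step):
                w = w - t * step
                break
            t *= 0.5
    return w
```

Calling `subsampled_newton(X, y)` with a feature matrix `X` of shape `(n, d)` and a 0/1 label vector `y` returns the fitted weights. Each iteration forms the Hessian estimate from `sample_size` rows instead of all `n`, which is where the computational saving comes from.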
-
Assembly of Big Genomic Data (Georgia Institute of Technology, 2017-09-15) Medvedev, Paul
As genome sequencing technologies continue to facilitate the generation of large datasets, developing scalable algorithms has come to the forefront as a crucial step in analyzing these datasets. In this talk, I will discuss several recent advances, with a focus on the problem of reconstructing a genome from a set of reads (genome assembly). I will describe low-memory and scalable algorithms for automatic parameter selection and de Bruijn graph compaction, recently implemented in two tools: KmerGenie and bcalm. I will also present recent advances in the theoretical foundations of genome assemblers.
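To make the term "de Bruijn graph compaction" concrete, here is a minimal in-memory sketch that builds a node-centric de Bruijn graph from the k-mers of a set of reads and merges every non-branching path into a single string (a unitig). It is a textbook-style illustration, not the low-memory method implemented in bcalm: it ignores reverse complements, assumes reads over the alphabet ACGT, and the function name `compact_unitigs` is hypothetical.

```python
def compact_unitigs(reads, k):
    # Nodes are the distinct k-mers; an edge u -> v exists when the last
    # k-1 characters of u equal the first k-1 characters of v.
    nodes = {r[i:i + k] for r in reads for i in range(len(r) - k + 1)}

    def successors(u):
        return [u[1:] + c for c in "ACGT" if u[1:] + c in nodes]

    def predecessors(u):
        return [c + u[:-1] for c in "ACGT" if c + u[:-1] in nodes]

    def starts_unitig(u):
        # A node starts a unitig unless it has exactly one predecessor
        # and that predecessor has exactly one successor.
        preds = predecessors(u)
        return len(preds) != 1 or len(successors(preds[0])) != 1

    def extend(u, seen):
        # Walk forward while the path stays non-branching in both directions.
        path, cur = u, u
        seen.add(u)
        while True:
            succs = successors(cur)
            if len(succs) != 1 or succs[0] in seen or len(predecessors(succs[0])) != 1:
                break
            cur = succs[0]
            seen.add(cur)
            path += cur[-1]
        return path

    seen, unitigs = set(), []
    for u in sorted(nodes):
        if starts_unitig(u):
            unitigs.append(extend(u, seen))
    # Any nodes left over lie on isolated cycles; break each cycle at an arbitrary node.
    for u in sorted(nodes):
        if u not in seen:
            unitigs.append(extend(u, seen))
    return unitigs
```

For example, `compact_unitigs(["TACGA", "TACGT"], 3)` merges the shared prefix into the unitig `TACG` and leaves the two branches `CGA` and `CGT` as separate unitigs.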