Series
IDEaS Seminar Series

Series Type
Event Series
Description
Associated Organization(s)

Publication Search Results

Now showing 1 - 3 of 3
  • Item
    The Science of Stories: Measuring and Exploring the Ecology of Human Stories with Lexical Instruments
    (2019-11-06) Dodds, Peter S.
    I will survey our efforts at the Computational Story Lab to measure and study a wide array of social and cultural phenomena using “lexical meters”: online, interactive instruments that use social media and other texts to quantify population dynamics of human behavior. The phenomena measured include happiness, public health, obesity rates, and depression. I will explain how lexical meters work and how we have used them to uncover natural language encodings of positivity biases across cultures, universal emotional arcs of stories, links between social media posts and health, measures of fame and ultra-fame, and time compression for news. I will offer some thoughts on why fully developing a post-disciplinary, collaborative science of human stories is vital to our efforts to understand the evolution, stability, and fracturing of social systems.
  • Item
    Second Order Machine Learning
    (Georgia Institute of Technology, 2017-09-22) Mahoney, Michael
    A major challenge for large-scale machine learning, and one that will only increase in importance as we develop models that are more and more domain-informed, involves going beyond high-variance first-order optimization methods to more robust second-order methods. Here, we consider the problem of minimizing the sum of a large number of functions over a convex constraint set, a problem that arises in many data analysis, machine learning, and more traditional scientific computing applications, as well as non-convex variants of these basic methods. While this is of interest in many situations, it has received attention recently due to challenges associated with training so-called deep neural networks. We establish improved bounds for algorithms that incorporate sub-sampling as a way to improve computational efficiency, while maintaining the original convergence properties of these algorithms. These methods exploit recent results from Randomized Linear Algebra on approximate matrix multiplication. Within the context of second-order optimization methods, they provide quantitative convergence results for variants of Newton's method, where the Hessian and/or the gradient is uniformly or non-uniformly sub-sampled, under much weaker assumptions than prior work.
  • Item
    Assembly of Big Genomic Data
    (Georgia Institute of Technology, 2017-09-15) Medvedev, Paul
    As genome sequencing technologies continue to facilitate the generation of large datasets, developing scalable algorithms has come to the forefront as a crucial step in analyzing these datasets. In this talk, I will discuss several recent advances, with a focus on the problem of reconstructing a genome from a set of reads (genome assembly). I will describe low-memory and scalable algorithms for automatic parameter selection and de Bruijn graph compaction, recently implemented in two tools: KmerGenie and bcalm. I will also present recent advances in the theoretical foundations of genome assemblers.