Computational Science and Engineering Seminar Series

Series Type
Event Series
Associated Organization(s)

Publication Search Results

Now showing 1 - 10 of 35
  • Item
    The Aha! Moment: From Data to Insight
    (Georgia Institute of Technology, 2014-02-07) Shahaf, Dafna
    The amount of data in the world is increasing at incredible rates. Large-scale data has potential to transform almost every aspect of our world, from science to business; for this potential to be realized, we must turn data into insight. In this talk, I will describe two of my efforts to address this problem computationally. The first project, Metro Maps of Information, aims to help people understand the underlying structure of complex topics, such as news stories or research areas. Metro Maps are structured summaries that can help us understand the information landscape, connect the dots between pieces of information, and uncover the big picture. The second project proposes a framework for automatic discovery of insightful connections in data. In particular, we focus on identifying gaps in medical knowledge: our system recommends directions of research that are both novel and promising. I will formulate both problems mathematically and provide efficient, scalable methods for solving them. User studies on real-world datasets demonstrate that our methods help users acquire insight efficiently across multiple domains.
  • Item
    Cyber Games
    (Georgia Institute of Technology, 2013-02-19) Vorobeychik, Yevgeniy
    Over the last few years I have been working on game theoretic models of security, with a particular emphasis on issues salient in cyber security. In this talk I will give an overview of some of this work. I will first spend some time motivating game theoretic treatment of problems relating to cyber and describe some important modeling considerations. In the remainder, I will describe two game theoretic models (one very briefly), and associated solution techniques and analyses. The first is the "optimal attack plan interdiction" problem. In this model, we view a threat formally as a sophisticated planning agent, aiming to achieve a set of goals given some specific initial capabilities and considering a space of possible "attack actions/vectors" that may (or may not) be used towards the desired ends. The defender's goal in this setting is to "interdict" a select subset of attack vectors by optimally choosing among mitigation options, in order to prevent the attacker from being able to achieve its goals. I will describe the formal model, explain why it is challenging, and present highly scalable decomposition-based integer programming techniques that leverage extensive research into heuristic formal planning in AI. The second model addresses the problem that defense decisions are typically decentralized. I describe a model to study the impact of decentralization, and show that there is a "sweet spot": for an intermediate number of decision makers, the joint decision is nearly socially optimal, and has the additional benefit of being robust to the changes in the environment. Finally, I will describe the Secure Design Competition (FIREAXE) that involved two teams of interns during the summer of 2012. The problem that the teams were tasked with was to design a highly stylized version of an electronic voting system. The catch was that after the design phase, each team would attempt to "attack" the other's design. I will describe some salient aspects of the specification, as well as the outcome of this competition and lessons that we (the designers and the students) learned in the process.
  • Item
    Extending Hadoop to Support Binary-Input Applications
    (Georgia Institute of Technology, 2012-10-19) Hong, Bo
    Many data-intensive applications naturally take multiple inputs, which is not well supported by some popular MapReduce implementations, such as Hadoop. In this talk, we present an extension of Hadoop to better support such applications. The extension is expected to provide the following benefits: (1) it makes such applications easier to program, (2) it exploits data locality better than native Hadoop, and (3) it improves application performance.
  • Item
    Magnetic Resonance Imaging of the Brain
    (Georgia Institute of Technology, 2012-10-12) Hu, Xiaoping
    Magnetic Resonance Imaging (MRI) has become a powerful, indispensable, and ubiquitously used methodology in neuroimaging. In particular, functional magnetic resonance imaging (fMRI) and diffusion tensor imaging (DTI) are two specific techniques which have broadly impacted the field. In this talk, I will briefly describe the bases and principles of these techniques and highlight several aspects of data processing and analysis, including statistical analyses, support vector machine based classification, causal modelling and graph theoretic analysis.
  • Item
    Stochastic Gradient Descent with Only One Projection
    (Georgia Institute of Technology, 2012-09-28) Jin, Rong
    Although many variants of stochastic gradient descent have been proposed for large-scale convex optimization, most of them require projecting the solution at each iteration to ensure that the obtained solution stays within the feasible domain. For complex domains (e.g., positive semidefinite cone), the projection step can be computationally expensive, making stochastic gradient descent unattractive for large-scale optimization problems. We address this limitation by developing a novel stochastic gradient descent algorithm that does not need intermediate projections. Instead, only one projection at the last iteration is needed to obtain a feasible solution in the given domain. Our theoretical analysis shows that with a high probability, the proposed algorithms achieve an O(1/√T) convergence rate for general convex optimization, and an O(ln T/T) rate for strongly convex optimization under mild conditions about the domain and the objective function.
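    To make the per-iteration projection step concrete, here is a minimal sketch of the standard projected-SGD baseline that the abstract contrasts against. It is not the one-projection algorithm from the talk. The toy objective (0.5 * ||x - target||^2), the Euclidean-ball domain, and all function names are assumptions chosen for illustration; the ball projection is cheap, whereas the abstract's concern is domains where this step is expensive.

```python
import math
import random


def project_onto_ball(x, radius=1.0):
    """Euclidean projection of x onto the ball of the given radius."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    scale = radius / norm
    return [v * scale for v in x]


def projected_sgd(target, steps, rng, noise=0.1, radius=1.0):
    """Minimize 0.5 * ||x - target||^2 over the ball from noisy gradients.

    The iterate is projected back onto the domain after every update,
    which is the step that becomes costly for complex domains.
    """
    x = [0.0] * len(target)
    for t in range(1, steps + 1):
        # Noisy gradient of the toy objective (hypothetical noise model).
        grad = [xi - ti + rng.gauss(0.0, noise) for xi, ti in zip(x, target)]
        step = 1.0 / t  # step size suited to a strongly convex objective
        x = [xi - step * gi for xi, gi in zip(x, grad)]
        x = project_onto_ball(x, radius)  # projection at each iteration
    return x


solution = projected_sgd([3.0, 4.0], steps=2000, rng=random.Random(0))
print(solution)
```

    With the target outside the unit ball, the constrained minimizer is the target scaled onto the boundary, so the printed iterate should lie close to that point.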
  • Item
    High-performance-computing challenges for heart simulations
    (Georgia Institute of Technology, 2012-08-31) Fenton, Flavio H.
    The heart is an electro-mechanical system in which, under normal conditions, electrical waves propagate in a coordinated manner to initiate an efficient contraction. In pathologic states, propagation can destabilize and exhibit chaotic dynamics mostly produced by single or multiple rapidly rotating spiral/scroll waves that generate complex spatiotemporal patterns of activation that inhibit contraction and can be lethal if untreated. Despite much study, little is known about the actual mechanisms that initiate, perpetuate, and terminate spiral waves in cardiac tissue. In this talk, I will motivate the problem with some experimental examples and then discuss how we study the problem from a computational point of view, from the numerical models derived to represent the dynamics of single cells to the coupling of millions of cells to represent the three-dimensional structure of a working heart. Some of the major difficulties of computer simulations for these kinds of systems include: i) different orders of magnitude in time scales, from milliseconds to seconds; ii) millions of degrees of freedom over millions of integration steps within irregular domains; and iii) the need for near-real-time simulations. Advances in these areas will be discussed, as well as the use of GPUs over the web using WebGL.
  • Item
    How much (execution) time and energy does my algorithm cost?
    (Georgia Institute of Technology, 2012-08-24) Vuduc, Richard
    When designing an algorithm or performance-tuning code, is time-efficiency (e.g., operations per second) the same as energy-efficiency (e.g., operations per Joule)? Why or why not? To answer these questions, we posit a simple strawman model of the energy to execute an algorithm. Our model is the energy-based analogue of the time-based "roofline" model of Williams, Patterson, and Waterman (Comm. ACM, 2009). What do these models imply for algorithm design? What might computer architects tell algorithm designers to help them better understand whether and how algorithm design should change in an energy-constrained computing environment?
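    One way to make the time-versus-energy question concrete is the sketch below. It assumes, as an illustration rather than as the speaker's exact model, that time follows a roofline-style maximum of compute time and memory time (perfect overlap), while energy costs add up and include a constant-power term charged over the run time. All function names and parameter values are hypothetical.

```python
def roofline_time(flops, bytes_moved, time_per_flop, time_per_byte):
    """Time under perfect overlap: the slower of compute and memory traffic."""
    return max(flops * time_per_flop, bytes_moved * time_per_byte)


def additive_energy(flops, bytes_moved, energy_per_flop, energy_per_byte,
                    constant_power, elapsed_time):
    """Energy as a sum: compute + data movement + constant power * time."""
    return (flops * energy_per_flop
            + bytes_moved * energy_per_byte
            + constant_power * elapsed_time)


# Hypothetical machine and workload, in arbitrary units.
t = roofline_time(flops=100, bytes_moved=10, time_per_flop=1, time_per_byte=5)
e = additive_energy(flops=100, bytes_moved=10, energy_per_flop=2,
                    energy_per_byte=30, constant_power=1, elapsed_time=t)
print(t, e)
```

    In this toy setting the memory traffic is fully hidden in time (the run is compute-bound) yet still contributes to energy, which is one reason the two notions of efficiency can diverge.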
  • Item
    Graphical Models for the Internet
    (Georgia Institute of Technology, 2011-04-29) Smola, Alexander
    In this talk I will present algorithms for performing large-scale inference using Latent Dirichlet Allocation and a novel Cluster-Topic model to estimate user preferences and to group stories into coherent, topically consistent storylines. I will discuss both the statistical modeling challenges involved and the very-large-scale implementation of such models, which allows us to perform estimation on over 50 million users on a Hadoop cluster.
  • Item
    Coordinate Sampling for Sublinear Optimization and Nearest Neighbor Search
    (Georgia Institute of Technology, 2011-04-22) Clarkson, Kenneth L.
    I will describe randomized approximation algorithms for some classical problems of machine learning, where the algorithms have provable bounds that hold with high probability. Some of our algorithms are sublinear, that is, they do not need to touch all the data. Specifically, for a set of points a_1, ..., a_n in d dimensions, we show that finding a d-vector x that approximately maximizes the margin min_i a_i · x can be done in O(n+d)/epsilon^2 time, up to logarithmic factors, where epsilon > 0 is an additive approximation parameter. This was joint work with Elad Hazan and David Woodruff. A key step in these algorithms is the use of coordinate sampling to estimate dot products. This simple technique can be an effective alternative to random projection sketching in some settings. I will discuss the potential of coordinate sampling for speeding up some data structures for nearest neighbor searching in the Euclidean setting, via fast approximate distance evaluations.
  • Item
    Spatial Stochastic Simulation of Polarization in Yeast Mating
    (Georgia Institute of Technology, 2011-04-19) Petzold, Linda
    In microscopic systems formed by living cells, the small numbers of some reactant molecules can result in dynamical behavior that is discrete and stochastic rather than continuous and deterministic. Spatio-temporal gradients and patterns play an important role in many of these systems. In this lecture we report on recent progress in the development of computational methods and software for spatial stochastic simulation. Then we describe a spatial stochastic model of polarisome formation in mating yeast. The new model is built on simple mechanistic components, but is able to achieve a highly polarized phenotype with a relatively shallow input gradient, and to track movement in the gradient. The spatial stochastic simulations are able to reproduce experimental observations to an extent that is not possible with deterministic simulation.
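    For readers unfamiliar with discrete stochastic simulation, the sketch below shows Gillespie's direct method on a well-mixed birth-death process. It is a non-spatial toy example, not the spatial method or the yeast polarization model described in the talk; the reaction system, rates, and function name are assumptions chosen for illustration.

```python
import random


def gillespie_birth_death(birth_rate, death_rate, initial_count, t_end, rng):
    """Simulate a birth-death process with Gillespie's direct method.

    Births occur at a constant rate; each molecule dies at rate death_rate.
    Returns the trajectory as a list of (time, molecule_count) pairs.
    """
    t, n = 0.0, initial_count
    trajectory = [(t, n)]
    while True:
        a_birth = birth_rate
        a_death = death_rate * n
        a_total = a_birth + a_death
        if a_total <= 0:
            break  # no reaction can fire
        t += rng.expovariate(a_total)  # waiting time to the next reaction
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_birth:
            n += 1
        else:
            n -= 1
        trajectory.append((t, n))
    return trajectory


path = gillespie_birth_death(birth_rate=5.0, death_rate=0.5,
                             initial_count=0, t_end=20.0,
                             rng=random.Random(0))
print(len(path), path[-1])
```

    Because molecule counts change one event at a time, repeated runs with different seeds give different trajectories, which is the discrete, stochastic behavior that a deterministic rate equation averages away.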