Series
Computational Science and Engineering Seminar Series

Series Type
Event Series

Publication Search Results

  • Item
    Graphical Models for the Internet
    (Georgia Institute of Technology, 2011-04-29) Smola, Alexander
    In this talk I will present algorithms for performing large-scale inference using Latent Dirichlet Allocation and a novel Cluster-Topic model to estimate user preferences and to group stories into coherent, topically consistent storylines. I will discuss both the statistical modeling challenges involved and the very large-scale implementation of such models, which allows us to perform estimation on over 50 million users on a Hadoop cluster.
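
    As background for the scale discussion above, here is a minimal serial sketch of collapsed Gibbs sampling for LDA, the standard building block that large-scale implementations such as the one described in the talk must parallelize; all names, sizes, and hyperparameters here are illustrative, not the talk's actual code.

```python
import numpy as np

def lda_gibbs(docs, V, K=10, alpha=0.1, beta=0.01, iters=50, seed=0):
    """Collapsed Gibbs sampling for LDA.
    docs: list of word-id lists; V: vocabulary size; K: number of topics."""
    rng = np.random.default_rng(seed)
    ndk = np.zeros((len(docs), K))   # document-topic counts
    nkw = np.zeros((K, V))           # topic-word counts
    nk = np.zeros(K)                 # total words assigned to each topic
    z = [rng.integers(K, size=len(doc)) for doc in docs]
    for d, doc in enumerate(docs):   # initialize counts from random assignments
        for i, w in enumerate(doc):
            ndk[d, z[d][i]] += 1; nkw[z[d][i], w] += 1; nk[z[d][i]] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]          # remove the current assignment
                ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                # resample the topic from its full conditional distribution
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    return ndk, nkw

docs = [[0, 1, 2, 1], [2, 3, 3, 0]]  # toy corpus of word ids
ndk, nkw = lda_gibbs(docs, V=4, K=2, iters=20)
```

    A production system must distribute these count tables across many machines; the very large-scale implementation challenges the abstract mentions are exactly about doing that efficiently.
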
  • Item
    Coordinate Sampling for Sublinear Optimization and Nearest Neighbor Search
    (Georgia Institute of Technology, 2011-04-22) Clarkson, Kenneth L.
    I will describe randomized approximation algorithms for some classical problems of machine learning, where the algorithms have provable bounds that hold with high probability. Some of our algorithms are sublinear, that is, they do not need to touch all the data. Specifically, for a set of points $a_1, \ldots, a_n$ in $d$ dimensions, we show that finding a $d$-vector $x$ that approximately maximizes the margin $\min_i a_i \cdot x$ can be done in $O((n+d)/\epsilon^2)$ time, up to logarithmic factors, where $\epsilon > 0$ is an additive approximation parameter. This was joint work with Elad Hazan and David Woodruff. A key step in these algorithms is the use of coordinate sampling to estimate dot products. This simple technique can be an effective alternative to random projection sketching in some settings. I will discuss the potential of coordinate sampling for speeding up some data structures for nearest neighbor searching in the Euclidean setting, via fast approximate distance evaluations.
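
    A minimal sketch of the coordinate-sampling idea mentioned above: estimating a dot product from a few uniformly sampled coordinates. Uniform sampling keeps the estimator unbiased; the talk's algorithms use more refined sampling schemes and come with high-probability bounds, which this toy version does not.

```python
import numpy as np

def sampled_dot(a, x, m, rng):
    """Unbiased estimate of a . x from m uniformly sampled coordinates."""
    d = len(a)
    idx = rng.integers(d, size=m)          # sample coordinates with replacement
    return (d / m) * np.sum(a[idx] * x[idx])

rng = np.random.default_rng(0)
a, x = rng.normal(size=10_000), rng.normal(size=10_000)
est = sampled_dot(a, x, m=500, rng=rng)
print(est, a @ x)                          # estimate vs. exact value
```

    The estimate touches only m of the d coordinates, which is the source of the sublinearity; sampling coordinates proportionally to their squared magnitudes is a standard way to reduce the variance further.
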
  • Item
    Spatial Stochastic Simulation of Polarization in Yeast Mating
    (Georgia Institute of Technology, 2011-04-19) Petzold, Linda
    In microscopic systems formed by living cells, the small numbers of some reactant molecules can result in dynamical behavior that is discrete and stochastic rather than continuous and deterministic. Spatio-temporal gradients and patterns play an important role in many of these systems. In this lecture we report on recent progress in the development of computational methods and software for spatial stochastic simulation. Then we describe a spatial stochastic model of polarisome formation in mating yeast. The new model is built on simple mechanistic components, but is able to achieve a highly polarized phenotype with a relatively shallow input gradient, and to track movement in the gradient. The spatial stochastic simulations are able to reproduce experimental observations to an extent that is not possible with deterministic simulation.
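
    The discrete, stochastic dynamics described above are typically simulated with Gillespie's stochastic simulation algorithm (SSA). Below is a minimal well-mixed sketch; spatial stochastic methods of the kind discussed in the talk essentially run this logic across mesh subvolumes, with diffusion between neighboring subvolumes treated as additional reaction channels. The birth-death example is illustrative only.

```python
import numpy as np

def gillespie(x, rates, stoich, t_end, seed=0):
    """Gillespie's direct method for a well-mixed reaction system.
    x: initial molecule counts; rates(x) -> propensity of each reaction;
    stoich[j]: state change when reaction j fires."""
    rng = np.random.default_rng(seed)
    t, traj = 0.0, [(0.0, x.copy())]
    while t < t_end:
        a = rates(x)
        a0 = a.sum()
        if a0 == 0:                           # no reaction can fire
            break
        t += rng.exponential(1.0 / a0)        # time to the next reaction
        j = rng.choice(len(a), p=a / a0)      # which reaction fires
        x += stoich[j]
        traj.append((t, x.copy()))
    return traj

# Toy birth-death process: 0 -> X at rate k1, X -> 0 at rate k2 * X
k1, k2 = 10.0, 0.1
traj = gillespie(np.array([0]),
                 rates=lambda x: np.array([k1, k2 * x[0]]),
                 stoich=np.array([[1], [-1]]), t_end=100.0)
```
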
  • Item
    Optimization for Machine Learning: SMO-MKL and Smoothing Strategies
    (Georgia Institute of Technology, 2011-04-15) Vishwanathan, S. V. N.
    Our objective is to train $p$-norm Multiple Kernel Learning (MKL) and, more generally, linear MKL regularised by the Bregman divergence, using the Sequential Minimal Optimization (SMO) algorithm. The SMO algorithm is simple, easy to implement and adapt, and efficiently scales to large problems. As a result, it has gained widespread acceptance, and SVMs are routinely trained using SMO in diverse real-world applications. Training using SMO has been a long-standing goal in MKL for the very same reasons. Unfortunately, the standard MKL dual is not differentiable, and therefore cannot be optimised using SMO-style coordinate ascent. In this work, we demonstrate that linear MKL regularised with the $p$-norm squared, or with certain Bregman divergences, can indeed be trained using SMO. The resulting algorithm retains both simplicity and efficiency, and is significantly faster than the state-of-the-art specialised $p$-norm MKL solvers. We show that we can train on a hundred thousand kernels in less than fifteen minutes, and on fifty thousand points in nearly an hour, on a single core using standard hardware.
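
    For reference, one common way to write the $p$-norm MKL problem the abstract refers to is sketched below; the notation is generic rather than the paper's own, with kernel weights $d_k$, per-kernel functions $w_k$, loss $\ell$, and constants $C$ and $\lambda$ all illustrative.

```latex
% A common form of p-norm MKL over kernels K_1, ..., K_M with weights d_k >= 0.
% The squared p-norm penalty on d is the ingredient that makes the dual
% amenable to SMO-style pairwise coordinate updates.
\min_{d \ge 0}\;\min_{w_k,\, b}\;
  \frac{1}{2}\sum_{k=1}^{M}\frac{\lVert w_k\rVert_{\mathcal{H}_k}^{2}}{d_k}
  + C\sum_{i=1}^{n}\ell\Big(y_i,\ \sum_{k=1}^{M}\langle w_k,\phi_k(x_i)\rangle + b\Big)
  + \frac{\lambda}{2}\,\lVert d\rVert_{p}^{2}
```
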
  • Item
    Mining Billion-Node Graphs: Patterns, Generators, and Tools
    (Georgia Institute of Technology, 2011-04-08) Faloutsos, Christos
    What do graphs look like? How do they evolve over time? How can we handle a graph with a billion nodes? We present a comprehensive list of static and temporal laws, and some recent observations on real graphs (such as “eigenSpokes”). For generators, we describe some recent ones that naturally match all of the known properties of real graphs. Finally, for tools, we present “OddBall” for discovering anomalies and patterns, as well as an overview of the PEGASUS system, which is designed for handling billion-node graphs and runs on top of Hadoop.
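
    As a toy illustration of the kind of computation PEGASUS scales up: connected components can be found by repeatedly propagating the minimum label along edges, a pattern PEGASUS expresses as iterated generalized matrix-vector products over Hadoop. The serial sketch below is illustrative only, not PEGASUS code.

```python
import numpy as np

def min_label_cc(edges, n):
    """Connected components by iterated min-label propagation -- the same
    generalized matrix-vector pattern PEGASUS runs as Hadoop jobs."""
    label = np.arange(n)                 # each node starts with its own id
    changed = True
    while changed:                       # iterate until labels stabilize
        changed = False
        for u, v in edges:
            m = min(label[u], label[v])  # smaller id wins on each edge
            if label[u] != m or label[v] != m:
                label[u] = label[v] = m
                changed = True
    return label

labels = min_label_cc([(0, 1), (1, 2), (3, 4)], n=5)
# nodes 0-2 share one label, nodes 3-4 another
```
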
  • Item
    Multicore-oblivious Algorithms
    (Georgia Institute of Technology, 2011-03-28) Chowdhury, Rezaul Alam
    Multicores represent a paradigm shift in general-purpose computing away from the von Neumann model to a collection of cores on a chip communicating through a cache hierarchy under a shared memory. Designing efficient algorithms for multicores is more challenging than for traditional serial machines, as one must address both caching issues and shared-memory parallelism. As multicores with a wide range of machine parameters rapidly become the default desktop configuration, the need for efficient, portable code for them is growing.

    This talk will mainly address the design of efficient algorithms for multicores that are oblivious to machine parameters, and thus are portable across machines with different multicore configurations. We consider HM, a multicore model consisting of a parallel shared-memory machine with hierarchical multi-level caching, and we introduce a multicore-oblivious (MO) approach to algorithms and schedulers for HM. An MO algorithm is specified with no mention of any machine parameters, such as the number of cores, number of cache levels, cache sizes, and block lengths. However, it is equipped with a small set of instructions that can be used to provide hints to the run-time scheduler on how to schedule parallel tasks. We present efficient MO algorithms for several fundamental problems including matrix transposition, FFT, sorting, the Gaussian Elimination Paradigm, list ranking, and connected components. The notion of an MO algorithm is complementary to that of a network-oblivious (NO) algorithm, recently introduced by Bilardi et al. for parallel distributed-memory machines where processors communicate point-to-point. Indeed, several of our MO algorithms translate into efficient NO algorithms, adding to the body of known efficient NO algorithms.

    Towards the end of this talk I will give a brief overview of some of my recent work related to the computational sciences. First I will talk about "Pochoir" (pronounced "PO-shwar"), a stencil computation compiler for multicores developed at MIT CSAIL. Stencils have numerous applications in computational sciences including geophysics, fluid dynamics, finance, and computational biology. Next I will talk about "F2Dock", a state-of-the-art rigid-body protein-protein docking software developed at UT Austin in collaboration with the Scripps Research Institute. Docking algorithms have important applications in structure-based drug design, and in the study of molecular assemblies and protein-protein interactions.
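
    Matrix transposition, the first MO algorithm listed above, illustrates the oblivious style well. Below is a serial sketch of the classic cache-oblivious divide-and-conquer transpose; in an MO algorithm the two recursive halves would run as parallel tasks with scheduler hints as described in the talk. The base-case size here is only a recursion cutoff, not a tuned cache parameter, so no machine parameters appear.

```python
import numpy as np

def transpose(A, B, ri=0, ci=0, m=None, n=None, base=32):
    """Cache-oblivious out-of-place transpose: B = A^T.
    Recursively splits the larger dimension of the current block; no cache
    sizes or block lengths appear, which is the 'oblivious' property."""
    if m is None:
        m, n = A.shape
    if m <= base and n <= base:
        B[ci:ci+n, ri:ri+m] = A[ri:ri+m, ci:ci+n].T  # small block: direct copy
    elif m >= n:                                      # split the rows
        transpose(A, B, ri, ci, m // 2, n, base)
        transpose(A, B, ri + m // 2, ci, m - m // 2, n, base)
    else:                                             # split the columns
        transpose(A, B, ri, ci, m, n // 2, base)
        transpose(A, B, ri, ci + n // 2, m, n - n // 2, base)

A = np.arange(12.0).reshape(3, 4)
B = np.empty((4, 3))
transpose(A, B)
assert (B == A.T).all()
```
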
  • Item
    Modeling Rich Structured Data via Kernel Distribution Embeddings
    (Georgia Institute of Technology, 2011-03-25) Song, Le
    Real-world applications often produce a large volume of highly uncertain and complex data. Many of them have rich microscopic structures where each variable can take values on manifolds (e.g., camera rotations), combinatorial objects (e.g., texts, graphs of drug compounds) or high-dimensional continuous domains (e.g., images and videos). Furthermore, these problems may possess additional macroscopic structures where the large collections of observed and hidden variables are connected by networks of conditional independence relations (e.g., in predicting depth from still images, and forecasting in time-series). Most previous learning algorithms for problems with such rich structures rely heavily on linear relations and parametric models where data are typically assumed to be multivariate Gaussian or discrete with a relatively small number of values. Conclusions inferred under these restricted assumptions can be misleading if the underlying data-generating processes contain nonlinear, non-discrete, or non-Gaussian components. How can we find a suitable representation for nonlinear and non-Gaussian relationships in a data-driven fashion? How can we exploit conditional independence structures between variables in rich structured settings? How can we design efficient algorithms to solve challenging nonparametric problems involving large amounts of data? In this talk, I will introduce a nonparametric representation for distributions, called kernel embeddings, to address these questions. The key idea of the method is to map distributions to their expected features (potentially infinite-dimensional), and, given evidence, update these new representations solely in the feature space. Compared to existing nonparametric representations, which are largely restricted to vectorial data and usually lead to intractable algorithms, kernel distribution embeddings very often lead to simpler, faster and more accurate algorithms in a diverse range of problems such as organizing photo albums, understanding social networks, retrieving documents across languages, predicting depth from still images and forecasting sensor time-series.
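
    A minimal sketch of the core idea above: a sample is mapped to its empirical mean feature (its kernel mean embedding), and two distributions can then be compared entirely in feature space via the distance between their embeddings. The RBF kernel and bandwidth are illustrative choices, not the talk's.

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    """Gaussian RBF kernel matrix between rows of X and rows of Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2(X, Y, sigma=1.0):
    """Biased empirical estimate of the squared distance between the kernel
    mean embeddings of two samples, ||mu_X - mu_Y||^2 in the RKHS."""
    return (rbf_kernel(X, X, sigma).mean()
            + rbf_kernel(Y, Y, sigma).mean()
            - 2 * rbf_kernel(X, Y, sigma).mean())

rng = np.random.default_rng(0)
X, Y = rng.normal(0, 1, (200, 2)), rng.normal(0.5, 1, (200, 2))
print(mmd2(X, Y))   # larger when the two distributions differ more
```
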
  • Item
    PHAST: Hardware-Accelerated Shortest Path Trees
    (Georgia Institute of Technology, 2011-02-25) Delling, Daniel
    We present a novel algorithm to solve the nonnegative single-source shortest path problem on road networks and other graphs with low highway dimension. After a quick preprocessing phase, we can compute all distances from a given source in the graph with essentially a linear sweep over all vertices. Because this sweep is independent of the source, we are able to reorder vertices in advance to exploit locality. Moreover, our algorithm takes advantage of features of modern CPU architectures, such as SSE and multi-core. Compared to Dijkstra's algorithm, our method needs fewer operations, has better locality, and is better able to exploit parallelism at multi-core and instruction levels. We gain additional speedup when implementing our algorithm on a GPU, where our algorithm is up to three orders of magnitude faster than Dijkstra's algorithm on a high-end CPU. This makes applications based on all-pairs shortest-paths practical for continental-sized road networks. Several algorithms, such as computing the graph diameter, exact arc flags, or centrality measures (exact reaches or betweenness), can be greatly accelerated by our method. Joint work with Andrew V. Goldberg, Andreas Nowatzyk, and Renato F. Werneck.
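
    A serial sketch of the two query phases described above, assuming a separate contraction-hierarchy preprocessing step has already produced the upward adjacency lists, the downward incoming arcs, and a top-down vertex order (all structure names here are hypothetical):

```python
import heapq

def phast(up_adj, down_in, order, s, n):
    """PHAST-style query after contraction-hierarchy (CH) preprocessing.
    up_adj[u]:  upward arcs (v, weight) out of u;
    down_in[v]: downward arcs (u, weight) into v from higher-ranked u;
    order:      all vertices from highest CH rank to lowest."""
    INF = float('inf')
    d = [INF] * n
    d[s] = 0
    pq = [(0, s)]                      # phase 1: upward Dijkstra from s
    while pq:
        du, u = heapq.heappop(pq)
        if du > d[u]:
            continue
        for v, w in up_adj[u]:
            if du + w < d[v]:
                d[v] = du + w
                heapq.heappush(pq, (d[v], v))
    for v in order:                    # phase 2: one linear top-down sweep
        for u, w in down_in[v]:        # u is ranked above v, already final
            if d[u] + w < d[v]:
                d[v] = d[u] + w
    return d
```

    Because the phase-2 sweep visits vertices in a fixed, source-independent order, it is the part that can be reordered for locality and vectorized or offloaded to a GPU, as the abstract describes.
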
  • Item
    The Exascale: Why and How
    (Georgia Institute of Technology, 2011-02-11) Keyes, David
    Sustained floating-point computation rates on real applications, as tracked by the ACM Gordon Bell Prize, increased by three orders of magnitude from 1988 (1 Gigaflop/s) to 1998 (1 Teraflop/s), and by another three orders of magnitude to 2008 (1 Petaflop/s). Computer engineering provided only a couple of orders of magnitude of improvement for individual cores over that period; the remaining factor came from concurrency, which is approaching one million-fold. Algorithmic improvements contributed meanwhile to making each flop more valuable scientifically. As the semiconductor industry now slips relative to its own roadmap for silicon-based logic and memory, concurrency, especially on-chip many-core concurrency and GPGPU SIMD-type concurrency, will play an increasing role in the next few orders of magnitude, to arrive at the ambitious target of 1 Exaflop/s, extrapolated for 2018. An important question is whether today’s best algorithms are efficiently hosted on such hardware and how much co-design of algorithms and architecture will be required. From the applications perspective, we illustrate eight reasons why today’s computational scientists have an insatiable appetite for such performance: resolution, fidelity, dimension, artificial boundaries, parameter inversion, optimal control, uncertainty quantification, and the statistics of ensembles. The paths to the exascale summit are debated, but all are narrow and treacherous, constrained by fundamental laws of physics, cost, power consumption, programmability, and reliability. Drawing on recent reports, workshops, vendor projections, and experiences with scientific codes on contemporary platforms, we propose roles for today’s researchers in one of the great global scientific quests of the next decade.
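
    The trend cited above, made explicit as a back-of-the-envelope extrapolation (the final step is the abstract's projected target, not an achieved milestone):

```latex
% Sustained performance on real applications has grown by roughly three
% orders of magnitude per decade, which extrapolates from the 2008
% petaflop/s milestone to an exaflop/s around 2018.
10^{9}\,\tfrac{\text{flop}}{\text{s}}\ (1988)
\ \xrightarrow{\times 10^{3}}\ 10^{12}\ (1998)
\ \xrightarrow{\times 10^{3}}\ 10^{15}\ (2008)
\ \xrightarrow{\times 10^{3}}\ 10^{18}\ (2018,\ \text{extrapolated})
```
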