Organizational Unit:
School of Computational Science and Engineering


Publication Search Results

Now showing 1 - 10 of 35
  • Item
    The Exascale: Why and How
    (Georgia Institute of Technology, 2011-02-11) Keyes, David ; King Abdullah University of Science and Technology ; Columbia University ; Georgia Institute of Technology. School of Computational Science and Engineering
    Sustained floating-point computation rates on real applications, as tracked by the ACM Gordon Bell Prize, increased by three orders of magnitude from 1988 (1 Gigaflop/s) to 1998 (1 Teraflop/s), and by another three orders of magnitude to 2008 (1 Petaflop/s). Computer engineering provided only a couple of orders of magnitude of improvement for individual cores over that period; the remaining factor came from concurrency, which is approaching one million-fold. Algorithmic improvements contributed meanwhile to making each flop more valuable scientifically. As the semiconductor industry now slips relative to its own roadmap for silicon-based logic and memory, concurrency, especially on-chip many-core concurrency and GPGPU SIMD-type concurrency, will play an increasing role in the next few orders of magnitude, to arrive at the ambitious target of 1 Exaflop/s, extrapolated for 2018. An important question is whether today’s best algorithms are efficiently hosted on such hardware and how much co-design of algorithms and architecture will be required. From the applications perspective, we illustrate eight reasons why today’s computational scientists have an insatiable appetite for such performance: resolution, fidelity, dimension, artificial boundaries, parameter inversion, optimal control, uncertainty quantification, and the statistics of ensembles. The paths to the exascale summit are debated, but all are narrow and treacherous, constrained by fundamental laws of physics, cost, power consumption, programmability, and reliability. Drawing on recent reports, workshops, vendor projections, and experiences with scientific codes on contemporary platforms, we propose roles for today’s researchers in one of the great global scientific quests of the next decade.
  • Item
    Mining Billion-Node Graphs: Patterns, Generators, and Tools
    (Georgia Institute of Technology, 2011-04-08) Faloutsos, Christos ; Carnegie-Mellon University. School of Computer Science
    What do graphs look like? How do they evolve over time? How can we handle a graph with a billion nodes? We present a comprehensive list of static and temporal laws, and some recent observations on real graphs (e.g., “eigenSpokes”). For generators, we describe some recent ones, which naturally match all of the known properties of real graphs. Finally, for tools, we present “oddball” for discovering anomalies and patterns, as well as an overview of the PEGASUS system, which is designed for handling billion-node graphs and runs on top of Hadoop.
  • Item
    Discovery of Mechanisms from Mathematical Modeling of DNA Microarray Data: Computational Prediction and Experimental Verification
    (Georgia Institute of Technology, 2010-02-16) Alter, Orly ; University of Texas at Austin. Institute for Cellular and Molecular Biology ; University of Texas at Austin. Dept. of Biomedical Engineering
    Future discovery and control in biology and medicine will come from the mathematical modeling of large-scale molecular biological data, such as DNA microarray data, just as Kepler discovered the laws of planetary motion by using mathematics to describe trends in astronomical data [1]. In this talk, I will demonstrate that mathematical modeling of DNA microarray data can be used to correctly predict previously unknown mechanisms that govern the activities of DNA and RNA. First, I will describe the computational prediction of a mechanism of regulation, by developing generalizations of the matrix and tensor computations that underlie theoretical physics and using them to uncover a genome-wide pattern of correlation between DNA replication initiation and RNA expression during the cell cycle [2,3]. Second, I will describe the recent experimental verification of this computational prediction, by analyzing global expression in synchronized cultures of yeast under conditions that prevent DNA replication initiation without delaying cell cycle progression [4]. Third, I will describe the use of the singular value decomposition to uncover "asymmetric Hermite functions," a generalization of the eigenfunctions of the quantum harmonic oscillator, in genome-wide mRNA length distribution data [5]. These patterns might be explained by a previously undiscovered asymmetry in RNA gel electrophoresis band broadening and hint at two competing evolutionary forces that determine the lengths of gene transcripts. Finally, I will describe ongoing work in the development of tensor algebra algorithms (as well as visual correlation tools), the integrative and comparative modeling of DNA microarray data (as well as rRNA sequence data), and the discovery of mechanisms that regulate cell division, cancer and evolution.
    1. Alter, PNAS 103, 16063 (2006). 2. Alter & Golub, PNAS 101, 16577 (2004). 3. Omberg, Golub & Alter, PNAS 104, 18371 (2007). 4. Omberg, Meyerson, Kobayashi, Drury, Diffley & Alter, Nature MSB 5, 312 (2009). 5. Alter & Golub, PNAS 103, 11828 (2006).
  • Item
    Automating Topology Aware Task Mapping on Large Supercomputers
    (Georgia Institute of Technology, 2010-03-30) Bhatele, Abhinav S. ; University of Illinois at Urbana-Champaign. Parallel Programming Laboratory
    Parallel computing is entering the era of petascale machines. This era brings us enormous computing power and new challenges in harnessing this power efficiently. Machines with hundreds of thousands of processors already exist, connected by complex interconnect topologies. Network contention is becoming an increasingly important factor affecting overall performance. The farther different messages travel on the network, the greater the chance of resource sharing between messages and, hence, of contention. Recent studies on IBM Blue Gene and Cray XT machines have shown that under contention, message latencies can be severely affected. Mapping communicating tasks onto nearby processors can minimize contention and lead to better application performance. In this talk, I will propose algorithms and techniques for automatic mapping of parallel applications to relieve application developers of this burden. I will first demonstrate the effect of contention on message latencies and use these studies to guide the design of mapping algorithms. I will introduce the hop-bytes metric for the evaluation of mapping algorithms and suggest that it is a better metric than the previously used maximum dilation metric. I will then discuss in some detail the mapping framework, which comprises topology-aware mapping algorithms for parallel applications with regular and irregular communication patterns.
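The abstract names the hop-bytes metric without defining it. As a minimal sketch, assuming the common definition (for each message, bytes sent multiplied by the number of network hops between the processors its endpoints are mapped to, summed over all messages), the metric can be computed as below. The torus topology, the function names `torus_hops` and `hop_bytes`, and the example mappings are illustrative assumptions, not taken from the talk.

```python
def torus_hops(a, b, dims):
    """Hop count between coordinates a and b on a torus (wraparound links assumed)."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


def hop_bytes(messages, mapping, dims):
    """Sum over messages of (bytes sent) * (hops between the mapped processors).

    messages: iterable of (source task, destination task, bytes)
    mapping:  dict from task id to processor coordinates
    """
    return sum(
        nbytes * torus_hops(mapping[src], mapping[dst], dims)
        for src, dst, nbytes in messages
    )


# Hypothetical example: a 4-task communication chain on a 4 x 4 torus.
dims = (4, 4)
messages = [(0, 1, 100), (1, 2, 100), (2, 3, 100)]

# Communicating tasks placed on neighboring processors.
near = {0: (0, 0), 1: (0, 1), 2: (0, 2), 3: (0, 3)}
# The same tasks scattered across the machine.
far = {0: (0, 0), 1: (2, 2), 2: (0, 1), 3: (2, 3)}

print(hop_bytes(messages, near, dims))
print(hop_bytes(messages, far, dims))
```

Under this definition, a lower hop-bytes value means less total traffic carried over network links, which is why a mapping that keeps communicating tasks close together scores better.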
  • Item
    Open challenges in shape and animation processing
    (Georgia Institute of Technology, 2009-08-28) Rossignac, Jarek ; Georgia Institute of Technology. School of Interactive Computing
    Jarek Rossignac (IC) will present an overview of his recent research activities (with collaborators and students) and open challenges in shape and animation processing. These include:
    - SOT: Compact representation of tetrahedral meshes
    - J-splines: C^4 subdivision curves, surfaces, and animation
    - SAM: Steady interpolating affine motion
    - OCTOR: Exceptions in steady patterns
    - Pearling: Real-time segmentation of tubular structures in images and 3D medical scans
    - Surgem: Heart surgery planning and optimization based on blood flow simulation
    - APL: Aquatic Propulsion Lab, tools for designing and simulating swimming strategies
    - Ball map: Tangent-ball correspondence and compatibility between pairs of shapes
    - Ball-morph: Interpolation and applications to entertainment and medical surface reconstruction
  • Item
    Sequences of Problems, Matrices, and Solutions
    (Georgia Institute of Technology, 2010-11-12) De Sturler, Eric ; Virginia Polytechnic Institute and State University. Dept. of Mathematics
    In a wide range of applications, we deal with long sequences of slowly changing matrices or large collections of related matrices and corresponding linear algebra problems. Such applications range from the optimal design of structures to acoustics and other parameterized systems, to inverse and parameter estimation problems in tomography and systems biology, to parameterization problems in computer graphics, and to the electronic structure of condensed matter. In many cases, we can reduce the total runtime significantly by taking into account how the problem changes and recycling judiciously selected results from previous computations. In this presentation, I will focus on solving linear systems, which is often the basis of other algorithms. I will introduce the basics of linear solvers and discuss relevant theory for the fast solution of sequences or collections of linear systems. I will demonstrate the results on several applications and discuss future research directions.
  • Item
    How much (execution) time and energy does my algorithm cost?
    (Georgia Institute of Technology, 2012-08-24) Vuduc, Richard ; Georgia Institute of Technology. School of Computational Science and Engineering ; Georgia Institute of Technology. College of Computing
    When designing an algorithm or performance-tuning code, is time-efficiency (e.g., operations per second) the same as energy-efficiency (e.g., operations per Joule)? Why or why not? To answer these questions, we posit a simple strawman model of the energy to execute an algorithm. Our model is the energy-based analogue of the time-based "roofline" model of Williams, Patterson, and Waterman (Comm. ACM, 2009). What do these models imply for algorithm design? What might computer architects tell algorithm designers to help them better understand whether and how algorithm design should change in an energy-constrained computing environment?
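To make the comparison between the two models concrete, here is a minimal sketch. The time model is the usual roofline bound (the slower of compute time and memory-transfer time). The energy form shown, a per-flop cost plus a per-byte cost plus a constant power drawn for the whole run, is only an assumed strawman for illustration; the abstract does not give the speaker's actual formula. All function names, parameter names, and machine numbers are hypothetical.

```python
def roofline_time(flops, bytes_moved, peak_flops, peak_bandwidth):
    """Roofline-style time estimate in seconds, assuming compute and
    memory transfers overlap perfectly."""
    return max(flops / peak_flops, bytes_moved / peak_bandwidth)


def strawman_energy(flops, bytes_moved, joules_per_flop, joules_per_byte,
                    constant_power=0.0, seconds=0.0):
    """Assumed energy estimate in Joules. Unlike time, the energy costs of
    computation and data movement add up; they cannot be overlapped."""
    return (flops * joules_per_flop
            + bytes_moved * joules_per_byte
            + constant_power * seconds)


# Hypothetical kernel and machine parameters.
flops, bytes_moved = 1e9, 4e9
t = roofline_time(flops, bytes_moved, peak_flops=1e11, peak_bandwidth=1e10)
e = strawman_energy(flops, bytes_moved,
                    joules_per_flop=1e-10, joules_per_byte=1e-9,
                    constant_power=10.0, seconds=t)
print(f"time = {t:.3f} s, energy = {e:.3f} J")
```

The design point the sketch illustrates is the `max` in the time model versus the sum in the energy model: hiding memory traffic behind computation can reduce time, but in this assumed form it does not reduce energy.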
  • Item
    Cyber Games
    (Georgia Institute of Technology, 2013-02-19) Vorobeychik, Yevgeniy ; Georgia Institute of Technology. School of Computational Science and Engineering ; Sandia National Laboratories
    Over the last few years I have been working on game theoretic models of security, with a particular emphasis on issues salient in cyber security. In this talk I will give an overview of some of this work. I will first spend some time motivating the game theoretic treatment of problems relating to cyber security and describe some important modeling considerations. In the remainder, I will describe two game theoretic models (one very briefly), and associated solution techniques and analyses. The first is the "optimal attack plan interdiction" problem. In this model, we view a threat formally as a sophisticated planning agent, aiming to achieve a set of goals given some specific initial capabilities and considering a space of possible "attack actions/vectors" that may (or may not) be used towards the desired ends. The defender's goal in this setting is to "interdict" a select subset of attack vectors by optimally choosing among mitigation options, in order to prevent the attacker from being able to achieve its goals. I will describe the formal model, explain why it is challenging, and present highly scalable decomposition-based integer programming techniques that leverage extensive research into heuristic formal planning in AI. The second model addresses the problem that defense decisions are typically decentralized. I describe a model to study the impact of decentralization, and show that there is a "sweet spot": for an intermediate number of decision makers, the joint decision is nearly socially optimal, and has the additional benefit of being robust to changes in the environment. Finally, I will describe the Secure Design Competition (FIREAXE) that involved two teams of interns during the summer of 2012. The problem that the teams were tasked with was to design a highly stylized version of an electronic voting system. The catch was that after the design phase, each team would attempt to "attack" the other's design. I will describe some salient aspects of the specification, as well as the outcome of this competition and lessons that we (the designers and the students) learned in the process.
  • Item
    Efficient High-Order Discontinuous Galerkin Methods for Fluid Flow Simulations
    (Georgia Institute of Technology, 2010-02-22) Shahbazi, Khosro ; Brown University. Division of Applied Mathematics
  • Item
    Load-Balanced Bonded Force Calculations on Anton
    (Georgia Institute of Technology, 2010-03-15) Franchetti, Franz ; Carnegie-Mellon University. Dept. of Electrical and Computer Engineering
    Spiral is a program and hardware design generation system for linear transforms such as the discrete Fourier transform, discrete cosine transforms, filters, and others. We are currently extending Spiral beyond its original problem domain, using coding algorithms (Viterbi decoding and JPEG 2000 encoding) and image formation (synthetic aperture radar, SAR) as examples. For a user-selected problem specification, Spiral autonomously generates different algorithms, represented in a declarative form as mathematical formulas, and their implementations to find the best match to the given target platform. Besides the search, Spiral performs deterministic optimizations on the formula level, effectively restructuring the code in ways that are impractical at the code or design level. Spiral generates specialized single-size implementations or adaptive general-size autotuning libraries, and utilizes special instructions and multiple processor cores. The implementations generated by Spiral rival the performance of expertly hand-tuned libraries. In this talk, we give a short overview of Spiral. We explain how Spiral generates efficient programs for parallel platforms including vector architectures, shared and distributed memory platforms, and GPUs, as well as hardware designs (Verilog) and automatically partitioned software/hardware implementations. We give an overview of how Spiral targets the Cell BE and PowerXCell 8i, the BlueGene/P PPC450d processors, as well as Intel's upcoming Larrabee GPU and AVX vector instruction set. Like all optimizations in Spiral, parallelization and partitioning are performed on a high abstraction level of algorithm representation, using rewriting systems.