Computational Science and Engineering Seminar Series

The Exascale: Why and How (Georgia Institute of Technology, 2011-02-11)
Keyes, David; King Abdullah University of Science and Technology; Columbia University; Georgia Institute of Technology. School of Computational Science and Engineering

Sustained floating-point computation rates on real applications, as tracked by the ACM Gordon Bell Prize, increased by three orders of magnitude from 1988 (1 Gigaflop/s) to 1998 (1 Teraflop/s), and by another three orders of magnitude to 2008 (1 Petaflop/s). Computer engineering provided only a couple of orders of magnitude of improvement for individual cores over that period; the remaining factor came from concurrency, which is approaching one millionfold. Algorithmic improvements meanwhile contributed to making each flop more valuable scientifically. As the semiconductor industry now slips relative to its own roadmap for silicon-based logic and memory, concurrency, especially on-chip many-core concurrency and GPGPU SIMD-type concurrency, will play an increasing role in the next few orders of magnitude, to arrive at the ambitious target of 1 Exaflop/s, extrapolated for 2018. An important question is whether today's best algorithms are efficiently hosted on such hardware and how much co-design of algorithms and architecture will be required. From the applications perspective, we illustrate eight reasons why today's computational scientists have an insatiable appetite for such performance: resolution, fidelity, dimension, artificial boundaries, parameter inversion, optimal control, uncertainty quantification, and the statistics of ensembles. The paths to the exascale summit are debated, but all are narrow and treacherous, constrained by fundamental laws of physics, cost, power consumption, programmability, and reliability. Drawing on recent reports, workshops, vendor projections, and experiences with scientific codes on contemporary platforms, we propose roles for today's researchers in one of the great global scientific quests of the next decade.

Mining Billion-Node Graphs: Patterns, Generators, and Tools (Georgia Institute of Technology, 2011-04-08)
Faloutsos, Christos; Carnegie Mellon University. School of Computer Science

What do graphs look like? How do they evolve over time? How do we handle a graph with a billion nodes? We present a comprehensive list of static and temporal laws, along with some recent observations on real graphs (such as "eigenSpokes"). For generators, we describe some recent ones that naturally match all of the known properties of real graphs. Finally, for tools, we present "OddBall" for discovering anomalies and patterns, as well as an overview of the PEGASUS system, which is designed for handling billion-node graphs and runs on top of Hadoop.

Discovery of Mechanisms from Mathematical Modeling of DNA Microarray Data: Computational Prediction and Experimental Verification (Georgia Institute of Technology, 2010-02-16)
Alter, Orly; University of Texas at Austin. Institute for Cellular and Molecular Biology; University of Texas at Austin. Dept. of Biomedical Engineering

Future discovery and control in biology and medicine will come from the mathematical modeling of large-scale molecular biological data, such as DNA microarray data, just as Kepler discovered the laws of planetary motion by using mathematics to describe trends in astronomical data [1]. In this talk, I will demonstrate that mathematical modeling of DNA microarray data can be used to correctly predict previously unknown mechanisms that govern the activities of DNA and RNA. First, I will describe the computational prediction of a mechanism of regulation, achieved by developing generalizations of the matrix and tensor computations that underlie theoretical physics and using them to uncover a genome-wide pattern of correlation between DNA replication initiation and RNA expression during the cell cycle [2,3]. Second, I will describe the recent experimental verification of this computational prediction, by analyzing global expression in synchronized cultures of yeast under conditions that prevent DNA replication initiation without delaying cell cycle progression [4]. Third, I will describe the use of the singular value decomposition to uncover "asymmetric Hermite functions," a generalization of the eigenfunctions of the quantum harmonic oscillator, in genome-wide mRNA length distribution data [5]. These patterns might be explained by a previously undiscovered asymmetry in RNA gel electrophoresis band broadening, and hint at two competing evolutionary forces that determine the lengths of gene transcripts. Finally, I will describe ongoing work in the development of tensor algebra algorithms (as well as visual correlation tools), the integrative and comparative modeling of DNA microarray data (as well as rRNA sequence data), and the discovery of mechanisms that regulate cell division, cancer, and evolution.

1. Alter, PNAS 103, 16063 (2006).
2. Alter & Golub, PNAS 101, 16577 (2004).
3. Omberg, Golub & Alter, PNAS 104, 18371 (2007).
4. Omberg, Meyerson, Kobayashi, Drury, Diffley & Alter, Nature MSB 5, 312 (2009).
5. Alter & Golub, PNAS 103, 11828 (2006).
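The basic matrix computation behind this talk can be illustrated in a few lines: the SVD of a genes-by-arrays expression matrix ranks the dominant patterns of variation across arrays ("eigengenes" in Alter's terminology). The synthetic data, matrix sizes, and noise level below are hypothetical stand-ins for real microarray data, used only to show the decomposition itself.

```python
# Sketch: SVD of a genes-by-arrays matrix recovers dominant expression
# patterns. All data here is synthetic; sizes and noise are illustrative.
import numpy as np

rng = np.random.default_rng(1)
n_genes, n_arrays = 500, 12
t = np.linspace(0, 2 * np.pi, n_arrays)      # pseudo cell-cycle time points

# Each gene is a random mix of two oscillatory patterns, plus noise.
patterns = np.vstack([np.cos(t), np.sin(t)])           # 2 x n_arrays
loadings = rng.standard_normal((n_genes, 2))           # n_genes x 2
X = loadings @ patterns + 0.1 * rng.standard_normal((n_genes, n_arrays))

# Rows of Vt are patterns across arrays; s ranks their significance.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
frac = s**2 / np.sum(s**2)                   # fraction of variance per mode
print(frac[:3])                              # first two modes dominate
```

Because the synthetic data was built from two oscillatory patterns, the first two singular modes capture nearly all of the variance; real data is messier, which is what makes the modeling in the talk nontrivial.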

Automating Topology Aware Task Mapping on Large Supercomputers (Georgia Institute of Technology, 2010-03-30)
Bhatele, Abhinav S.; University of Illinois at Urbana-Champaign. Parallel Programming Laboratory

Parallel computing is entering the era of petascale machines. This era brings enormous computing power and new challenges in harnessing that power efficiently. Machines with hundreds of thousands of processors already exist, connected by complex interconnect topologies. Network contention is becoming an increasingly important factor affecting overall performance. The farther messages travel on the network, the greater the chance of resource sharing between messages and hence of contention. Recent studies on IBM Blue Gene and Cray XT machines have shown that message latencies can be severely affected under contention. Mapping communicating tasks onto nearby processors can minimize contention and lead to better application performance. In this talk, I will propose algorithms and techniques for the automatic mapping of parallel applications, to relieve application developers of this burden. I will first demonstrate the effect of contention on message latencies and use these studies to guide the design of mapping algorithms. I will introduce the hop-bytes metric for the evaluation of mapping algorithms and suggest that it is a better metric than the previously used maximum-dilation metric. I will then discuss in some detail the mapping framework, which comprises topology-aware mapping algorithms for parallel applications with regular and irregular communication patterns.
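The hop-bytes metric mentioned above weights each message by the network distance it travels. A minimal sketch, using an invented four-task ring communication pattern mapped onto a hypothetical 2x2 mesh with Manhattan-distance hops (the actual machines in the talk have torus topologies and far larger scales):

```python
# Sketch of the hop-bytes metric: sum over messages of bytes * hops.
# The communication graph, message sizes, and mesh are illustrative.

def hops(a, b):
    """Manhattan distance between two processors on a 2D mesh."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def hop_bytes(messages, mapping):
    """messages: list of (src_task, dst_task, num_bytes);
    mapping: dict task -> (x, y) processor coordinate."""
    return sum(nbytes * hops(mapping[s], mapping[d])
               for s, d, nbytes in messages)

# A 4-task ring, each message 100 bytes, on a 2x2 mesh.
ring = [(0, 1, 100), (1, 2, 100), (2, 3, 100), (3, 0, 100)]
good = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}  # neighbors adjacent
bad  = {0: (0, 0), 1: (1, 1), 2: (0, 1), 3: (1, 0)}  # diagonal neighbors

print(hop_bytes(ring, good))  # 400: every message travels exactly 1 hop
print(hop_bytes(ring, bad))   # 600: two messages travel 2 hops each
```

A mapping algorithm of the kind described in the talk searches for placements that minimize this quantity, since lower hop-bytes means less traffic shares network links.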

Open challenges in shape and animation processing (Georgia Institute of Technology, 2009-08-28)
Rossignac, Jarek; Georgia Institute of Technology. School of Interactive Computing

Jarek Rossignac (IC, http://www.gvu.gatech.edu/~jarek/) will present an overview of his recent research activities (with collaborators and students) and open challenges in shape and animation processing. These include:

- SOT: Compact representation of tetrahedral meshes
- J-splines: C^4 subdivision curves, surfaces, and animation
- SAM: Steady interpolating affine motion
- OCTOR: Exceptions in steady patterns
- Pearling: Real-time segmentation of tubular structures in images and 3D medical scans
- Surgem: Heart surgery planning and optimization based on blood flow simulation
- APL: Aquatic Propulsion Lab, tools for designing and simulating swimming strategies
- Ball map: Tangent-ball correspondence and compatibility between pairs of shapes
- Ball-morph: Interpolation and applications to entertainment and medical surface reconstruction

Sequences of Problems, Matrices, and Solutions (Georgia Institute of Technology, 2010-11-12)
De Sturler, Eric; Virginia Polytechnic Institute and State University. Dept. of Mathematics

In a wide range of applications, we deal with long sequences of slowly changing matrices or large collections of related matrices, and with the corresponding linear algebra problems. Such applications range from the optimal design of structures, to acoustics and other parameterized systems, to inverse and parameter estimation problems in tomography and systems biology, to parameterization problems in computer graphics, and to the electronic structure of condensed matter. In many cases, we can reduce the total runtime significantly by taking into account how the problem changes and recycling judiciously selected results from previous computations. In this presentation, I will focus on solving linear systems, which is often the basis of other algorithms. I will introduce the basics of linear solvers and discuss relevant theory for the fast solution of sequences or collections of linear systems. I will demonstrate the results on several applications and discuss future research directions.
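The simplest form of the reuse described above can be shown with an iterative solver that warm-starts each system in a slowly changing sequence from the previous solution. This is only a sketch of the idea of recycling: the speaker's work retains judiciously selected Krylov subspaces as well, which this toy example does not attempt. The matrix sizes, perturbation, and tolerances are illustrative.

```python
# Sketch: warm-starting conjugate gradients across a sequence of slowly
# changing SPD systems. Sizes and perturbations are illustrative only.
import numpy as np

def cg(A, b, x0, tol=1e-8, maxit=500):
    """Plain conjugate gradients; returns (solution, iteration count)."""
    x = x0.copy()
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    for k in range(maxit):
        if np.sqrt(rs) < tol:
            return x, k
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x, maxit

rng = np.random.default_rng(0)
n = 200
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)          # well-conditioned SPD base matrix
b = rng.standard_normal(n)

x, _ = cg(A, b, np.zeros(n))         # solve the first system from scratch
for step in range(3):                # slowly changing sequence
    A = A + 0.01 * np.eye(n)         # small perturbation per step
    x_cold, it_cold = cg(A, b, np.zeros(n))
    x_warm, it_warm = cg(A, b, x)    # reuse the previous solution
    x = x_warm
    print(step, it_cold, it_warm)    # warm start needs fewer iterations
```

Because each system differs from the last by a small shift, the previous solution leaves only a tiny initial residual, so the warm-started solve converges in a fraction of the iterations; true subspace recycling improves on this further.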

How much (execution) time and energy does my algorithm cost? (Georgia Institute of Technology, 2012-08-24)
Vuduc, Richard; Georgia Institute of Technology. School of Computational Science and Engineering; Georgia Institute of Technology. College of Computing

When designing an algorithm or performance-tuning code, is time-efficiency (e.g., operations per second) the same as energy-efficiency (e.g., operations per Joule)? Why or why not? To answer these questions, we posit a simple strawman model of the energy required to execute an algorithm. Our model is the energy-based analogue of the time-based "roofline" model of Williams, Waterman, and Patterson (Comm. ACM, 2009). What do these models imply for algorithm design? What might computer architects tell algorithm designers to help them better understand whether and how algorithm design should change in an energy-constrained computing environment?
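One way to see why time- and energy-efficiency can diverge is a toy cost model of the kind the abstract hints at: time is bounded by peak compute or peak bandwidth (the roofline), while energy adds per-flop and per-byte costs plus constant power drawn for the whole runtime. Every machine constant below is an invented placeholder, not a figure from the talk or the speaker's model.

```python
# Toy roofline-style time model and an energy analogue.
# All constants are illustrative placeholders, not measurements.

PEAK_FLOPS = 1e12      # flop/s
PEAK_BW    = 1e11      # bytes/s
E_FLOP     = 1e-10     # joules per flop
E_BYTE     = 1e-9      # joules per byte of memory traffic
P_CONST    = 50.0      # watts of constant (idle/leakage) power

def time_roofline(W, Q):
    """Runtime bounded by compute (W flops) or memory traffic (Q bytes)."""
    return max(W / PEAK_FLOPS, Q / PEAK_BW)

def energy(W, Q):
    """Per-flop + per-byte energy, plus constant power over the runtime."""
    return W * E_FLOP + Q * E_BYTE + P_CONST * time_roofline(W, Q)

W = 1e9                              # flops performed by the algorithm
for I in (1, 10, 100):               # arithmetic intensity (flops/byte)
    Q = W / I
    t = time_roofline(W, Q)
    print(I, W / t, W / energy(W, Q))   # flop/s versus flop/joule
```

In this toy model, flop/s saturates once intensity passes the machine's time balance point (PEAK_FLOPS / PEAK_BW), but flop/joule keeps improving as long as per-byte energy dominates, so the intensity at which an algorithm becomes "efficient" differs between the two metrics.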

Cyber Games (Georgia Institute of Technology, 2013-02-19)
Vorobeychik, Yevgeniy; Georgia Institute of Technology. School of Computational Science and Engineering; Sandia National Laboratories

Over the last few years I have been working on game-theoretic models of security, with a particular emphasis on issues salient in cyber security. In this talk I will give an overview of some of this work. I will first spend some time motivating the game-theoretic treatment of problems relating to cyber security and describe some important modeling considerations. In the remainder, I will describe two game-theoretic models (one very briefly), together with associated solution techniques and analyses. The first is the "optimal attack plan interdiction" problem. In this model, we view a threat formally as a sophisticated planning agent aiming to achieve a set of goals, given some specific initial capabilities and a space of possible attack actions/vectors that may (or may not) be used towards the desired ends. The defender's goal in this setting is to "interdict" a select subset of attack vectors by optimally choosing among mitigation options, in order to prevent the attacker from achieving its goals. I will describe the formal model, explain why it is challenging, and present highly scalable decomposition-based integer programming techniques that leverage extensive research into heuristic formal planning in AI. The second model addresses the problem that defense decisions are typically decentralized. I describe a model for studying the impact of decentralization, and show that there is a "sweet spot": for an intermediate number of decision makers, the joint decision is nearly socially optimal and has the additional benefit of being robust to changes in the environment. Finally, I will describe the Secure Design Competition (FIREAXE) that involved two teams of interns during the summer of 2012. The teams were tasked with designing a highly stylized version of an electronic voting system. The catch was that after the design phase, each team would attempt to "attack" the other's design. I will describe some salient aspects of the specification, as well as the outcome of this competition and the lessons that we (the designers and the students) learned in the process.

Efficient High-Order Discontinuous Galerkin Methods for Fluid Flow Simulations (Georgia Institute of Technology, 2010-02-22)
Shahbazi, Khosro; Brown University. Division of Applied Mathematics

Load-Balanced Bonded Force Calculations on Anton (Georgia Institute of Technology, 2010-03-15)
Franchetti, Franz; Carnegie Mellon University. Dept. of Electrical and Computer Engineering

Spiral (www.spiral.net) is a program and hardware design generation system for linear transforms such as the discrete Fourier transform, discrete cosine transforms, filters, and others. We are currently extending Spiral beyond its original problem domain, using coding algorithms (Viterbi decoding and JPEG 2000 encoding) and image formation (synthetic aperture radar, SAR) as examples. For a user-selected problem specification, Spiral autonomously generates different algorithms, represented in a declarative form as mathematical formulas, and their implementations, to find the best match to the given target platform. Besides the search, Spiral performs deterministic optimizations at the formula level, effectively restructuring the code in ways impractical at the code or design level. Spiral generates specialized single-size implementations or adaptive general-size autotuning libraries, and it utilizes special instructions and multiple processor cores. The implementations generated by Spiral rival the performance of expertly hand-tuned libraries. In this talk, we give a short overview of Spiral. We explain how Spiral generates efficient programs for parallel platforms, including vector architectures, shared and distributed memory platforms, and GPUs, as well as hardware designs (Verilog) and automatically partitioned software/hardware implementations. We describe how Spiral targets the Cell BE and PowerXCell 8i, the BlueGene/P PPC450d processors, and Intel's upcoming Larrabee GPU and AVX vector instruction set. As with all optimizations in Spiral, parallelization and partitioning are performed at a high abstraction level of algorithm representation, using rewriting systems.