Organizational Unit:
School of Computational Science and Engineering


Publication Search Results

  • Item
    Efficient parallel algorithms for error correction and transcriptome assembly of biological sequences
    (Georgia Institute of Technology, 2018-05-29) Sachdeva, Vipin
    Next-generation sequencing technologies have led to a big data age in biology. Since the sequencing of the human genome, the primary bottleneck has steadily moved from collection to storage and analysis of the data. The primary contributions of this dissertation are the design and implementation of novel parallel algorithms for two important problems in bioinformatics: error correction and transcriptome assembly. For error correction, we focused on a k-mer-spectrum-based error-correction application called Reptile. We designed a novel distributed memory algorithm that divides the k-mers and tiles among the processing ranks. This allows hardware with any per-node memory size to be employed for error correction using Reptile's algorithm, irrespective of the size of the dataset. Our implementation achieved highly scalable results for the E. coli and Drosophila datasets as well as human datasets consisting of 1.55 billion reads. Besides an algorithm that distributes k-mers and tiles between ranks, we implemented numerous heuristics for adjusting the algorithm to the traits of the hardware. We further extended our parallel algorithm by pre-generating tiles and using collective messages to reduce the number of point-to-point messages during error correction. Further extensions of this work have focused on creating a library for distributed k-mer processing, which has applications to problems in metagenomics. For transcriptome assembly, we implemented a hybrid MPI-OpenMP approach for Chrysalis, which is part of the Trinity pipeline. Chrysalis clusters minimally overlapping contigs obtained from the prior module in Trinity, called Inchworm. With this parallelization, we were able to reduce the runtime of the Chrysalis step of the Trinity workflow from over 50 hours to less than 5 hours for the sugarbeet dataset. We also employed this implementation to complete the transcriptome assembly of a 1.5-billion-read dataset pooled from different bread wheat cultivars. Furthermore, we implemented a MapReduce-based approach to clustering k-mers, which has application to the parallelization of the Inchworm module of Trinity. This implementation is a significant step towards making de novo transcriptome assembly feasible for ever larger transcriptome datasets.
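
    To make the distributed k-mer ownership idea above concrete, below is a minimal Python sketch of hash-based partitioning of a k-mer spectrum across processing ranks. It is an illustration only, not Reptile's code or the dissertation's implementation; the rank count, toy reads, and function names are invented for the example, and in a real MPI program the owner computation would choose the destination rank of a message.

    ```python
    from collections import Counter

    def kmers(read, k):
        """Yield all k-mers of a read."""
        for i in range(len(read) - k + 1):
            yield read[i:i + k]

    def partition_kmer_counts(reads, k, num_ranks):
        """One Counter per rank; a k-mer is owned by rank hash(k-mer) % num_ranks."""
        local_counts = [Counter() for _ in range(num_ranks)]
        for read in reads:
            for km in kmers(read, k):
                owner = hash(km) % num_ranks   # in MPI, this would choose the destination rank
                local_counts[owner][km] += 1
        return local_counts

    if __name__ == "__main__":
        reads = ["ACGTACGT", "CGTACGTA"]       # toy reads
        for rank, counts in enumerate(partition_kmer_counts(reads, k=4, num_ranks=2)):
            print("rank", rank, dict(counts))
    ```
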
  • Item
    A novel method for cluster analysis of RNA structural data
    (Georgia Institute of Technology, 2018-05-21) Rogers, Emily
    Functional RNA is known to contribute to a host of important biological pathways, with new discoveries being made daily. Because function is dependent on structure, computational tools that predict secondary structure of RNA are crucial to researchers. By far the most popular method is to predict the minimum free energy structure as the native structure. However, well-known limitations of this method have led the computational RNA community to move on to Boltzmann sampling. This method predicts an ensemble of structures sampled from the Boltzmann distribution under the Nearest Neighbor Thermodynamic Model (NNTM). Although it provides a more thorough view of the folding landscape of a sequence, the Boltzmann sampling method has the drawback of needing post-processing (i.e., data mining) in order to be meaningful. This dissertation presents a novel method of representing and clustering secondary structures of a Boltzmann sample. In addition, it demonstrates its ability to extract the meaningful structural signal of a Boltzmann sample by identifying significant commonalities and differences. Applications include two outstanding problems in the computational RNA community: investigating the ill-conditioning of thermodynamic optimization under the NNTM, and predicting a consensus structure for a set of sequences. Finally, this dissertation concludes with research performed as an intern for the Department of Defense's Defense Forensic Science Center. This work concerns analyzing the results of a DNA mixture interpretation study, highlighting the current state of forensic interpretation.
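
    As a small illustration of what clustering a Boltzmann sample can start from, the sketch below represents dot-bracket secondary structures as sets of base pairs and compares them with a Jaccard distance. This is a generic, assumed representation chosen for the example, not the dissertation's method or distance measure, and the sample structures are invented.

    ```python
    def base_pairs(dot_bracket):
        """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
        stack, pairs = [], set()
        for i, ch in enumerate(dot_bracket):
            if ch == '(':
                stack.append(i)
            elif ch == ')':
                pairs.add((stack.pop(), i))
        return pairs

    def jaccard_distance(s1, s2):
        """1 - |intersection| / |union| of two base-pair sets."""
        if not s1 and not s2:
            return 0.0
        return 1.0 - len(s1 & s2) / len(s1 | s2)

    sample = ["((..))..", "((...)).", "..((..))"]   # toy Boltzmann sample
    pair_sets = [base_pairs(s) for s in sample]
    for a in range(len(sample)):
        for b in range(a + 1, len(sample)):
            print(sample[a], sample[b], round(jaccard_distance(pair_sets[a], pair_sets[b]), 2))
    ```
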
  • Item
    Graph analysis of streaming relational data
    (Georgia Institute of Technology, 2018-04-13) Zakrzewska, Anita N.
    Graph analysis can be used to study streaming data from a variety of sources, such as social networks, financial transactions, and online communication. The analysis of streaming data poses many challenges, including dealing with the high volume of data and the speed with which it is generated. This dissertation addresses challenges that occur throughout the graph analysis process. Because many datasets are large and growing, it may be infeasible to collect and build a graph from all the data that has been generated. This work addresses the challenges created by large volumes of streaming data through new sampling techniques. The algorithms presented can sample a subgraph in a single pass over an edge stream and are therefore appropriate for streaming applications. A sampling algorithm that can produce a temporally biased subgraph is also presented. Before graph analysis techniques can be applied, a graph must first be created from the data collected. When creating dynamic graphs, it is not obvious how to de-emphasize old information, especially when edges are derived from interactions. This work evaluates several methods of aging old data to create dynamic graphs. This dissertation also contributes new techniques for dynamic community detection and analysis. A new algorithm for local community detection on dynamic graphs is presented. Because it incrementally updates results when the graph changes, the method is suitable for streaming data. The creation of dynamic graphs allows us to study community changes over time. This work addresses the topic of community analysis with a vertex-level measure of community change. Together, these contributions advance the study of streaming relational data through graph analysis.
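
    As a minimal illustration of single-pass subgraph sampling from an edge stream, the sketch below keeps a uniform reservoir of m edges; classical reservoir sampling is assumed here for concreteness and is not necessarily one of the dissertation's algorithms, and the toy stream is invented. A temporally biased variant would replace the uniform replacement rule with one that favors recent edges.

    ```python
    import random

    def reservoir_edge_sample(edge_stream, m):
        """Keep a uniform sample of m edges after a single pass over the stream."""
        reservoir = []
        for t, edge in enumerate(edge_stream):
            if t < m:
                reservoir.append(edge)
            else:
                j = random.randint(0, t)      # edge t is kept with probability m / (t + 1)
                if j < m:
                    reservoir[j] = edge
        return reservoir

    stream = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 2), (1, 3)]   # toy edge stream
    print(reservoir_edge_sample(stream, m=3))
    ```
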
  • Item
    Distributed memory building blocks for massive biological sequence analysis
    (Georgia Institute of Technology, 2018-04-03) Pan, Tony C.
    K-mer indices and de Bruijn graphs are important data structures in bioinformatics, with applications ranging from foundational tasks such as error correction, alignment, and genome assembly to knowledge discovery tasks including repeat detection and SNP identification. While advances in next-generation sequencing technologies have dramatically reduced cost and improved latency and throughput, few bioinformatics tools can efficiently process data sets at the current generation rate of 1.8 terabases every 3 days. The volume and velocity with which sequencing data is generated necessitate efficient algorithms and implementations of k-mer indices and de Bruijn graphs, two central components in bioinformatics applications. Existing applications that utilize k-mer counting and de Bruijn graphs, however, tend to provide embedded, specialized implementations. The research presented here represents efforts toward the creation of the first reusable, flexible, and extensible distributed memory parallel libraries for k-mer indexing and de Bruijn graphs. These libraries are intended to simplify the development of bioinformatics applications for distributed memory environments. For each library, our goals are to create a set of APIs that are simple to use and to provide optimized implementations based on efficient parallel algorithms. We designed algorithms that minimize communication volume and latency, and developed implementations with better cache utilization and SIMD vectorization. We developed Kmerind, a k-mer counting and indexing library based on distributed memory hash tables and distributed sorted arrays, that provides efficient insert, find, count, and erase operations. For de Bruijn graphs, we developed Bruno by leveraging Kmerind functionalities to support parallel de Bruijn graph construction, chain compaction, error removal, and graph traversal and element query. Our performance evaluations showed that Kmerind is scalable and achieves high performance. Kmerind counted the k-mers in a 120 GB data set in less than 13 seconds on 1024 cores, and indexed the k-mer positions in 17 seconds. Using the Cori supercomputer and incorporating architecture-aware optimizations as well as MPI-OpenMP hybrid computation and overlapped communication, Kmerind was able to count a 350 GB data set in 4.1 seconds using 4096 cores. Kmerind has been shown to outperform the state-of-the-art k-mer counting tools at 32 to 64 cores on a shared memory system. The Bruno library is built on Kmerind and implements efficient algorithms for construction, compaction, and error removal. It is capable of constructing, compacting, and generating unitigs for a 694 GB human read data set in 7.3 seconds on 7680 Edison cores. It is 1.4X and 3.7X faster than its state-of-the-art alternatives in shared and distributed memory environments, respectively. Error removal in a graph constructed from a 162 GB data set completed in 13.1 and 3.91 seconds with frequency filters of 2 and 4, respectively, on 16 nodes totaling 512 cores. While our target domain is bioinformatics, we approached algorithm design and implementation with the aim of broader applicability in computer science and other application domains. As a result, our chain compaction and cycle detection algorithms can feasibly be applied to general graphs, and our distributed and sequential cache-friendly hash tables as well as vectorized hash functions are generic and application neutral.
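
    For orientation, the sketch below shows the basic data structures the abstract refers to: a k-mer count table and a de Bruijn graph whose nodes are (k-1)-mers linked by observed k-mers. It is a plain serial illustration over invented toy reads, not the Kmerind or Bruno API, and the function names are made up for the example.

    ```python
    from collections import Counter, defaultdict

    def build_debruijn(reads, k):
        """Return (k-mer counts, adjacency between (k-1)-mer nodes)."""
        counts = Counter()
        adjacency = defaultdict(set)
        for read in reads:
            for i in range(len(read) - k + 1):
                kmer = read[i:i + k]
                counts[kmer] += 1
                adjacency[kmer[:-1]].add(kmer[1:])   # prefix node -> suffix node
        return counts, adjacency

    counts, adj = build_debruijn(["ACGTACG", "CGTACGA"], k=4)
    print(dict(counts))
    print({node: sorted(successors) for node, successors in adj.items()})
    ```
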
  • Item
    Numerical and streaming analyses of centrality measures on graphs
    (Georgia Institute of Technology, 2018-03-28) Nathan, Eisha
    Graph data represent information about entities (vertices) and the relationships or connections between them (edges). In real-world networks today, new data are constantly being produced, leading to the notion of dynamic graphs. When analyzing large graphs, a common problem of interest is to identify the most important vertices in a graph, which can be done using centrality metrics. This dissertation presents novel advances in the field of graph analysis by providing numerical and streaming techniques that help us better understand how to compute centrality measures. Several centrality measures are calculated by solving a linear system, but since these linear systems are large, iterative solvers are often used to approximate the solution. We relate the two research areas of numerical accuracy and data mining by understanding how the error in a solver affects the relative ranking of vertices in a graph. To calculate the centrality values of vertices in a dynamic graph, the most naive method is to recompute the scores from scratch every time the graph changes, but as the graph grows larger this recomputation becomes computationally infeasible. We present four dynamic algorithms for updating different centrality metrics in evolving networks. All dynamic algorithms are faster than their static counterparts while maintaining good quality of the centrality scores. This dissertation concludes by applying the methods discussed for the computation of centrality metrics to community detection, and we present a new algorithm for identifying local communities in a dynamic graph using personalized centrality.
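
    To ground the link between solver accuracy and vertex ranking, the sketch below computes Katz centrality by iterating x <- alpha*A*x + 1, an approximate solve of (I - alpha*A)x = 1; loosening the stopping tolerance gives a cheaper but rougher ranking. Katz centrality is used here only as a familiar example of a centrality defined by a linear system; the parameter values and toy adjacency matrix are assumptions for the example, not the dissertation's algorithms.

    ```python
    import numpy as np

    def katz(A, alpha=0.1, tol=1e-8, max_iter=1000):
        """Iteratively approximate the solution of (I - alpha*A) x = 1."""
        n = A.shape[0]
        x = np.ones(n)
        for _ in range(max_iter):
            x_new = alpha * (A @ x) + 1.0
            if np.linalg.norm(x_new - x, 1) < tol:   # looser tol: cheaper solve, rougher ranking
                return x_new
            x = x_new
        return x

    # A small undirected graph; vertices are ranked by their Katz scores.
    A = np.array([[0, 1, 1, 0],
                  [1, 0, 1, 1],
                  [1, 1, 0, 0],
                  [0, 1, 0, 0]], dtype=float)
    scores = katz(A)
    print(np.argsort(-scores))    # vertex ids in decreasing order of centrality
    ```
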
  • Item
    High performance computing algorithms for discrete optimization
    (Georgia Institute of Technology, 2017-11-03) Munguia Conejero, Lluis-Miquel M.
    This thesis concerns the application of High Performance Computing to Discrete Optimization, and the development of massively parallel algorithms designed to accelerate the solution of Mixed-Integer Programs (MIPs). We begin by presenting a portfolio of scalable parallel primal heuristics, which focus on quickly providing the end user with high-quality feasible solutions to any MIP. In some cases, we show our algorithms to be several orders of magnitude more effective than current state-of-the-art approaches. The first contribution in this category is a specialized primal heuristic for the Fixed Charge Multicommodity Network Flow problem. The presented computational experiments support the superior effectiveness of our method at finding substantially better primal solutions when compared to state-of-the-art commercial MIP solvers, even when the latter are allowed five times as much time. We further generalize the introduced notions and develop Parallel Alternating Criteria Search: a general-purpose parallel primal method designed to handle any unstructured MIP. We show how the combination of parallelism and simple large neighborhood search schemes can provide a powerful tool for generating high-quality solutions for any given problem. Our parallel method produces results that are competitive with or better than those of CPLEX, and obtains them faster, on more than 90% of the tested instances. Parallel Alternating Criteria Search becomes especially useful in the context of large instances and time-sensitive optimization problems, where traditional branch-and-bound methods may not be able to provide competitive upper bounds and attaining feasibility may be challenging. The modular nature of Parallel Alternating Criteria Search provides an excellent platform for rapidly prototyping parallel domain-specific heuristics. We demonstrate this on the Maritime Inventory Routing Problem, a complex optimization problem that combines network flows with inventory management. In the presented results, tailored versions of our parallel algorithm are able to significantly outperform domain-specific state-of-the-art heuristics and parallel MIP solvers alike. The second half of this thesis is dedicated to the introduction of PIPS-SBB, a parallel distributed-memory solver for deterministic equivalent two-stage stochastic MIPs (sMIPs). The newly introduced solver features multiple levels of nested parallelism. It is also designed with data parallelism in mind, allowing the problem data to be partitioned across multiple distributed-memory machines. We then present two branch-and-bound parallelizations extending the already parallel solver. We investigate the effects of leveraging multiple levels of parallelism and their part in improving scaling performance beyond thousands of cores. We also compare our algorithms against a distributed-memory implementation of a commercial MIP solver. The latter proves to be the best performer at small problem scales; however, the specialized nature of the methods in PIPS-SBB-based solvers allows them to be the best performers on large sMIP instances. The direct product of this thesis is a set of algorithms ready to be used in massively parallel systems to quickly find high-quality solutions to any MIP. The presented work ultimately increases our understanding of the use of parallelism in Discrete Optimization and its important part in improving the effectiveness and performance of its algorithms.
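
    The sketch below is a schematic, serial caricature of the alternating idea described above: a large neighborhood loop that alternates a feasibility-improving pass with an objective-improving pass over small random subsets of variables. The toy problem, the brute-force subproblem "solver", and all function names are invented for illustration; the thesis's Parallel Alternating Criteria Search combines such large neighborhood searches with parallelism and is not reproduced here.

    ```python
    import itertools
    import random

    def infeasibility(x, constraints):
        """Total violation of constraints given as (coeffs, rhs), meaning coeffs . x >= rhs."""
        return sum(max(0.0, rhs - sum(c * x[i] for i, c in coeffs.items()))
                   for coeffs, rhs in constraints)

    def solve_subproblem(x, free, objective, constraints, criterion):
        """Brute-force the free binary variables; a toy stand-in for a restricted MIP solve."""
        best, best_key = dict(x), None
        for bits in itertools.product([0, 1], repeat=len(free)):
            cand = dict(x)
            cand.update(zip(free, bits))
            key = (infeasibility(cand, constraints) if criterion == "feasibility"
                   else (infeasibility(cand, constraints), objective(cand)))
            if best_key is None or key < best_key:
                best, best_key = cand, key
        return best

    def alternating_criteria_search(x, objective, constraints, rounds=6, k=3):
        names = list(x)
        for r in range(rounds):
            free = random.sample(names, k)                       # the large neighborhood
            criterion = "feasibility" if r % 2 == 0 else "objective"
            x = solve_subproblem(x, free, objective, constraints, criterion)
        return x

    # Toy MIP: minimize x1 + x2 + x3 + x4 subject to x1 + x2 >= 1 and x3 + x4 >= 1, x binary.
    obj = lambda x: sum(x.values())
    cons = [({"x1": 1, "x2": 1}, 1), ({"x3": 1, "x4": 1}, 1)]
    start = {"x1": 0, "x2": 0, "x3": 0, "x4": 0}
    print(alternating_criteria_search(start, obj, cons))
    ```
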
  • Item
    Agglomerative clustering for community detection in dynamic graphs
    (Georgia Institute of Technology, 2016-05-10) Godbole, Pushkar J.
    Agglomerative clustering techniques work by recursively merging graph vertices into communities to maximize a clustering quality metric. The modularity metric, coined by Newman and Girvan, measures cluster quality based on the premise that a cluster contains collections of vertices more strongly connected internally than would occur by random chance. Various fast and efficient algorithms for community detection based on modularity maximization have been developed for static graphs. However, since many contemporary networks are not static but rather evolve over time, the static approaches are rendered inappropriate for clustering dynamic graphs. Modularity optimization in changing graphs is a relatively new field that entails the need to develop efficient algorithms for detecting and maintaining a community structure while minimizing the “size of change” and the computational effort. The objective of this work was to develop an efficient dynamic agglomerative clustering algorithm that attempts to maximize modularity while minimizing the “size of change” in the transitioning community structure. First, we briefly discuss the previous memoryless dynamic reagglomeration approach with localized vertex freeing and illustrate its performance and limitations. Then we describe the new backtracking algorithm, followed by its performance results and observations. In experimental analysis of both typical and pathological cases, we evaluate and justify various backtracking and agglomeration strategies in the context of the graph structure and incoming stream topologies. Evaluation of the algorithm on social network datasets, including the Facebook (SNAP) and PGP Giant Component networks, shows significantly improved performance over its conventional static counterpart in terms of execution time, modularity, and size of change.
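
    To make the modularity-driven merging concrete, the sketch below is a plain static greedy agglomeration: starting from singleton communities, it repeatedly merges the pair of communities with the largest positive modularity gain. It is a baseline illustration only (roughly the CNM-style greedy), not the dissertation's dynamic backtracking algorithm, and the toy edge list is invented.

    ```python
    from collections import defaultdict
    from itertools import combinations

    def greedy_modularity(edges):
        """Merge singleton communities greedily by modularity gain until no gain is positive."""
        m = len(edges)
        degree = defaultdict(int)
        for u, v in edges:
            degree[u] += 1
            degree[v] += 1
        comm = {v: v for v in degree}                    # every vertex starts in its own community
        while True:
            between = defaultdict(int)                   # edge counts between pairs of communities
            deg_sum = defaultdict(int)                   # total degree inside each community
            for u, v in edges:
                between[frozenset((comm[u], comm[v]))] += 1
            for v, d in degree.items():
                deg_sum[comm[v]] += d
            best, best_gain = None, 0.0
            for c1, c2 in combinations(set(comm.values()), 2):
                e12 = between.get(frozenset((c1, c2)), 0)
                gain = e12 / m - deg_sum[c1] * deg_sum[c2] / (2.0 * m * m)   # delta Q of merging
                if gain > best_gain:
                    best, best_gain = (c1, c2), gain
            if best is None:
                return comm
            keep, absorb = best
            for v in comm:                               # merge the second community into the first
                if comm[v] == absorb:
                    comm[v] = keep

    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]   # two triangles plus a bridge
    print(greedy_modularity(edges))
    ```
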
  • Item
    Graph analysis combining numerical, statistical, and streaming techniques
    (Georgia Institute of Technology, 2016-03-31) Fairbanks, James Paul
    Graph analysis uses graph data collected on physical, biological, or social phenomena to shed light on the underlying dynamics and behavior of the agents in that system. Many fields contribute to this topic, including graph theory, algorithms, statistics, machine learning, and linear algebra. This dissertation advances a novel framework for dynamic graph analysis that combines numerical, statistical, and streaming algorithms to provide deep understanding of evolving networks. For example, one may be interested in how the influence structure changes over time. These disparate techniques each contribute a fragment to understanding the graph; however, their combination allows us to understand dynamic behavior and graph structure. Spectral partitioning methods rely on eigenvectors for solving data analysis problems such as clustering. Eigenvectors of large sparse systems must be approximated with iterative methods. This dissertation analyzes how data analysis accuracy depends on the numerical accuracy of the eigensolver. This leads to new bounds on the residual tolerance necessary to guarantee correct partitioning. We present a novel stopping criterion for spectral partitioning guaranteed to satisfy the Cheeger inequality, along with an empirical study of performance on real-world networks such as web, social, and e-commerce networks. This work bridges the gap between numerical analysis and computational data analysis.
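
    As a small illustration of the spectral partitioning pipeline discussed above, the sketch below approximates the second eigenvector of the normalized Laplacian with power iteration (deflating the trivial eigenvector) and then takes the best sweep cut by conductance, the quantity the Cheeger inequality bounds. The fixed iteration count, toy adjacency matrix, and stopping rule are assumptions for the example; the dissertation's residual-based stopping criterion is not reproduced here.

    ```python
    import numpy as np

    def fiedler_sweep(A, iters=500):
        """Approximate the Fiedler direction by power iteration, then return the best sweep cut."""
        d = A.sum(axis=1)                                 # assumes every vertex has nonzero degree
        Dhalf = np.diag(d ** -0.5)
        M = 0.5 * (np.eye(len(d)) + Dhalf @ A @ Dhalf)    # eigenvalues shifted into [0, 1]
        v0 = np.sqrt(d / d.sum())                         # trivial top eigenvector
        x = np.random.default_rng(0).standard_normal(len(d))
        for _ in range(iters):                            # a fixed iteration count stands in for
            x -= (x @ v0) * v0                            # a residual-based stopping criterion
            x = M @ x
            x /= np.linalg.norm(x)
        x = Dhalf @ x                                     # back to the vertex embedding
        order = np.argsort(x)
        vol = d.sum()
        best_cut, best_phi = None, np.inf
        for i in range(1, len(order)):                    # sweep cut over prefixes of the ordering
            S = set(order[:i].tolist())
            cut = sum(A[u, v] for u in S for v in range(len(d)) if v not in S)
            phi = cut / min(d[order[:i]].sum(), vol - d[order[:i]].sum())
            if phi < best_phi:
                best_cut, best_phi = S, phi
        return best_cut, best_phi

    # Two triangles joined by a single edge; the best cut should separate them.
    A = np.array([[0, 1, 1, 0, 0, 0],
                  [1, 0, 1, 0, 0, 0],
                  [1, 1, 0, 1, 0, 0],
                  [0, 0, 1, 0, 1, 1],
                  [0, 0, 0, 1, 0, 1],
                  [0, 0, 0, 1, 1, 0]], dtype=float)
    print(fiedler_sweep(A))
    ```
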
  • Item
    High performance computing for irregular algorithms and applications with an emphasis on big data analytics
    (Georgia Institute of Technology, 2014-03-31) Green, Oded
    Irregular algorithms such as graph algorithms, sorting, and sparse matrix multiplication present numerous programming challenges, including scalability, load balancing, and efficient memory utilization. In this age of Big Data we face additional challenges, since the data is often streaming at a high velocity and we wish to make near-real-time decisions about real-world events. For instance, we may wish to track Twitter for the pandemic spread of a virus. Analyzing such data sets requires combining algorithmic optimizations with the utilization of massively multithreaded architectures, accelerators such as GPUs, and distributed systems. My research focuses on designing new analytics and algorithms for the continuous monitoring of dynamic social networks. Achieving high performance computing for irregular algorithms such as Social Network Analysis (SNA) is challenging, as the instruction flow is highly data dependent and requires domain expertise. The rapid changes in the underlying network necessitate understanding real-world graph properties such as the small-world property, shrinking network diameter, power-law distribution of edges, and the rate at which updates occur. These properties, with respect to a given analytic, can help design load-balancing techniques, avoid wasteful (redundant) computations, and create streaming algorithms. In the course of my research I have considered several parallel programming paradigms for a wide range of multithreaded platforms: x86, NVIDIA's CUDA, Cray XMT2, SSE-SIMD, and Plurality's HyperCore. These unique programming models require examining parallel programming at multiple levels: algorithmic design, cache efficiency, fine-grain parallelism, memory bandwidth, data management, load balancing, scheduling, control flow models, and more. This thesis deals with these issues and more.
  • Item
    Algorithm design on multicore processors for massive-data analysis
    (Georgia Institute of Technology, 2010-06-28) Agarwal, Virat
    Analyzing massive data sets and streams is computationally very challenging. Data sets in systems biology, network analysis, and security use network abstractions to construct large-scale graphs. Graph algorithms such as traversal and search are memory-intensive and typically require very little computation, with access patterns that are irregular and fine-grained. The increasing streaming data rates in various domains such as security, mining, and finance leave algorithm designers with only a handful of clock cycles (with current general-purpose computing technology) to process every incoming byte of data in-core in real time. This, along with the increasing complexity of mining patterns and other analytics, puts further pressure on an already high computational requirement. Processing streaming data in finance comes with the additional constraint of low latency, which restricts the algorithm from using common techniques, such as batching, for obtaining high throughput. The primary contributions of this dissertation are the design of novel parallel data analysis algorithms for graph traversal on large-scale graphs, pattern recognition and keyword scanning on massive streaming data, financial market data feed processing and analytics, and data transformation. These designs capture the machine-independent aspects, to guarantee portability with performance to future processors, and are paired with high-performance implementations on multicore processors that embed processor-specific optimizations. Our breadth-first search graph traversal algorithm demonstrates a capability to process massive graphs with billions of vertices and edges on commodity multicore processors at rates that are competitive with supercomputing results in the recent literature. We also present high-performance, scalable keyword scanning on streaming data using a novel automata compression algorithm, a model of computation based on small software content-addressable memories (CAMs), and a unique data layout that forces data re-use and minimizes memory traffic. Using a high-level algorithmic approach to process financial feeds, we present a solution that decodes and normalizes option market data at rates an order of magnitude higher than the current needs of the market, yet remains portable and adaptable to other feeds in this domain. In this dissertation we discuss in detail the algorithm design challenges of processing massive data and present solutions and techniques that we believe can be used and extended to solve future research problems in this domain.
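
    As a minimal illustration of the traversal pattern the abstract refers to, the sketch below is a level-synchronous breadth-first search that expands one frontier per step; parallel multicore BFS implementations build on this pattern by splitting each frontier across threads. It is a plain serial illustration over an invented toy graph, not the dissertation's optimized implementation.

    ```python
    from collections import defaultdict

    def bfs_levels(edges, source):
        """Return {vertex: level} for vertices reachable from source."""
        adj = defaultdict(list)
        for u, v in edges:
            adj[u].append(v)
            adj[v].append(u)
        level = {source: 0}
        frontier = [source]
        depth = 0
        while frontier:
            depth += 1
            next_frontier = []            # a parallel BFS would split this frontier across threads
            for u in frontier:
                for v in adj[u]:
                    if v not in level:
                        level[v] = depth
                        next_frontier.append(v)
            frontier = next_frontier
        return level

    print(bfs_levels([(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)], source=0))
    ```
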