Organizational Unit:
School of Computational Science and Engineering

Publication Search Results

Now showing 1 - 10 of 104
  • Item
    Brownian dynamics studies of DNA internal motions
    (Georgia Institute of Technology, 2018-12-04) Ma, Benson Jer-Tsung
    Earlier studies by Chow and Skolnick suggest that the internal motions of bacterial DNA may be governed by strong forces arising from being crowded into the small space of the nucleoid, and that these internal motions affect the diffusion of intranuclear protein through the dense matrix of the nucleoid. These findings open new questions regarding the biological consequences of DNA internal motions, and the ability of internal motions to influence protein diffusion in response to different environmental factors. The results of diffusion studies of DNA based on coarse-grained simulations are presented. Here, our goals are to investigate the internal motions of DNA with respect to external factors, namely the salt concentration of the solvent and intranuclear protein size, and to understand the mechanisms by which proteins diffuse through the dense matrix of bacterial DNA. First, a novel coarse-grained model of the DNA chain was developed and shown to maintain the fractal property of in vivo DNA. Next, diffusion studies using this model were performed through Brownian dynamics simulations. Our results suggest that DNA internal motions may be substantially affected by ion concentrations near physiological ion concentration ranges, with the diffusion activity increasing to a limit with increases in ion concentration. Furthermore, it was found that, for a fixed protein volume fraction, the motions of proteins in a DNA-protein system are substantially affected by the size of the proteins, with the diffusion activity increasing to a limit with decreasing protein radii, but the internal motions of DNA within the same system do not appear to change with changes to protein sizes.
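
    As a point of reference for the method named above, here is a minimal sketch of an overdamped Brownian dynamics (Ermak-McCammon) update applied to a generic bead-spring chain in Python. The chain size, force field, and all parameters are illustrative placeholders in reduced units, not the coarse-grained DNA model developed in the thesis.

        import numpy as np

        # Minimal sketch of one Brownian dynamics (Ermak-McCammon) update for a
        # coarse-grained bead-spring chain. All parameters are illustrative
        # placeholders, not values from the thesis.
        kT = 1.0          # thermal energy
        D = 1.0           # bead diffusion coefficient
        dt = 1e-4         # time step
        k_spring = 100.0  # harmonic bond stiffness
        b0 = 1.0          # equilibrium bond length

        def bond_forces(x):
            """Harmonic forces between successive beads of the chain."""
            f = np.zeros_like(x)
            d = x[1:] - x[:-1]                            # bond vectors
            r = np.linalg.norm(d, axis=1, keepdims=True)  # bond lengths
            fb = k_spring * (r - b0) * d / r              # pulls stretched bonds in
            f[:-1] += fb
            f[1:] -= fb
            return f

        def bd_step(x, rng):
            """x(t+dt) = x(t) + (D/kT) F dt + sqrt(2 D dt) * xi, xi ~ N(0, I)."""
            noise = rng.standard_normal(x.shape)
            return x + (D / kT) * bond_forces(x) * dt + np.sqrt(2.0 * D * dt) * noise

        rng = np.random.default_rng(0)
        x = np.cumsum(rng.standard_normal((64, 3)), axis=0)  # random-walk initial chain
        for _ in range(1000):
            x = bd_step(x, rng)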
  • Item
    Scalable tensor decompositions in high performance computing environments
    (Georgia Institute of Technology, 2018-07-31) Li, Jiajia
    This dissertation presents novel algorithmic techniques and data structures to help build scalable tensor decompositions on a variety of high-performance computing (HPC) platforms, including multicore CPUs, graphics co-processors (GPUs), and Intel Xeon Phi processors. A tensor may be regarded as a multiway array, generalizing matrices to more than two dimensions. When used to represent multifactor data, tensor methods can help analysts discover latent structure; this capability has found numerous applications in data modeling and mining in such domains as healthcare analytics, social network analytics, computer vision, signal processing, and neuroscience, to name a few. When attempting to implement tensor algorithms efficiently on HPC platforms, there are several obstacles: the curse of dimensionality, mode orientation, tensor transformation, irregularity, and arbitrary tensor dimensions (or orders). These challenges result in non-trivial computational and storage overheads. This dissertation considers these challenges in the specific context of two of the most popular tensor decompositions, the CANDECOMP/PARAFAC (CP) and Tucker decompositions, which are, roughly speaking, the tensor analogues to low-rank approximations in standard linear algebra. Within that context, two of the critical computational bottlenecks are the operations known as Tensor-Times-Matrix (TTM) and Matricized Tensor Times Khatri-Rao Product (MTTKRP). We consider these operations in cases when the tensor is dense or sparse. Our contributions include: 1) applying memoization to overcome the curse-of-dimensionality challenge that arises in a sequence of tensor operations; 2) addressing the challenge of mode orientation through a novel tensor format, HiCOO, and proposing a parallel scheduler that avoids locks on write-conflicted memory; 3) carrying out TTM and MTTKRP operations in place, for dense and sparse cases, to avoid tensor-matrix conversions; 4) employing different optimization and parameter-tuning techniques for CPU and GPU implementations to conquer the challenges of irregularity and arbitrary tensor orders. To validate these ideas, we have implemented them in three prototype libraries, named AdaTM, InTensLi, and ParTI!, for arbitrary-order tensors. AdaTM is a model-driven framework that generates an adaptive tensor memoization algorithm with optimal parameters for sparse CP decomposition. InTensLi produces fast single-node implementations of dense TTM for tensors of arbitrary dimension. ParTI!, short for Parallel Tensor Infrastructure, is written in C, OpenMP, MPI, and NVIDIA CUDA for sparse tensors and provides MATLAB interfaces for application-level users.
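
    For readers unfamiliar with MTTKRP, the bottleneck operation named above, here is a minimal Python sketch of the mode-1 case for a sparse third-order tensor in COO format. The dimensions and random data are placeholders, and this plain scatter-accumulate is the baseline that formats like HiCOO are designed to beat.

        import numpy as np

        # Mode-1 MTTKRP for a sparse third-order tensor in COO form:
        # M[i, :] = sum over nonzeros (i, j, k) of X[i, j, k] * (B[j, :] * C[k, :]).
        # Dimensions, rank, and the random tensor are illustrative only.
        I, J, K, R = 50, 40, 30, 8
        nnz = 500
        rng = np.random.default_rng(0)
        ii = rng.integers(0, I, nnz)     # COO coordinates, one triple per nonzero
        jj = rng.integers(0, J, nnz)
        kk = rng.integers(0, K, nnz)
        vals = rng.standard_normal(nnz)  # nonzero values
        B = rng.standard_normal((J, R))  # CP factor matrices
        C = rng.standard_normal((K, R))

        M = np.zeros((I, R))
        # Unbuffered scatter-accumulate of one rank-R row per nonzero.
        np.add.at(M, ii, vals[:, None] * B[jj] * C[kk])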
  • Item
    Tackling chronic diseases via computational phenotyping: Algorithms, tools and applications
    (Georgia Institute of Technology, 2018-07-31) Chen, Robert
    With the recent tsunami of medical data from electronic health records (EHRs), there has been a rise in interest in leveraging such data to improve the efficiency of healthcare delivery and improve clinical outcomes. A large part of medical data science involves computational phenotyping, which leverages data-driven methods to subtype and characterize patient conditions from heterogeneous EHR data. While many applications have used supervised phenotyping, unsupervised phenotyping will become increasingly important in future precision medicine initiatives. A typical healthcare analytics workflow consists of phenotype discovery from EHR data, followed by predictive modeling that may leverage such phenotypes, followed by model deployment via avenues such as FHIR. To address unmet clinical needs, we have developed and demonstrated algorithms, tools, and applications along each step of this process.
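
    As a hedged illustration of what unsupervised phenotyping can look like in practice (not the specific algorithms contributed by the thesis), the sketch below factors a nonnegative patient-by-feature count matrix with NMF so that each component can be read as a candidate phenotype; the data are synthetic placeholders.

        import numpy as np
        from sklearn.decomposition import NMF

        # Factor a nonnegative patient-by-feature count matrix into patient
        # loadings (W) and candidate phenotypes (H). Synthetic placeholder data.
        rng = np.random.default_rng(0)
        counts = rng.poisson(0.3, size=(200, 50)).astype(float)  # patients x features

        model = NMF(n_components=5, init="nndsvda", max_iter=500)
        W = model.fit_transform(counts)  # per-patient phenotype memberships
        H = model.components_            # per-phenotype feature weights

        # Inspect each candidate phenotype via its most heavily weighted features.
        top = np.argsort(H, axis=1)[:, ::-1][:, :5]
        print("top features per candidate phenotype:\n", top)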
  • Item
    Cost benefit analysis of adding technologies to commercial aircraft to increase the survivability against surface to air threats
    (Georgia Institute of Technology, 2018-07-27) Patterson, Anthony
    Flying internationally is an integral part of people's everyday lives. Most United States airlines fly internationally on a daily basis. The world continues to become a more dangerous place, due to improvements in technology and the willingness of some nations to sell older technology to rebel groups. In the military realm, there have been countermeasures to combat surface-to-air threats and thus increase the survivability of military aircraft. Survivability is defined as the ability to remain mission capable after a single engagement. Existing commercial aircraft currently do not have any countermeasure systems or missile warning systems integrated into their onboard systems. A better understanding of the interaction between countermeasure systems and commercial aircraft will help bring additional knowledge to support a cost benefit analysis. The scope of this research is to perform a cost benefit analysis of these technologies that are currently available on military aircraft, and to study adding these same technologies to commercial aircraft. The research will include a cost benefit analysis along with a size, weight, and power analysis. Additionally, a simulation will be included that analyzes the success rates of different countermeasures against different surface-to-air threats, in hopes of bridging the gap between a cost benefit analysis and a survivability simulation. The research will explore whether adding countermeasure systems to commercial aircraft is technically feasible and economically viable.
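
    The following is a hypothetical sketch of the kind of Monte Carlo engagement simulation the abstract describes, comparing survivability with and without countermeasures; every probability in it is an invented placeholder, not a figure from the study.

        import numpy as np

        # Hypothetical engagement model: a missile warning system may detect the
        # launch, a countermeasure may then defeat the missile, and an undefeated
        # missile hits with some probability. All probabilities are invented.
        rng = np.random.default_rng(0)
        N = 100_000  # simulated engagements

        def survival_rate(p_detect, p_decoy, p_kill):
            warned = rng.random(N) < p_detect             # warning system cues crew
            decoyed = warned & (rng.random(N) < p_decoy)  # countermeasure succeeds
            hit = ~decoyed & (rng.random(N) < p_kill)
            return 1.0 - hit.mean()

        baseline = survival_rate(p_detect=0.0, p_decoy=0.0, p_kill=0.7)
        equipped = survival_rate(p_detect=0.9, p_decoy=0.8, p_kill=0.7)
        print(f"survivability: {baseline:.3f} baseline, {equipped:.3f} equipped")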
  • Item
    Learning over functions, distributions and dynamics via stochastic optimization
    (Georgia Institute of Technology, 2018-07-27) Dai, Bo
    Machine learning has recently witnessed revolutionary success in a wide spectrum of domains. The learning objectives, model representation, and learning algorithms are important components of machine learning methods. To construct successful machine learning methods that naturally fit different problems with different targets and inputs, one should consider these three components together in a principled way. This dissertation aims to develop a unified learning framework for this purpose. The heart of the framework is optimization with integral operators in infinite-dimensional spaces. This integral-operator view provides an abstract tool for considering the three components together across a wide range of machine learning tasks, and it leads to efficient algorithms equipped with flexible representations that achieve better approximation ability, scalability, and statistical properties. We investigate several motivating machine learning problems, i.e., kernel methods, Bayesian inference, invariance learning, and policy evaluation and policy optimization in reinforcement learning, as special cases of the proposed framework under different instantiations of the integral operator. These instantiations result in learning problems whose inputs are functions, distributions, and dynamics. The corresponding algorithms are derived to handle the particular integral operators via efficient and provable stochastic approximation that exploits the structural properties of the operators. The proposed framework and the derived algorithms are deeply rooted in functional analysis, stochastic optimization, nonparametric methods, and Monte Carlo approximation, and they contribute to several subfields of the machine learning community, including kernel methods, Bayesian inference, and reinforcement learning. We believe the proposed framework is a valuable tool for developing machine learning methods in a principled way and can potentially be applied to many other scenarios.
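
    As one concrete, generic instantiation of stochastic optimization for kernel methods (one of the problem classes listed above, though not the thesis's own algorithm), the sketch below runs stochastic gradient descent on ridge regression with random Fourier features approximating an RBF kernel.

        import numpy as np

        # Streaming SGD for ridge regression with random Fourier features, a
        # standard stochastic approximation for kernel methods. Everything here
        # (dimensions, target, step size) is a generic placeholder.
        rng = np.random.default_rng(0)
        d, D, lam, lr = 5, 256, 1e-3, 0.1

        W = rng.standard_normal((D, d))   # random frequencies for an RBF kernel
        b = rng.uniform(0, 2 * np.pi, D)  # random phases

        def features(x):
            return np.sqrt(2.0 / D) * np.cos(W @ x + b)

        theta = np.zeros(D)
        for t in range(5000):             # stream of (x, y) pairs
            x = rng.standard_normal(d)
            y = np.sin(x.sum())           # toy regression target
            phi = features(x)
            grad = (phi @ theta - y) * phi + lam * theta
            theta -= lr / np.sqrt(t + 1) * grad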
  • Item
    Efficient parallel algorithms for error correction and transcriptome assembly of biological sequences
    (Georgia Institute of Technology, 2018-05-29) Sachdeva, Vipin
    Next-generation sequencing technologies have led to a big data age in biology. Since the sequencing of the human genome, the primary bottleneck has steadily moved from collection to storage and analysis of the data. The primary contributions of this dissertation are the design and implementation of novel parallel algorithms for two important problems in bioinformatics: error correction and transcriptome assembly. For error correction, we focused on a k-mer-spectrum-based error-correction application called Reptile. We designed a novel distributed-memory algorithm that divides the k-mers and tiles among the processing ranks. This allows hardware with any memory size per node to be employed for error correction using Reptile's algorithm, irrespective of the size of the dataset. Our implementation achieved highly scalable results for E. coli and Drosophila datasets as well as a human dataset consisting of 1.55 billion reads. Besides the algorithm that distributes k-mers and tiles between ranks, we implemented numerous heuristics that are useful for adjusting the algorithm to the traits of the hardware. We further extended the parallel algorithm by pre-generating tiles and using collective messages to reduce the number of point-to-point messages for error correction. Further extensions of this work have focused on creating a library for distributed k-mer processing, which has applications to problems in metagenomics. For transcriptome assembly, we implemented a hybrid MPI-OpenMP approach for Chrysalis, which is part of the Trinity pipeline. Chrysalis clusters minimally overlapping contigs obtained from the prior module in Trinity, called Inchworm. With this parallelization, we were able to reduce the runtime of the Chrysalis step of the Trinity workflow from over 50 hours to less than 5 hours for the sugarbeet dataset. We also employed this implementation to assemble the transcriptome of a 1.5-billion-read dataset pooled from different bread wheat cultivars. Furthermore, we implemented a MapReduce-based approach to clustering k-mers, which has application to the parallelization of the Inchworm module of Trinity. This implementation is a significant step towards making de novo transcriptome assembly feasible for ever bigger transcriptome datasets.
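
    The core idea behind the distributed k-mer partitioning described above can be sketched in a few lines: hash each k-mer to an owning rank so that every rank holds a disjoint slice of the k-mer spectrum. The serial Python below stands in for an MPI implementation, with toy reads and an illustrative rank count.

        from collections import Counter

        # Each k-mer is assigned an owning rank by hashing, so ranks hold
        # disjoint slices of the k-mer spectrum. Serial stand-in for MPI; in the
        # real setting the buckets would be exchanged with collective messages.
        K, NRANKS = 5, 4
        reads = ["ACGTACGTGACG", "TTGACGTACGTT"]  # toy reads

        buckets = [Counter() for _ in range(NRANKS)]
        for read in reads:
            for i in range(len(read) - K + 1):
                kmer = read[i:i + K]
                owner = hash(kmer) % NRANKS  # destination rank for this k-mer
                buckets[owner][kmer] += 1

        for rank, counts in enumerate(buckets):
            print(rank, dict(counts))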
  • Item
    Doctor AI: Interpretable deep learning for modeling electronic health records
    (Georgia Institute of Technology, 2018-05-23) Choi, Edward
    Deep learning has recently shown performance superior to traditional statistical methods in complex domains such as computer vision, audio processing, and natural language processing. Naturally, deep learning techniques, combined with the large electronic health record (EHR) datasets generated by healthcare organizations, have the potential to bring dramatic changes to the healthcare industry. However, typical deep learning models can be seen as highly expressive black boxes, making them difficult to adopt in real-world healthcare applications due to a lack of interpretability. For deep learning methods to be readily adopted in real-world clinical practice, they must be interpretable without sacrificing prediction accuracy. In this thesis, we propose interpretable and accurate deep learning methods for modeling EHR data, specifically focusing on longitudinal EHR data. We begin with a direct application of a well-known deep learning algorithm, recurrent neural networks (RNNs), to capture the temporal nature of longitudinal EHR data. Then, building on the initial approach, we develop interpretable deep learning models by focusing on three aspects of computational healthcare: efficient representation learning of medical concepts, code-level interpretation for sequence predictions, and incorporating domain knowledge into the model. Another important aspect we address in this thesis is developing a framework for effectively utilizing multiple data sources (e.g., diagnoses, medications, procedures), which can be extended in the future to incorporate wider data modalities such as lab values and clinical notes.
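
    As a minimal sketch of the starting point the abstract describes (an RNN over longitudinal EHR data, not the interpretable models contributed later), the PyTorch snippet below runs a GRU over a patient's visit sequence, where each visit is a multi-hot vector of medical codes, and predicts the codes of the next visit; all sizes and data are placeholders.

        import torch
        import torch.nn as nn

        # A GRU over a patient's visit sequence: each visit is a multi-hot vector
        # of medical codes, and the model predicts the codes of the next visit.
        # Vocabulary size, layer widths, and the random batch are placeholders.
        n_codes, emb, hidden = 1000, 128, 256

        class VisitRNN(nn.Module):
            def __init__(self):
                super().__init__()
                self.embed = nn.Linear(n_codes, emb)   # multi-hot -> dense visit vector
                self.rnn = nn.GRU(emb, hidden, batch_first=True)
                self.out = nn.Linear(hidden, n_codes)  # next-visit code logits

            def forward(self, visits):                 # (batch, n_visits, n_codes)
                h, _ = self.rnn(torch.tanh(self.embed(visits)))
                return self.out(h)                     # a prediction at every step

        model = VisitRNN()
        visits = (torch.rand(4, 10, n_codes) < 0.01).float()  # toy multi-hot batch
        logits = model(visits[:, :-1])                        # predict visit t+1 from 1..t
        loss = nn.functional.binary_cross_entropy_with_logits(logits, visits[:, 1:])
        loss.backward()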
  • Item
    Scalable and resilient sparse linear solvers
    (Georgia Institute of Technology, 2018-05-22) Sao, Piyush kumar
    Solving a large and sparse system of linear equations is a ubiquitous problem in scientific computing. The challenges in scaling such solvers on current and future parallel computer systems are the high cost of communication and the expected decrease in reliability of the hardware components. This dissertation contributes new techniques to address these issues. Regarding communication, we make two advances to reduce both on-node and inter-node communication of distributed-memory sparse direct solvers. On-node, we propose a novel technique, called HALO, targeted at heterogeneous architectures consisting of multicore processors and hardware accelerators such as GPUs or Xeon Phis. The name HALO is shorthand for highly asynchronous lazy offload, which refers to the way the method combines highly aggressive use of asynchrony with accelerated offload, lazy updates, and data shadowing (a la halo or ghost zones), all of which serve to hide and reduce communication, whether to local memory, across the network, or over PCIe. The overall hybrid solver achieves speedups of up to 3x on a variety of realistic test problems in single- and multi-node configurations. To reduce inter-node communication, we present a novel communication-avoiding 3D sparse LU factorization algorithm. The algorithm uses a three-dimensional logical arrangement of MPI processes and combines data redundancy with so-called elimination-tree parallelism to reduce communication. The 3D algorithm reduces the asymptotic communication costs by a factor of $O(\sqrt{\log n})$ and latency costs by a factor of $O(\log n)$ for planar sparse matrices arising from finite element discretizations of two-dimensional PDEs. For non-planar sparse matrices, it reduces the communication and latency costs by a constant factor. Beyond performance, we consider methods to improve solver resilience. In emerging and future systems with billions of computing elements, hardware faults during execution may become the norm rather than the exception. We illustrate the principle of self-stabilization for constructing fault-tolerant iterative linear solvers. We give two proof-of-concept examples of self-stabilizing iterative linear solvers: one for steepest descent (SD) and one for conjugate gradients (CG). Our self-stabilized versions of SD and CG require small amounts of fault detection; e.g., we may check only for NaNs and infinities. We test our approach experimentally by analyzing its convergence and overhead for different types and rates of faults.
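
    The fault-detection idea mentioned for the self-stabilizing solvers, checking only for NaNs and infinities, can be sketched as follows in a toy serial CG; this illustrates the flavor of the approach, not the dissertation's actual self-stabilized algorithm.

        import numpy as np

        # Toy CG that checks only for NaNs/infinities and, on detection, rebuilds
        # the residual and search direction from the current iterate. A full
        # self-stabilizing scheme would also protect x itself.
        def cg_with_nan_check(A, b, x0, tol=1e-8, max_iter=500):
            x = x0.copy()
            r = b - A @ x
            p = r.copy()
            rs = r @ r
            for _ in range(max_iter):
                Ap = A @ p
                alpha = rs / (p @ Ap)
                x = x + alpha * p
                r = r - alpha * Ap
                if not np.all(np.isfinite(np.concatenate([x, r, p]))):
                    r = b - A @ x  # fault detected: recompute state from x
                    p = r.copy()
                    rs = r @ r
                    continue
                rs_new = r @ r
                if np.sqrt(rs_new) < tol:
                    break
                p = r + (rs_new / rs) * p
                rs = rs_new
            return x

        n = 100
        Q = np.random.default_rng(0).standard_normal((n, n))
        A = Q @ Q.T + n * np.eye(n)  # SPD test matrix
        x = cg_with_nan_check(A, np.ones(n), np.zeros(n))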
  • Item
    A novel method for cluster analysis of RNA structural data
    (Georgia Institute of Technology, 2018-05-21) Rogers, Emily
    Functional RNA is known to contribute to a host of important biological pathways, with new discoveries being made daily. Because function is dependent on structure, computational tools that predict the secondary structure of RNA are crucial to researchers. By far the most popular method is to predict the minimum free energy structure as the native one. However, well-known limitations of this method have led the computational RNA community to move on to Boltzmann sampling. This method predicts an ensemble of structures sampled from the Boltzmann distribution under the Nearest Neighbor Thermodynamic Model (NNTM). Although it provides a more thorough view of the folding landscape of a sequence, the Boltzmann sampling method has the drawback of requiring post-processing (i.e., data mining) in order to be meaningful. This dissertation presents a novel method of representing and clustering the secondary structures of a Boltzmann sample. In addition, it demonstrates the method's ability to extract the meaningful structural signal of a Boltzmann sample by identifying significant commonalities and differences. Applications include two outstanding problems in the computational RNA community: investigating the ill-conditioning of thermodynamic optimization under the NNTM, and predicting a consensus structure for a set of sequences. Finally, this dissertation concludes with research performed as an intern for the Department of Defense's Defense Forensic Science Center. This work concerns analyzing the results of a DNA mixture interpretation study, highlighting the current state of forensic interpretation today.
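
    As a generic illustration of clustering a Boltzmann sample (not the novel representation contributed by the dissertation), the sketch below encodes each secondary structure as a 0/1 vector over candidate base pairs and clusters the vectors hierarchically; the tiny hand-made sample is a placeholder.

        import numpy as np
        from scipy.cluster.hierarchy import fcluster, linkage

        # Encode each sampled secondary structure as a 0/1 vector over candidate
        # base pairs (i, j), then cluster the vectors. The four tiny structures
        # below are hand-made placeholders for a real Boltzmann sample.
        n = 8  # sequence length
        pairs = [(i, j) for i in range(n) for j in range(i + 3, n)]
        idx = {p: c for c, p in enumerate(pairs)}

        sample = [{(0, 7), (1, 6)}, {(0, 7), (1, 6), (2, 5)},
                  {(1, 4)}, {(1, 4), (0, 5)}]
        X = np.zeros((len(sample), len(pairs)))
        for s, struct in enumerate(sample):
            for p in struct:
                X[s, idx[p]] = 1.0  # base-pair incidence vector

        Z = linkage(X, method="average", metric="hamming")
        print(fcluster(Z, t=2, criterion="maxclust"))  # cluster labels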
  • Item
    Optimizing computational kernels in quantum chemistry
    (Georgia Institute of Technology, 2018-05-01) Schieber, Matthew Cole
    Density fitting is a rank-reduction technique popularly used in quantum chemistry to reduce the computational cost of evaluating, transforming, and processing the 4-center electron repulsion integrals (ERIs). By utilizing the resolution-of-the-identity technique, density fitting reduces the 4-center ERIs to a 3-center form. Doing so not only alleviates the high storage cost of the ERIs, but also reduces the computational cost of operations involving them. Still, these operations can remain computational bottlenecks that commonly plague quantum chemistry procedures. The goal of this thesis is to investigate various optimizations for density-fitted versions of computational kernels used ubiquitously throughout quantum chemistry. First, we detail the spatial sparsity available in the 3-center integrals and the application of such sparsity to various operations, including integral computation, metric contractions, and integral transformations. Next, we investigate sparse memory layouts and their implications for the performance of the integral transformation kernel. We then analyze two transformation algorithms and how their performance varies depending on the context in which they are used. Then, we propose two sparse memory layouts and measure the resulting performance of Coulomb and exchange evaluations. Since the memory required for these tensors grows rapidly, we frame these discussions in the context of both in-core and disk performance. We implement these methods in the PSI4 electronic structure package and find that the optimal algorithm for a kernel varies depending on whether a disk-based implementation must be used.
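
    The factorization at the heart of density fitting can be written as $(pq|rs) \approx \sum_{PQ} (pq|P)\,[J^{-1}]_{PQ}\,(Q|rs)$, and the sketch below carries it out with NumPy on random stand-ins for the 3-center integrals and the fitting metric.

        import numpy as np

        # (pq|rs) ~ sum_PQ (pq|P) [J^-1]_PQ (Q|rs): random tensors stand in for
        # the 3-center integrals and the SPD fitting metric J.
        n, naux = 10, 30
        rng = np.random.default_rng(0)
        three_center = rng.standard_normal((naux, n, n))  # (P|pq)
        M = rng.standard_normal((naux, naux))
        J = M @ M.T + naux * np.eye(naux)                 # SPD metric J_PQ

        # Fold J^(-1/2) into the 3-center tensor once: B = L^-1 (P|pq), where
        # J = L L^T, so that contracting B with itself applies J^-1.
        L = np.linalg.cholesky(J)
        B = np.linalg.solve(L, three_center.reshape(naux, -1)).reshape(naux, n, n)

        eri_df = np.einsum("Ppq,Prs->pqrs", B, B)  # approximate 4-center ERIs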