Title:
Genetic epidemiology algorithms for tracking drug resistance variants and genomic clustering of plasmodium species

dc.contributor.advisor Vannberg, Fredrik O.
dc.contributor.author Ravishankar, Shashidhar
dc.contributor.committeeMember Jordan, I. King
dc.contributor.committeeMember Udhayakumar, Venkatachalam
dc.contributor.committeeMember Voit, Eberhard
dc.contributor.committeeMember McDonald, John
dc.contributor.department Biology
dc.date.accessioned 2020-01-14T14:45:29Z
dc.date.available 2020-01-14T14:45:29Z
dc.date.created 2019-12
dc.date.issued 2019-09-03
dc.date.submitted December 2019
dc.date.updated 2020-01-14T14:45:29Z
dc.description.abstract The goal of this thesis is to develop algorithms for the analysis of P. falciparum, P. brasilianum, and P. malariae. Malaria is endemic in many parts of the world, including regions of central Africa, South America, and Southeast Asia. There are five known species that cause malaria in humans: P. falciparum, P. vivax, P. malariae, P. ovale, and P. knowlesi. P. knowlesi is a zoonotic parasite restricted mostly to Southeast Asia. According to a World Health Organization (WHO) report from 2018, these five species were responsible for nearly 219 million infections, resulting in an estimated 435,000 deaths related to malaria in 2017. In this work, we highlight algorithms that can identify the similarity between Plasmodium species and detect drug-resistant P. falciparum parasites. The two specific aims of this work describe two novel algorithms for genomic clustering and molecular surveillance from next-generation sequencing (NGS) data. First, we describe a consensus-based variant identification framework for molecular surveillance of drug resistance in infectious disease. We highlight its utility by identifying mutations associated with drug resistance in malaria isolates. The scalability of the framework is highlighted by analyzing 8351 M. tuberculosis isolates for the genotypic prediction of drug resistance. In the second aim, we describe a k-mer based alignment-free algorithm for the estimation of similarity between isolates from raw NGS data. Using a weighted Jaccard distance, we describe an exact method for estimation of the distance between isolates from k-mer count data. The memory efficiency, scalability, and accuracy of the algorithm were demonstrated using in silico datasets generated from genomes of 12 Plasmodium species, as well as real-world isolates from an outbreak of C. auris in Colombia. The improved accuracy and scalability offered by the methods described in this work can facilitate the use of NGS in public health.
dc.description.degree Ph.D.
dc.format.mimetype application/pdf
dc.identifier.uri http://hdl.handle.net/1853/62281
dc.language.iso en_US
dc.publisher Georgia Institute of Technology
dc.subject Bioinformatics
dc.subject Variant calling
dc.subject Consensus variant calling
dc.subject Genomic clustering
dc.subject Alignment free algorithms
dc.subject k-mer based
dc.subject Molecular surveillance
dc.subject Malaria
dc.subject Anti-malarial drug resistance
dc.title Genetic epidemiology algorithms for tracking drug resistance variants and genomic clustering of plasmodium species
dc.type Text
dc.type.genre Dissertation
dspace.entity.type Publication
local.contributor.corporatename College of Sciences
local.contributor.corporatename School of Biological Sciences
relation.isOrgUnitOfPublication 85042be6-2d68-4e07-b384-e1f908fae48a
relation.isOrgUnitOfPublication c8b3bd08-9989-40d3-afe3-e0ad8d5c72b5
thesis.degree.level Doctoral
Files
Original bundle
Name: RAVISHANKAR-DISSERTATION-2019.pdf
Size: 2.67 MB
Format: Adobe Portable Document Format