Organizational Unit:

School of Biological Sciences

Permanent Link

https://hdl.handle.net/1853/70750

Parent Organization

Organizational Unit

College of Sciences

Includes Organization(s)

Organizational Unit

Center for the Study of Systems Biology

ArchiveSpace Name Record

https://finding-aids.library.gatech.edu/agents/corporate_entities/1131

Publication Search Results

Efficient alignment-free software applications for next generation sequencing-based molecular epidemiology

(Georgia Institute of Technology, 2020-01-09) Espitia Navarro, Hector Fabio

Public health agencies increasingly couple next-generation sequencing (NGS) characterization of microbial genomes with bioinformatics analysis methods for molecular epidemiology. The overhead associated with the bioinformatics methods used for this purpose, in terms of both the required human expertise and computational resources, represents a critical bottleneck that limits the potential impact of microbial genomics on public health. This is particularly true for local public health agency laboratories, which are typically staffed with microbiologists who may not have substantial bioinformatics expertise or ready access to high-performance computational resources. There is a pressing need for bioinformatics solutions for genome-enabled molecular epidemiology that are accurate, easy to use, fast, and computationally efficient. This thesis research is focused on the development of an alignment-free algorithm for NGS data analysis and its implementation in turn-key software applications tailored explicitly for genome-enabled molecular epidemiology and environmental microbial genomics. I explored a computational strategy based on k-mer frequencies to distinguish among sequences of interest in NGS read samples. By combining this strategy with the enhanced suffix array (ESA), an efficient data structure, I developed a base algorithm for the rapid analysis of unprocessed NGS reads. I further adapted and implemented this algorithm as a suite of software applications for sequence typing, gene detection, and gene-based taxonomic read classification. Benchmarking and validation analyses showed that STing is an ultrafast and accurate solution for genome-enabled molecular epidemiology, which performs better than existing bioinformatics methods for sequence typing and gene detection.
To overcome the limitation of bioinformatics infrastructure and expertise in public health laboratories, I developed WebSTing, a web platform that uses the STing algorithm to provide easy access to accurate and rapid alignment-free automated characterization of whole genome sequencing (WGS) samples of bacterial isolates. Finally, to demonstrate the utility of STing for problems beyond simple sequence typing and gene detection, I applied the alignment-free algorithm to two different areas: (1) public health, with the virulence gene profiling of Shiga toxin-producing Escherichia coli (STEC) isolates, and (2) environmental microbial genomics, with the nifH gene-based taxonomic classification of amplicon sequencing reads. I showed that STing performs better than the gold-standard method for STEC isolate characterization, and that it correctly classifies amplicon sequencing reads from simulated communities of nitrogen-fixing organisms.
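
The k-mer frequency strategy mentioned in the abstract can be illustrated with a minimal sketch. This is not the STing implementation: it substitutes a plain hash-based dictionary for the enhanced suffix array, ignores reverse-complement strands and sequencing errors, and every name in it (`kmers`, `build_index`, `count_hits`, `best_allele`, the allele labels, and the toy sequences) is hypothetical. It only shows the general idea of picking the candidate sequence whose k-mers occur most often in a set of reads, without aligning the reads.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every overlapping substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(alleles, k):
    """Map each k-mer to the set of allele names that contain it.

    A dictionary stands in here for the enhanced suffix array
    named in the abstract (an assumption made for brevity).
    """
    index = defaultdict(set)
    for name, seq in alleles.items():
        for kmer in kmers(seq, k):
            index[kmer].add(name)
    return index


def count_hits(reads, index, k):
    """Count, per allele, how many read k-mers are found in the index."""
    hits = Counter()
    for read in reads:
        for kmer in kmers(read, k):
            for name in index.get(kmer, ()):
                hits[name] += 1
    return hits


def best_allele(reads, alleles, k=5):
    """Return the allele with the most k-mer hits, or None if there are none."""
    hits = count_hits(reads, build_index(alleles, k), k)
    return hits.most_common(1)[0][0] if hits else None


# Toy data (hypothetical): two alleles that differ at a single base.
alleles = {
    "geneA_1": "ACGTACGTTAGC",
    "geneA_2": "ACGTACGATAGC",
}
reads = ["GTACGTTAG", "ACGTTAGC"]

print(best_allele(reads, alleles))  # geneA_1
```

In this toy case the reads span the variable position, so most of their k-mers match only `geneA_1`. A practical tool would also need to handle both strands, sequencing errors, and ties between candidates.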

Georgia Tech Library
