Title:
Multidimensional statistics metric in biological data analysis

dc.contributor.advisor Dickson, Robert M.
dc.contributor.author Huang, Tzu-Hsueh
dc.contributor.committeeMember Perry, Joseph W.
dc.contributor.committeeMember Fahrni, Christoph J.
dc.contributor.committeeMember Curtis, Jennifer E.
dc.contributor.committeeMember Vannberg, Fredrik O.
dc.contributor.committeeMember Tzeng, Yih-Ling
dc.contributor.department Chemistry and Biochemistry
dc.date.accessioned 2018-08-20T15:31:17Z
dc.date.available 2018-08-20T15:31:17Z
dc.date.created 2017-08
dc.date.issued 2017-07-26
dc.date.submitted August 2017
dc.date.updated 2018-08-20T15:31:17Z
dc.description.abstract Sepsis, a serious bodily response to infection, remains a leading cause of death worldwide. The clinical laboratory, however, can take more than three days to determine the appropriate treatment. Broad-spectrum antibiotics, which may not be effective, are therefore issued before the lab results are available. Inappropriate antibiotic treatment not only increases the mortality rate but also drives bacteria to acquire new resistance. This study focuses on identifying antibiotic resistance both phenotypically and genotypically. While the phenotypic antibiotic susceptibility test (AST) is accurate, the standard AST is very slow (three days). To rapidly determine the effective treatment, antibiotic-induced bacterial damage was monitored by flow cytometry. Probability binning – signature quadratic form (PB-sQF) was developed to analyze the cytometric data. PB-sQF bins the cytometric data adaptively, so far fewer bins are needed than with regular histograms. With PB-sQF, linear distances between data sets are calculated. As a result, data taken with different settings, on different machines, or on different days can be compared directly. With only one hour of bacteria-antibiotic incubation, the effective treatment can be selected by PB-sQF. This method reduces the time to result from 48 hours to 4 hours post-blood culture. For the pre-blood culture test, the bacterial count ranges from 1 to 100 CFU/mL in the presence of ~4×10^9 cells/mL of blood cells. To separate bacteria from blood cells, saponin was used to selectively lyse the blood cells. The isolated bacteria can then be incubated in the appropriate culture medium. With only 5 hours of incubation (compared to 24 hours of blood culture), effective treatments can be selected by analyzing the cytometric data with PB-sQF. This pre-blood culture fast AST (FAST) can be done in 8 hours instead of the more than three days required by the standard AST.
Although genotypic tests can only detect known antibiotic resistance mechanisms, they can be done much faster than the traditional AST. While the presence of a resistance gene is an important indicator of multidrug-resistant bacteria, the copy number of a given resistance gene is also a determining factor in the resistant phenotype. To estimate the copy number and determine whether there are copy number variations between the query sequence and the reference sequence, sequence analysis methods were developed. First, nearest-neighbor (NN) is used to map the short reads from the next-generation sequencer to the reference sequence. NN results are linear with the number of repeated regions in the reference sequence, and NN is error-tolerant compared to mrFAST, BWA-MEM and Bowtie2. We then developed copy number variation detection with mapping multiplicity (CNVMM) to analyze the mapping results from NN. While the other CNV detectors cannot properly account for multiple copies in the reference sequence, CNVMM adjusts for the repeated regions by estimating the number of copies in the reference from the NN mapping results. We demonstrate that NN-CNVMM performs better than mrFAST-mrCaNaVaR and MAQ-CNVnator. Using NN-CNVMM with short-read data from a multidrug-resistant Acinetobacter clinical isolate, we found that a carbapenem resistance-related gene has a 10-fold higher copy number in the clinical isolate than in the sensitive reference strain.
dc.description.degree Ph.D.
dc.format.mimetype application/pdf
dc.identifier.uri http://hdl.handle.net/1853/60172
dc.language.iso en_US
dc.publisher Georgia Institute of Technology
dc.subject Statistics
dc.subject Antibiotic susceptibility
dc.subject Antibiotic resistance
dc.subject Bacteria
dc.subject Sepsis
dc.subject Flow cytometry
dc.subject Next-generation sequencing
dc.subject Genome
dc.subject Copy number variation
dc.subject CNVs
dc.title Multidimensional statistics metric in biological data analysis
dc.type Text
dc.type.genre Dissertation
dspace.entity.type Publication
local.contributor.advisor Dickson, Robert M.
local.contributor.corporatename School of Chemistry and Biochemistry
local.contributor.corporatename College of Sciences
relation.isAdvisorOfPublication 328b7195-8f0f-4be1-a8e9-90ba4c20fb59
relation.isOrgUnitOfPublication f1725b93-3ab8-4c47-a4c3-3596c03d6f1e
relation.isOrgUnitOfPublication 85042be6-2d68-4e07-b384-e1f908fae48a
thesis.degree.level Doctoral
Files
Original bundle
Name:
HUANG-DISSERTATION-2017.pdf
Size:
17.73 MB
Format:
Adobe Portable Document Format