Series
Doctor of Philosophy with a Major in Bioinformatics

Series Type
Degree Series
Associated Organization(s)
Organizational Unit
School of Biological Sciences
School established in 2016 with the merger of the Schools of Applied Physiology and Biology
Organizational Unit
Wallace H. Coulter Department of Biomedical Engineering
The joint Georgia Tech and Emory department was established in 1997

Publication Search Results

Now showing 1 - 10 of 129
  • Item
    Chordate-specific gene regulatory network of neuron development in Ciona.
    (Georgia Institute of Technology, 2023-12-12) Kim, Kwantae
    In this research, I investigated the complex gene regulatory networks underlying neurogenesis, taking advantage of the unique neurons of the Ciona model system. I revealed that Fgf signaling is crucial for the neurogenesis of Bipolar Tail Neurons (BTNs) by controlling the expression of Neurogenin, the fate-determining transcription factor in these neurons. I then characterized multiple effector genes functioning in the development of BTNs. Additionally, I determined the vital role of the Pax3/7 transcription factor in the neural plate border in inducing neural tube closure. Finally, I demonstrated how Pax3/7 also orchestrates an intricate gene regulatory network upstream of multiple transcription factors and functional effectors during the neurogenesis of Descending Decussating Neurons (ddNs). I found that the majority of this network’s regulatory branches are shared with other neurons in Ciona or even other organisms including vertebrates. Moreover, I revealed the role of key putative effector genes during the neurogenesis of ddNs. These findings will provide profound insights into developmental mechanisms in the central nervous system of chordates.
  • Item
    Evolution in real time: insights from micro- to macroscopic multicellular organisms
    (Georgia Institute of Technology, 2023-12-06) Pineau, Rozenn
    Multicellularity has evolved independently at least 50 times and fundamentally transformed life on Earth, yet basic questions remain about how this transition initially occurs and shapes ecological dynamics. Understanding this transition and its underlying mechanisms is essential to better understand the evolution of life on Earth. The first part of this work examines the emergence of a common yet understudied multicellular organism morphology, cuboidal packing. Spherical fission yeast (Schizosaccharomyces pombe) mutants were experimentally evolved via daily settling selection favoring larger size. Within 20 days, multicellular clusters evolved cuboidal cellular packing, a topology found across the tree of life. These clusters displayed traits of multicellular individuals: reproduction via cluster fracture, heritability in size, and response to group selection. Our genetic analysis reveals mutations in the ACE2 gene underlying this transition to multicellularity. This is an example of deep convergent evolution, as this gene has also been implicated in the transition to multicellularity in Saccharomyces cerevisiae, a yeast species that diverged from S. pombe 300 million years ago. Next, we explore the ecological implications of the transition to multicellularity and show how the formation of groups itself is an opportunity for niche expansion and divergence. Using long-term experimental evolution of snowflake yeast (S. cerevisiae), we show that the fundamental trade-off between growth and survival facilitated the evolution of two distinct coexisting phenotypes: one Small phenotype specialized in growth, and one Large phenotype specialized in survival. Coexistence is maintained by negative frequency-dependent selection, and sequencing reveals that the dominant lineages present after 715 daily transfers have coexisted throughout the duration of the experiment.
This work demonstrates how a simple and yet fundamental trade-off between growth and survival can immediately drive adaptive diversification and maintain increased ecological diversity. Overall, this work provides experimental and theoretical insights into eco-evolutionary niche construction and long-term population dynamics following multicellularity's origins. Finally, the last part of this dissertation sheds light on mutation dynamics in complex and ancient multicellular organisms. We explore the evolutionary history of an ancient and still-living clonal forest, the Pando aspen clone. Harnessing the genetic signal generated by the accumulation of somatic mutations in the different tissues of the Pando clone, we detect spatial genetic structure, and estimate Pando's minimum age around 2,000 years. Together, this thesis uses experimental evolution in unicellular microbes and natural experiments in clonal macrobes, contributing fundamental knowledge to our understanding of multicellular evolution, from the initial emergence of multicellular groups to the formation of complex, ancient organisms.
  • Item
    Analysis and design of multi-modal clinical and genomic risk scores for disease prediction using machine learning
    (Georgia Institute of Technology, 2023-09-05) Isgut, Monica
    Polygenic risk scores (PRSs) are promising tools for leveraging genomic data for disease risk prediction in clinical settings. However, little is known about their value in the context of clinical data routinely available. This work aims to analyze the value-add of genomic data in multi-modality risk prediction models over models with clinical data alone, 1) for several diseases, 2) across disease subpopulation groups, and 3) across different categories of model complexity (i.e., logistic regression vs. neural networks) and clinical or genomic feature space. The latter more specifically evaluates: a) the effect of integrating large-scale clinical data derived from electronic health records (EHRs) with PRSs in a multi-modal neural network on the estimated value-add of the PRSs in the risk model, and b) the effect of integrating standard small-scale clinical risk factors (i.e., body mass index, smoking status) with genomic data in the form of individual genomic features (hereafter also denoted as a PRS) in a neural network on the estimated value-add of the genomic data. In addition to the systematic analysis of the factors contributing to the value-add of genomic data and the design of multi-modality genomic and clinical neural networks for disease prediction, this work also introduces two novel representation learning algorithms designed to derive low-dimensional representations of EHR diagnostic data and genotype data, respectively. Furthermore, this work explores the use of various neural network interpretability tools applied to multi-modality disease risk scores to gain insights into important or interacting features utilized in risk prediction.
  • Item
    Computational Analysis of Gene Expression in the Teleost Forebrain and the Cellular Basis of a Social Behavior
    (Georgia Institute of Technology, 2023-08-18) Gruenhagen, George Wolfgang
    Teleosts (ray-finned fish) are the largest vertebrate clade, comprising roughly half of all extant vertebrate species, and can perform complex behaviors requiring advanced cognition. A species of teleost fish, Mchenga conophoros (MC), performs a social behavior called bower-building, whereby males repetitively manipulate sand to form a structure called a bower, over which they court females and chase away competing males. Comparative genomic analysis has revealed that this social behavior performed by MC is associated with a region of high genomic divergence on linkage group 11. While the genetic basis of this behavior has been investigated, the brain regions and cell populations involved are unknown. Furthermore, the homology of brain regions in the teleost to the mammalian brain is unclear due to the unique folding of the teleost brain during development. This work aims to 1) identify the cellular basis of bower-building in MC and 2) uncover the relationships between cell-types and anatomical regions in the teleost brain to other vertebrates - amphibians, reptiles, birds and mammals. To address the first aim, we performed single nuclei RNA-sequencing (snRNA-seq) on 19 males actively performing bower-building and 19 control males that were not performing bower-building. This resulted in a total of 33,674 nuclei. I linked genes associated with the evolution of bower-building behavior to a subpopulation of quiescent stem-like cells. We found evidence that behavior-associated neural activity may result in a departure from quiescence and a differential supply of new neurons to a specific region in the teleost brain, the ventral subdivision of the dorsal lateral pallium (Dl-v). To determine the relationship of teleostean brain regions, such as the Dl-v, to other vertebrates, we performed spatial transcriptomics, which profiles gene expression within tissue architecture, unlike snRNA-seq. 
Together, with these complementary technologies, we created a spatially resolved atlas of gene expression in the MC forebrain and compared expression profiles of thousands of genes across vertebrates. I identified ancestral features of non-neuronal and neuronal populations in MC, including hippocampal and surprisingly neocortical populations. The presence of neocortical-like structures in non-mammals is widely debated. Here I find evidence of neocortical transcriptional signatures in the teleost granule zone of the dorsal lateral pallium (Dl-g). Additionally, I found conserved molecular features of the hippocampus in the teleost Dl-v. In summary, we identified forebrain populations involved in bower-building behavior and ascertained their evolutionary relationships to other vertebrates.
  • Item
    Expanding The Bioinformatics Toolbox for Diversity and Taxonomic Studies of Microbial Eukaryotic Pathogens
    (Georgia Institute of Technology, 2023-07-30) Seabolt, Matthew H.
    Cataloguing and studying microbial eukaryote diversity and speciation present unique challenges due to their evolutionary divergence from well-studied model genomes, limited culturing methods, and uncertain taxonomy. The scarcity of high-quality genomic data poses a significant obstacle to understanding genome relatedness and important traits like virulence and antimicrobial resistance, with much debate centered on how to reconcile discordant phylogenetic signals from existing molecular typing data with historical records and type specimens. Thus far, no major movement has occurred in almost two decades. Additionally, existing bioinformatics methods need advancement to handle large eukaryotic genomes effectively. This research aims to expand the set of available bioinformatics tools for the comparative analysis of genomes of microbial eukaryotes. Case studies using the protozoan parasite Giardia duodenalis as a model organism are presented. These studies include (i) developing a new, automated pipeline for identifying the best gene markers in the genome for phylogenetic reconstruction purposes and strain-level resolution, (ii) the creation of a statistical framework to identify cryptic species and quantify their evolutionary relationships, and (iii) improving reference genome annotation of the Giardia genome. Lastly, we employed the genome aggregate average nucleotide identity (ANI) and graph-based methods to assess whether or not natural boundaries between eukaryotic species exist, similar to those previously observed for Prokaryotes, and study the relationship between shared gene content and ANI (or degree of genetic relatedness). The findings suggest that sequence-discrete clusters of genomes, akin to traditional species, are prevalent among the examined genomes and our methodology is robust across eukaryote phyla and at multiple taxonomic hierarchies. 
The conclusions of this research, such as 95% ANI as a general-purpose species boundary in eukaryotes and ANI’s utility for molecular typing, contribute novel biological insights and bioinformatics methods to the toolkit for eukaryote taxonomy and genome analysis.
  • Item
    A holistic approach to improving ovarian cancer care
    (Georgia Institute of Technology, 2023-07-17) Ban, Dongjo
    Ovarian cancer (OC), often referred to as the "silent killer" due to its elusive early-stage symptoms and frequent late diagnoses, remains a significant public health challenge. The primary objective of this research is to navigate the intricate landscape of OC at the genomic and metabolomic levels using high-throughput technologies. This exploration strives to uncover potential strategies for early detection and treatment improvement, thereby addressing this persistent health concern. In the initial phase of the study, the genomic complexity of OC is unraveled through an analysis of the tumor mutation burden (TMB) and patterns of copy-number alterations (CNAs). The investigation reveals a higher TMB in localized tumors and cancer-related genes compared to non-cancer genes. We observed that impaired DNA-repair mechanisms play a pivotal role in elevating TMB levels. A notable finding is the differential selective pressure patterns, represented by dN/dS ratio estimates, between early- and late-stage OC. Further, the impact of CNAs on OC patients was analyzed, showing a prevalence of amplification events over deletion ones and a higher number of affected genes in the early-stage group. Although CNAs were not found to be higher in cancer-associated genes, the study identifies a preference for amplification in oncogenes and deletion in tumor suppressor genes upon investigating driver regions. The latter phase of the research emphasizes the role of metabolomics in detecting early-stage OC. Machine learning (ML) approaches were employed to examine high-throughput serum metabolomic profiles from OC patients and non-cancerous individuals from various geographical locations. The resulting classifiers exhibited promising predictive potential, thus emphasizing the utility of metabolomics for early OC detection. Particularly, the emergence of lipid or lipid-like molecules as potential markers underscores their significance in OC detection. 
Collectively, these findings accentuate the potential of an integrated approach in developing personalized cancer management strategies, taking into account the unique variations observed in patients. This paves the way for clinically identifying high-risk individuals for more frequent monitoring and tailoring appropriate treatment options for optimal patient outcomes. Given the growing volume of data and the continuous advancements in technology, such comprehensive approaches can augment survival rates and ameliorate the quality of life for OC patients.
  • Item
    Responses of African Mammals and Ecosystems to Environmental Change Across Space and Time
    (Georgia Institute of Technology, 2023-07-07) Lauer, Daniel Avery
    Africa is home to some of the most biodiverse mammalian assemblages on Earth, but their diversity is threatened by human activities and rapid climate change. If we are to conserve Africa’s mammals, we must understand how they and their surrounding ecosystems respond to changing environmental conditions across space and time. I explored these responses in this dissertation, as I investigated the mechanisms through which species, communities, and ecosystems persist or become imperiled in the face of environmental change. In Chapter 1, I examined how past biodiversity losses in herbivorous megafauna may have impacted the fundamental relationships between megafaunal functional traits and environmental conditions. I adapted traditional methods in the field of ecometrics to evaluate if trait-environment relationships were disrupted over the past 7.5 Ma (million years). I found that while biodiversity losses have occurred since 5 Ma, only those after 2 Ma coincided with such a disruption. Before 2 Ma, biodiversity losses resulted from megafaunal adaptations to expanding grasslands. After 2 Ma, conversely, losses occurred as landscapes became arid and mismatches arose between megafaunal traits and environmental conditions. Consequently, past environmental change-induced biodiversity losses may have varied in their impacts on megafauna. For Chapter 2, I moved forward to modern times and disentangled the theory that heterogeneous environments constrain species’ geographic range sizes. Specifically, I compared the influences of habitat heterogeneity (variation in habitat types across space) versus topographic heterogeneity (variation in physical elevations) on mammalian ranges. Using statistical models, I found that only the former constrains species ranges, while the latter has no influence whatsoever. Such a distinction adds nuance to prior ecological theory, and it suggests that we must conserve range-limited mammals in regions of high habitat heterogeneity. 
I remained in modern times for Chapter 3 but shifted my focus to the ecosystems in which mammals live. I investigated the under-explored idea that ecosystems exhibit a tradeoff between their ability to withstand disturbance events (resistance) and their capacity to recover from them (stability). Statistical models revealed that such a tradeoff exists across African protected areas, as the characteristics of more resistant ecosystems oppose those of more stable ecosystems. This tradeoff may therefore be a widespread phenomenon, and consequently, a balance between the two must be struck if ecosystems are to endure future disturbances. Finally, I looked to the future in Chapter 4, as I combined insights from the fields of ecometrics and landscape connectivity to inform future mammalian conservation efforts. Using ecometrics, I determined that trait-environment relationships will weaken in >90% of herbivorous megafaunal communities across Africa. Such communities may require changes in their species compositions if they are to maintain their ability to function. I therefore built landscape connectivity models to assess where landscapes may facilitate or impede future movements of species between communities. Based on model outcomes, I provided recommendations for where conservation efforts should protect species either in situ or by facilitating their movements. Overall, my dissertation introduces new perspectives and re-evaluates conventional wisdom to advance our understanding of mammalian responses to changing environments.
  • Item
    Genetic influences of fatty acid metabolism and ancestral origins on disease
    (Georgia Institute of Technology, 2023-05-23) Astore, Courtney Alexandra
    Although there have been extensive efforts to study the influence of environmental factors and their interactions with various diseases, the contribution of metabolomic imbalances in the development and pathogenesis of disease is still not fully understood. Furthermore, despite numerous attempts to replicate the findings of genome-wide association studies (GWAS) in diverse populations, little progress has been made in replicating the associations of rare variants with complex diseases. The objective of this thesis is to leverage a combination of biobank-level phenotype and genetic data to investigate the effects of metabolites as well as explore the ancestral origins of rare variants in inflammatory bowel disease (IBD) and in other disease areas. Thus, the three chapters of this thesis investigate the roles of two main areas in human diseases: (1) fatty-acid metabolism, and (2) admixture and rare variants. The first study investigates the causal association between circulating metabolites and IBD. This study uses Mendelian randomization (MR), a method that uses significant genetic variants from the GWAS of the exposure trait as instrumental variables to assess the causal relationship between a modifiable exposure and an outcome. In this case, we assessed over 200 metabolites and evaluated their relationship to IBD via MR. Omega-3 fatty acids showed one of the most significant protective associations with IBD, a finding that was replicated in three independent GWAS. The second study is an extension of the first study, which further evaluates the disease architectures of three polyunsaturated fatty acids (PUFAs): omega-3 fatty acids, omega-6 fatty acids, and docosahexaenoic acid (DHA). The objective of this study is to demonstrate integrative PheWAS approaches using the metabolite levels and their polygenic scores to assess the association between the three PUFAs and over 1,300 disease endpoints.
Using the metabolite-disease associations with concordant significant evidence from both approaches, we applied MR to assess causality. Protective associations with concordant evidence from all three PUFAs were identified. The last study assesses the role of admixture on the rare variant contribution to IBD. In this chapter we investigated the impact of 25 rare, European-ascertained Crohn’s disease (CD) variants identified by Sazonovs et al. on IBD in African American whole genome sequencing (WGS) data. Our findings showed a consistent four-to-five-fold reduction in allele frequency in African Americans when compared to Europeans. Further, phasing of the WGS data confirmed that the CD risk alleles discovered in Europeans contribute to the risk in African Americans due to admixture. Additionally, we found that 45 rare variants discovered by a meta-analysis of UK Biobank and FinnGen spanning ten disease classes from Sun et al. are also mostly present due to admixture in the African American cohort. These results highlight the importance of conducting whole exome and genome sequencing studies on large, diverse cohorts to gain a better understanding of the role of rare variants in disease and promote equitable research practices.
  • Item
    Computational Models of Actin Regulation Driving Cytoskeletal Dynamics, Cell Polarity and Motion
    (Georgia Institute of Technology, 2023-04-27) Hladyshau, Siarhei
    Cell morphodynamics is a fundamental biological process required for the healthy functioning of a eukaryotic organism. Understanding its regulatory mechanisms is needed for developing new strategies to treat numerous diseases, including cancer metastasis, excessive angiogenesis, congenital disorders, and chronic wounds. My work focuses on Rho family GTPases (RhoA, Rac1, and Cdc42), known as the key regulators of actin cytoskeleton and cell motion. I developed a computational platform that allowed me to study different configurations of GTPase signaling pathways and capture the complex spatiotemporal distribution of these proteins driving cytoskeletal organization and dynamics. I applied this platform to investigate signaling bistability and the mechanisms of polarity establishment in yeast. I also used this methodology to study wave dynamics of GTPases and F-actin in the cortex of Patiria miniata and Xenopus laevis oocytes. I quantitatively reproduced different actin behaviors in these two organisms and revealed a critical role of quasi-static, low-amplitude patterns in the emergence of complex wave dynamics. Finally, I studied the regulation of cell ruffling by Cdc42 and Rac1 in epithelial breast cancer cells and mouse embryonic fibroblasts. Using my computational approach, I showed that cell edge velocity is regulated by the kinetic rate of GTPase activation rather than the concentration of the active molecules. My analysis also suggested that the timing of Rac1 and Cdc42 activity is cell-type dependent. I developed a model that reproduced such dependences and showed that feedback from Cdc42 and Rac1 was sufficient to control the activation delay when these GTPases have a common upstream regulatory motif. I developed a series of image analysis pipelines for these studies that allowed precise tracking of GTPase activity and cell edge motion in simulations and experimental data.
  • Item
    Molecular mechanisms of microbial pathways for environmental contaminant remediation
    (Georgia Institute of Technology, 2023-01-13) Toporek, Yael Jordan
    This thesis examines the molecular mechanism of alternate strategies for remediation of contaminated environments. Radioiodine, perfluoroalkyl substances (PFAS), and 1,4-dioxane represent emerging contaminants of national concern. For example, microbially catalyzed reductive methylation of 129IO3- has received recent attention as an alternate strategy for remediation of radioiodine-contaminated environments. This thesis identified enzymes required for IO3- reduction coupled to organic acid oxidation in the facultative anaerobe Shewanella oneidensis: cytoplasmic electron donors are oxidized, and the electrons are transferred through the periplasm via cytochromes of the metal-reducing pathway to extracellular dimethylsulfoxide (DMSO) reductase, which directly reduces IO3- to iodide (I-) as an alternate substrate. Future work aims to investigate the apparent import of I- back to the cytoplasm, where it is putatively methylated and volatilized by a promiscuous thiopurine methyltransferase, presenting a potential strategy for bioremediation of radioiodine. In the case of PFAS, the industrial surfactant and flame retardant perfluorooctanoic acid (PFOA) has been designated as an emerging contaminant. In the present study, the microbially driven Fenton reaction (MFR) was employed to attempt degradation of PFOA by cycling between aerobic and anaerobic ferric iron (Fe(III))-reducing conditions. Under aerobic conditions, S. oneidensis reduced molecular oxygen (O2) to hydrogen peroxide (H2O2), while under anaerobic conditions, S. oneidensis reduced Fe(III) to Fe(II). During aerobic-to-anaerobic transition periods, Fe(II) and H2O2 interacted chemically via the Fenton reaction to produce contaminant-degrading hydroxyl (HO•) radicals, which in turn interacted with PFOA. PFOA concentrations, however, remained unchanged, which most likely reflects the stability of carbon-fluorine bonds and consequent inability of HO• radicals to oxidatively degrade PFOA. 
Finally, the present study aimed to determine the redox conditions of the intracellular environment during oxidative stress in S. oneidensis from aerobic respiration and H2O2 stress. In contrast to S. oneidensis anaerobic respiration, aerobic respiration is understudied, but is a key contributor to MFR in degrading organic and chlorinated environmental contaminants like 1,4-dioxane. This work describes the native and perturbed redox environment of the S. oneidensis cytoplasm, as well as the contribution of individual genes, particularly catalases and peroxidases, to intracellular H2O2 scavenging rates using the genetically-encoded ratiometric fluorescent sensor HyPer-3 as a reporter. As measured by HyPer-3, deletion of one or more catalases and peroxidases resulted in dramatic changes in the redox condition of the cytoplasm, while other H2O2-scavenging enzymes provided overlapping H2O2 scavenging activity to combat H2O2 challenges. Based on cytoplasmic HyPer-3 redox signals, results from the present study indicated that periplasmic PgpD, cytoplasmic KatB, and previously overlooked cytoplasmic KatG1 and KatG2 provide first- and second-line defenses to protect against exogenous H2O2 challenges in minimal growth medium.