Organizational Unit:
School of Biological Sciences

Publication Search Results

Now showing 1 - 7 of 7

Population genomics of human polymorphic transposable elements

2016-11-15, Rishishwar, Lavanya

Transposable element (TE) activity has had a major impact on the human genome; more than two-thirds of the sequence is derived from TE insertions. Several families of human TEs – primarily Alu, L1 and SVA – continue to actively transpose, thereby generating insertion polymorphisms between individuals. Until very recently, it has not been possible to characterize the genetic variation generated by the activity of these TE families at the scale of whole genomes for multiple individuals within and between human populations. For this reason, the impact of recent TE activity on human evolution has yet to be fully appreciated. My dissertation research leverages novel technologies in data science to investigate the role that recent TE activity has played in shaping human population genetic variation. Specifically, my dissertation addresses three problems: 1) evaluation of the computational techniques used to characterize human polymorphic TE insertion sites from whole genome, next-generation sequence data, 2) characterization of the population genomic variation of human polymorphic TEs and evaluation of their effectiveness as markers of human genetic ancestry and admixture, and 3) analysis of the effects that natural selection (negative and positive) has exerted on human polymorphic TE insertions. I close by presenting a broad prospectus on the implications of genome-scale analyses of human polymorphic TE insertions for population and clinical genetic studies. The results reported in this dissertation represent the dawn of the population genomics era for human TEs.

Alteration of transcription by non-coding elements in the human genome

2012-06-27, Conley, Andrew Berton

The human genome contains ~1.5% coding sequence, with the remaining 98.5% being non-coding. The functional potential of the majority of this non-coding sequence remains unknown. Much of this non-coding sequence is derived from transposable element (TE) sequences. These TE sequences contain their own regulatory information, e.g. promoters and transcription factor binding sites. Given the large number of these sequences, over 4 million in the human genome, it would be expected that the regulatory information that they contain would affect the expression of nearby genes. This dissertation describes research that characterizes the alteration of, and contribution to, the human transcriptome by non-coding elements, including TE sequences.

Epigenetic regulation of the human genome by transposable elements

2010-07-07, Huda, Ahsan

Nearly one half of the human genome is composed of transposable elements (TEs). Once dismissed as 'selfish' or 'junk' DNA, TEs have also been implicated in numerous functions that serve the needs of their host genome. I have evaluated the role of TEs in mediating the epigenetic mechanisms that serve to regulate human gene expression. These findings can be broadly divided into two major mechanisms by which TEs affect human gene expression: by modulating nucleosome binding in the promoter regions and by recruiting epigenetic histone modifications that enable them to serve as promoters and enhancers. Thus, the studies encompassed in this thesis elucidate the contributions of TEs in epigenetically regulating human gene expression on a global as well as local scale.

Effects of repetitive DNA and epigenetics on human genome regulation

2013-07-02, Jjingo, Daudi

The highly developed and specialized anatomical and physiological characteristics observed for eukaryotes in general and mammals in particular are underwritten by an elaborate and intricate process of genome regulation. This precise control of the location, timing and amplitude of gene expression is achieved by a variety of genetic and epigenetic tools and mechanisms. While several of these regulatory mechanisms have been extensively studied, our understanding of the complex and diverse associations of various epigenetic marks and genetic elements with genome regulatory systems has remained incomplete. However, recent profound advances in sequencing technologies have significantly increased the depth and breadth to which their functions and relationships can be understood. The objective of this thesis has been to apply bioinformatics, computational and statistical tools to analyze and interpret various recent high-throughput datasets from a combination of next-generation sequencing and chromatin immunoprecipitation (ChIP-seq) experiments. These datasets have been analyzed to further our understanding of the dynamics of gene regulation in humans, particularly as it relates to repetitive DNA, cis-regulation and DNA methylation. The thesis thus resides at the intersection of three major areas: transposable elements, cis-regulatory elements and epigenetics. It explores how those three aspects of regulation relate to gene expression and the functional implications of those interactions. From this analysis, the thesis provides new insights into: 1) the relationship between the transposable element environment of human genes and their expression, 2) the role of mammalian-wide interspersed repeats (MIRs) in the function of human enhancers and enhancement of tissue-specific functions, 3) the existence and function of composite cis-regulatory elements and 4) the dynamics and relationship between human gene-body DNA methylation and gene expression.

Computational tools for molecular epidemiology and computational genomics of Neisseria meningitidis

2010-11-17, Katz, Lee Scott

Neisseria meningitidis is a Gram-negative, and sometimes encapsulated, diplococcus that causes devastating disease worldwide. For the worldwide genetic surveillance of N. meningitidis, the gold standard for profiling the bacterium uses genetic loci found around the genome. Unfortunately, the software for analyzing the data for these profiles is difficult to use for a variety of reasons. This thesis presents my suite of tools, called the Meningococcus Genome Informatics Platform, for the analysis of these profiling data. To better understand N. meningitidis, the CDC Meningitis Laboratory and other world-class laboratories have adopted a whole-genome approach. To facilitate this approach, I have developed a computational genomics assembly and annotation pipeline called the CG-Pipeline. It assembles a genome, predicts locations of various features, and then annotates those features. Next, I developed a comparative genomics browser and database called NBase. Using CG-Pipeline and NBase, I addressed two open questions in N. meningitidis research. First, some N. meningitidis isolates cause disease, but many do not. What is the genomic basis of disease-associated versus asymptomatically carried isolates of N. meningitidis? Second, some isolates' capsule type cannot be easily determined. Isolates are grouped into one of many serogroups based on this capsule, which aids epidemiological studies and the public health response to N. meningitidis, so such isolates often cannot be grouped. What, then, is the genomic basis of nongroupability? This thesis addresses both of these questions on a whole-genome level.

Computational algorithm development for epigenomic analysis

2012-07-03, Wang, Jianrong

Multiple computational algorithms were developed for analyzing ChIP-seq datasets of histone modifications. For basic ChIP-seq data processing, the problems of ambiguous short sequence read mapping and broad peak calling of diffuse ChIP-seq signals were solved by novel statistical methods. Their performance was systematically evaluated against existing approaches. Their potential utility for finding meaningful biological information was demonstrated by applications to real datasets. For biological question-driven data mining, several important topics were selected for algorithm development, including hypothesis-driven insulator prediction, unbiased chromatin boundary element discovery and combinatorial histone modification signature inference. The integrative computational pipeline for insulator prediction not only produced a list of putative insulators but also recovered specific associated chromatin and functional features. Selected predictions have been experimentally validated. The unbiased chromatin boundary element prediction algorithm was feature-free and had the capability to discover novel types of boundary elements. The predictions found a set of chromatin features and provided the first report of tRNA-derived boundary elements in the human genome. The combinatorial chromatin signature algorithm employed chromatin profile alignments for unsupervised inferences of histone modification patterns. The signatures were associated with various regulatory elements and functional activities. Both the computational advantages and the biological discoveries were discussed.

Algorithm development for next generation sequencing-based metagenome analysis

2010-08-26, Kislyuk, Andrey O.

We present research on the design, development and application of algorithms for DNA sequence analysis, with a focus on environmental DNA (metagenomes). We present an overview and primer on algorithm development for bioinformatics of metagenomes; work on frameshift detection in DNA sequencing data; work on a computational pipeline for the assembly, feature prediction, annotation and analysis of bacterial genomes; work on unsupervised phylogenetic clustering of metagenomic fragments using Markov Chain Monte Carlo methods; and work on estimation of bacterial genome plasticity and diversity, including potential improvements to the measures of core and pan-genomes.