Title:
Metagenomics analysis of disease-related human gut microbiota
Author(s)
Xu, Congmin
Advisor(s)
Qiu, Peng
Zhu, Huaiqiu
Abstract
The human gut microbiota has been linked to various pathological disorders, yet our understanding of the underlying mechanisms remains limited by inconsistent results across publications and by the inherent complexity of the system. Separate studies on incomparable data sets miss the forest for the trees, which motivated us to carry out a meta-analysis of the human gut microbiome across different kinds of diseases and to ask what kind of human gut microbial community is healthy.
1. This dissertation reveals the consistent pattern behind disease-related dysbiosis through a pan-microbiome analysis that annotated and analyzed the microbial contigs and genes identified from raw reads of whole genome sequencing (WGS) data of the human gut. A consistent pattern shift was found in the community of mutually dependent microbes: microbial members are more competitive and less cooperative in disease than in health, driven notably by a 20-fold increase in competitive pairs between potential pathogens and a 10-fold decrease in cooperative pairs between non-pathogens. Additionally, treating all the microbiota in the same community as a ‘super organism’, our mathematical model of the gene-gene interaction network revealed the significance of cell motility, even though it was not a dominant functional category. This part of the work addresses, in a systematic manner, how the ecological niches of the gut modulate human health.
2. This dissertation found that some inflammation- and cancer-related genera increase in individuals of advanced age while some beneficial genera are lost, and demonstrated the existence of an aging progression of the human gut microbiota by applying an unsupervised machine learning algorithm to recapitulate the underlying progression of the microbial community from hosts in different age groups. The aging process captures many facets of biological variation in the human body and leads to functional decline and an increased incidence of gut infection in elderly people. Unlike disease, the aging transformation is a continuous process. We obtained raw 16S rRNA sequencing data of subjects ranging from newborns to centenarians from a previous study and summarized the data into a relative abundance matrix of genera across all samples. Without using the age information of the samples, we applied multivariate unsupervised analysis, which revealed a continuous aging progression of the human gut microbiota that parallels host aging. The genera associated with this aging process are meaningful for designing probiotics that keep the gut microbiota resembling that of a young age, which will hopefully have a positive impact on human health, especially for individuals in advanced age groups.
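The summarization step described in part 2, turning per-sample genus counts into a relative abundance matrix, can be sketched as follows. This is a minimal illustration only: the function name `relative_abundance`, the input layout, and the toy counts are assumptions made for this example and are not taken from the dissertation, and the unsupervised analysis itself is not shown.

```python
def relative_abundance(counts_per_sample):
    """Convert per-sample genus read counts into relative abundances.

    counts_per_sample: dict mapping sample ID -> {genus: read count}.
    Returns (genera, matrix), where genera is a sorted list of all genera
    seen in any sample and matrix[sample] is a list of fractions aligned
    with that list. Age labels are deliberately not part of the input.
    """
    genera = sorted({g for counts in counts_per_sample.values() for g in counts})
    matrix = {}
    for sample, counts in counts_per_sample.items():
        total = sum(counts.values())
        if total == 0:
            raise ValueError(f"sample {sample!r} has no reads")
        # Genera absent from a sample get an abundance of 0.
        matrix[sample] = [counts.get(g, 0) / total for g in genera]
    return genera, matrix


# Toy counts, made up for illustration (not data from the dissertation).
toy_counts = {
    "sample_A": {"Bifidobacterium": 80, "Bacteroides": 20},
    "sample_B": {"Bacteroides": 50, "Escherichia": 50},
}
genera, matrix = relative_abundance(toy_counts)
```

Each row of the resulting matrix sums to 1, so samples with different sequencing depths become comparable before any multivariate analysis is applied.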
3. This dissertation develops LightCUD, a machine learning model for disease discrimination based on the human gut microbiome, designed to discriminate ulcerative colitis (UC) and Crohn’s disease (CD) from non-IBD colitis, where IBD denotes inflammatory bowel disease. Using WGS data from 349 human gut microbiota samples covering two types of IBD and healthy controls, we assembled and aligned WGS short reads to obtain feature profiles. Owing to careful feature selection and a comparison of machine learning algorithms, LightCUD outperforms other pilot studies. LightCUD is implemented in Python and packaged for free installation with customized databases. Given WGS data or 16S rDNA sequencing data of gut microbiota samples as input, LightCUD can discriminate IBD from healthy controls with high accuracy and further identify the specific type of IBD. The executable program LightCUD is released as open source at http://cqb.pku.edu.cn/zhulab/lightcud/.
4. This dissertation constructed a comprehensive database, named DREEM, of Disease-RElatEd Marker genes in the human gut microbiome, built from large-scale WGS data released in GenBank and EMBL. Short reads totaling 18.63T from 1,729 samples were processed with a unified procedure involving state-of-the-art bioinformatics tools and well-designed statistical analysis. The samples cover six types of pathological conditions: type 2 diabetes (T2D), Crohn’s disease, ulcerative colitis, liver cirrhosis, symptomatic atherosclerosis and obesity. Furthermore, the database annotates the disease-related marker genes functionally and taxonomically. DREEM contains 1,953,046 disease-related marker genes and 5,100 core genes. The database is accessible at http://cqb.pku.edu.cn/ZhuLab/DREEM.
In summary, this dissertation conducted a pan-microbiome analysis integrating multiple diseases, revealed the aging progression of the human gut microbiota, released the tool LightCUD for discriminating diseases based on the human gut microbiome, and constructed a database of disease-related marker genes in the human gut microbiota.
Sponsor
Date Issued
2020-07-06
Extent
Resource Type
Text
Resource Subtype
Dissertation