Organizational Unit:
Center for the Study of Systems Biology

Publication Search Results

Now showing 1 - 10 of 25

Interactome networks

2007-11-27 , Vidal, Marc

For over half a century it has been conjectured that macromolecules form complex networks of functionally interacting components, and that the molecular mechanisms underlying most biological processes correspond to particular steady states adopted by such cellular networks. However, until recently, systems-level theoretical conjectures remained largely unappreciated, mainly because of lack of supporting experimental data. To generate the information necessary to eventually address how complex cellular networks relate to biology, we initiated, at the scale of the whole proteome, an integrated approach for modeling protein-protein interaction or "interactome" networks. Our main questions are: How are interactome networks organized at the scale of the whole cell? How can we uncover local and global features underlying this organization, and how are interactome networks modified in human disease, such as cancer?

From Systems Biology to Systems Analytics: Seeing More by Looking at Less

2007-10-09 , Mizaikoff, Boris

Systematic analysis of interactions between molecules and biological entities requires the development and application of experimental tools and analytical methods to quantitatively measure and image molecular events, molecular pathways, and molecular signals at the level of individual cells, ensembles of small biological entities and entire organisms with the required molecular selectivity, sensitivity, and temporal/spatial resolution. While it is evident that current analytical techniques are frequently limited to averaged measurements or ex-situ analysis, the analytical challenges for in-situ multi-parametric characterization of living biological entities such as cells, microbes, bacteria or ensembles thereof remain significant. Hence, in analogy and as a complement to Systems Biology, which is concerned with deciphering complex molecular processes and their relation to biological functionalities, we view Systems Analytics as the toolbox enabling the quantitative determination of multiple molecular parameters to elucidate these interactions and relations. From the analytical chemistry point of view, we may describe individual cells as a measurement compartment with spatial/volume dimensions in the μm-nm/μL-nL range, and quantitative molecular dimensions in the mM-nM domain. The spatial dimensionality of molecular events within or at cellular compartments (e.g. vesicular processes) or at the cell surface (e.g. exo- or endocytosis), along with the magnitude of the local species concentration, determines the need for quantitative analytical measurements at the micro- and nanoscale. We will discuss the diversity of measurement challenges at these compartments, which include the small dimensions of the involved samples and volumes, the complex and frequently changing background matrix, the sensitivity and/or discriminatory power of in-situ analytical techniques, and their temporal and/or spatial resolution to quantitatively monitor dynamic processes associated with cellular functions.
In turn, individual optical/spectroscopic, electrochemical, and surface sensitive analytical techniques have already demonstrated their potential at the macro- and microscopic level, i.e. identifying which molecular species are present, their concentration, their location, and, ideally, the kinetics and dynamics of the involved molecular processes. In contrast to approaches utilizing individual analytical techniques, the development of generic multifunctional analytical platforms orchestrates a suite of complementary measurement techniques to cooperatively investigate complex biological systems, complemented by the development of (bio)sensing chemistries, synthetic molecular receptors, multivariate evaluation techniques, and micro/nanofabrication for functional system miniaturization. Thereby, we capitalize on the benefits of several analytical techniques addressing the conformational, electrochemical, and spectroscopic properties of the sample, leading toward simultaneous rather than the classical sequential information acquisition, and aiming at maximizing the synchronicity between multiple methods in the temporal and spatial domain.

High precision multi-genome scale reannotation of enzyme function by EFICAz

2006-12-13 , Arakaki, Adrian K. , Tian, Weidong , Skolnick, Jeffrey

Background: The functional annotation of most genes in newly sequenced genomes is inferred from similarity to previously characterized sequences, an annotation strategy that often leads to erroneous assignments. We have performed a reannotation of 245 genomes using an updated version of EFICAz, a highly precise method for enzyme function prediction. Results: Based on our three-field EC number predictions, we have obtained lower-bound estimates for the average enzyme content in Archaea (29%), Bacteria (30%) and Eukarya (18%). Most annotations added in KEGG from 2005 to 2006 agree with EFICAz predictions made in 2005. The coverage of EFICAz predictions is significantly higher than that of KEGG, especially for eukaryotes. Thousands of our novel predictions correspond to hypothetical proteins. We have identified a subset of 64 hypothetical proteins with low sequence identity to EFICAz training enzymes, whose biochemical functions have recently been characterized, and find that in 96% (84%) of the cases we correctly identified their three-field (four-field) EC numbers. For two of the 64 hypothetical proteins, PA1167 from Pseudomonas aeruginosa, an alginate lyase (EC 4.2.2.3), and Rv1700 of Mycobacterium tuberculosis H37Rv, an ADP-ribose diphosphatase (EC 3.6.1.13), we have detected an annotation lag of more than two years in databases. Two examples are presented where EFICAz predictions act as hypothesis generators for understanding the functional roles of hypothetical proteins: FLJ11151, a human protein overexpressed in cancer that EFICAz identifies as an endopolyphosphatase (EC 3.6.1.10), and MW0119, a protein of Staphylococcus aureus strain MW2 that we propose as a candidate virulence factor based on its EFICAz predicted activity, sphingomyelin phosphodiesterase (EC 3.1.4.12). Conclusion: Our results suggest that we have generated enzyme function annotations of high precision and recall.
These predictions can be mined and correlated with other information sources to generate biologically significant hypotheses and can be useful for comparative genome analysis and automated metabolic pathway reconstruction.

Structure Modeling of All Identified G Protein–Coupled Receptors in the Human Genome

2006-02 , Zhang, Yang , DeVries, Mark E. , Skolnick, Jeffrey

G protein–coupled receptors (GPCRs), encoded by about 5% of human genes, comprise the largest family of integral membrane proteins and act as cell surface receptors responsible for the transduction of endogenous signal into a cellular response. Although tertiary structural information is crucial for function annotation and drug design, there are few experimentally determined GPCR structures. To address this issue, we employ the recently developed threading assembly refinement (TASSER) method to generate structure predictions for all 907 putative GPCRs in the human genome. Unlike traditional homology modeling approaches, TASSER modeling does not require solved homologous template structures; moreover, it often refines the structures closer to native. These features are essential for the comprehensive modeling of all human GPCRs when close homologous templates are absent. Based on a benchmarked confidence score, approximately 820 predicted models should have the correct folds. The majority of GPCR models share the characteristic seven-transmembrane helix topology, but 45 ORFs are predicted to have different structures. This is due to GPCR fragments that are predominantly from extracellular or intracellular domains as well as database annotation errors. Our preliminary validation includes the automated modeling of bovine rhodopsin, the only solved GPCR in the Protein Data Bank. With homologous templates excluded, the final model built by TASSER has a global Cα root-mean-squared deviation from native of 4.6 Å, with a root-mean-squared deviation in the transmembrane helix region of 2.1 Å. Models of several representative GPCRs are compared with mutagenesis and affinity labeling data, and consistent agreement is demonstrated. Structure clustering of the predicted models shows that GPCRs with similar structures tend to belong to a similar functional class even when their sequences are diverse.
These results demonstrate the usefulness and robustness of the in silico models for GPCR functional analysis.

Hybrid Experiments: Linking Real-Time Simulations to In Vitro Electrophysiology Experiments

2007-10-30 , Butera, Robert J.

Ab initio protein structure prediction using chunk-TASSER

2007-09 , Zhou, Hongyi , Skolnick, Jeffrey

We have developed an ab initio protein structure prediction method called chunk-TASSER that uses ab initio folded supersecondary structure chunks of a given target as well as threading templates for obtaining contact potentials and distance restraints. The predicted chunks, selected on the basis of a new fragment comparison method, are folded by a fragment insertion method. Full-length models are built and refined by the TASSER methodology, which searches conformational space via parallel hyperbolic Monte Carlo. We employ an optimized reduced force field that includes knowledge-based statistical potentials and restraints derived from the chunks as well as threading templates. The method is tested on a dataset of 425 hard target proteins ≤250 amino acids in length. The average TM-scores of the best of top five models per target are 0.266, 0.336, and 0.362 by the threading algorithm SP3, original TASSER and chunk-TASSER, respectively. For a subset of 80 proteins with predicted α-helix content ≥50%, these averages are 0.284, 0.356, and 0.403, respectively. The percentages of proteins with the best of top five models having TM-score ≥0.4 (a statistically significant threshold for structural similarity) are 3.76, 20.94, and 28.94% by SP3, TASSER, and chunk-TASSER, respectively, overall, while for the subset of 80 predominantly helical proteins, these percentages are 2.50, 23.75, and 41.25%. Thus, chunk-TASSER shows a significant improvement over TASSER for modeling hard targets where no good template can be identified. We also tested chunk-TASSER on 21 medium/hard targets <200 amino acids long from CASP7. Chunk-TASSER is ~11% (10%) better than TASSER for the total TM-score of the first (best of top five) models. Chunk-TASSER is fully automated and can be used in proteome scale protein structure prediction.
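
The TM-score figures quoted in this abstract can be made concrete with a small sketch. The code below is not the authors' software. It assumes the commonly used TM-score definition, with distance scale d0 = 1.24 (L - 15)^(1/3) - 1.8 Å for a target of L residues, and it evaluates the score for one fixed superposition only, whereas the full TM-score takes the maximum over superpositions. The function names, the 0.5 Å floor on d0, and the helper for the 0.4 threshold are assumptions of this sketch.

```python
def tm_d0(target_length):
    """Distance scale d0 in angstroms for a target of the given length.

    The 0.5 angstrom floor for very short chains is an assumption of this
    sketch; it also avoids a fractional power of a negative number.
    """
    if target_length <= 15:
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score_fixed_superposition(distances, target_length):
    """TM-score-style sum for ONE given superposition.

    distances: per-residue C-alpha distances (angstroms) between aligned
    model and native residues. The true TM-score maximizes this quantity
    over all superpositions, which this sketch does not attempt.
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


def fraction_above_threshold(scores, threshold=0.4):
    """Fraction of targets whose score reaches the threshold (0.4 in the abstract)."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(1 for s in scores if s >= threshold) / len(scores)
```

Normalizing by the target length rather than by the number of aligned residues means unaligned residues lower the score, which is why a perfect score requires every residue to be placed correctly.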

TASSER-Lite: an automated tool for protein comparative modeling

2006-12 , Pandit, Shashi Bhushan , Zhang, Yang , Skolnick, Jeffrey

This study involves the development of a rapid comparative modeling tool for homologous sequences by extension of the TASSER methodology, developed for tertiary structure prediction. This comparative modeling procedure was validated on a representative benchmark set of proteins in the Protein Data Bank composed of 901 single domain proteins (41-200 residues) having sequence identities between 35-90% with respect to the template. Using a Monte Carlo search scheme with the length of runs optimized for weakly/nonhomologous proteins, TASSER often provides appreciable improvement in structure quality over the initial template. However, on average, this requires ~29 h of CPU time per sequence. Since homologous proteins are unlikely to require the extent of conformational search as weakly/nonhomologous proteins, TASSER's parameters were optimized to reduce the required CPU time to ~17 min, while retaining TASSER's ability to improve structure quality. Using this optimized TASSER (TASSER-Lite), we find an average improvement in the aligned region of ~10% in root mean-square deviation from native over the initial template. Comparison of TASSER-Lite with the widely used comparative modeling tool MODELLER showed that TASSER-Lite yields final models that are closer to the native. TASSER-Lite is provided on the web at http://cssb.biology.gatech.edu/skolnick/webservice/tasserlite/index.html.

Relating Cellular to Molecular Specificity – the Recognition Mechanism of Hox Proteins and Cadherins

2007-10-23 , Honig, Barry

Ab initio modeling of small proteins by iterative TASSER simulations

2007-05-08 , Wu, Sitao , Skolnick, Jeffrey , Zhang, Yang

Background: Predicting 3-dimensional protein structures from amino-acid sequences is an important unsolved problem in computational structural biology. The problem becomes relatively easier if close homologous proteins have been solved, as high-resolution models can be built by aligning target sequences to the solved homologous structures. However, for sequences without similar folds in the Protein Data Bank (PDB) library, the models have to be predicted from scratch. Progress in ab initio structure modeling is slow. The aim of this study was to extend the TASSER (threading/assembly/refinement) method to ab initio modeling and examine systematically its ability to fold small single-domain proteins. Results: We developed I-TASSER by iteratively implementing the TASSER method, which is used in the folding test of three benchmarks of small proteins. First, data on 16 small proteins (< 90 residues) were used to generate I-TASSER models, which had an average Cα-root mean square deviation (RMSD) of 3.8Å, with 6 of them having a Cα-RMSD < 2.5Å. The overall result was comparable with the all-atomic ROSETTA simulation, but the central processing unit (CPU) time by I-TASSER was much shorter (5 CPU hours vs. 150 CPU days for ROSETTA). Second, data on 20 small proteins (< 120 residues) were used. I-TASSER folded four of them with a Cα-RMSD < 2.5Å. The average Cα-RMSD of the I-TASSER models was 3.9Å, whereas it was 5.9Å using TOUCHSTONE-II software. Finally, 20 non-homologous small proteins (< 120 residues) were taken from the PDB library. An average Cα-RMSD of 3.9Å was obtained for the third benchmark, with seven cases having a Cα-RMSD < 2.5Å. Conclusion: Our simulation results show that I-TASSER can consistently predict the correct folds and sometimes high-resolution models for small single-domain proteins.
Compared with other ab initio modeling methods such as ROSETTA and TOUCHSTONE II, the average performance of I-TASSER is either much better or similar with a lower computational time. These data, together with the significant performance of the automated I-TASSER server (the Zhang-Server) in the 'free modeling' section of the recent Critical Assessment of Structure Prediction (CASP)7 experiment, demonstrate new progress in automated ab initio model generation. The I-TASSER server is freely available for academic users at http://zhang.bioinformatics.ku.edu/I-TASSER.
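
For readers unfamiliar with the Cα-RMSD values reported in this abstract, the following is a minimal sketch of the quantity itself. It is not taken from I-TASSER. It assumes the model and native Cα coordinates are already paired residue by residue and already optimally superimposed, so it omits the superposition step that a full RMSD calculation would perform first. The function name and the coordinate format (tuples of x, y, z in Å) are hypothetical.

```python
import math


def ca_rmsd_presuperimposed(model, native):
    """Root-mean-square deviation between two paired C-alpha coordinate lists.

    Assumes both lists are already superimposed; no rotation or translation
    is applied here, which is a simplification of this sketch.
    """
    if len(model) != len(native) or not model:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for (mx, my, mz), (nx, ny, nz) in zip(model, native):
        # Squared Euclidean distance between the paired atoms.
        squared_sum += (mx - nx) ** 2 + (my - ny) ** 2 + (mz - nz) ** 2
    return math.sqrt(squared_sum / len(model))
```

A model could then be compared against the 2.5 Å cutoff used in the abstract with a check such as `ca_rmsd_presuperimposed(model, native) < 2.5`.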

Onset of anthrax toxin pore formation

2006-05 , Gao, Mu , Schulten, Klaus

Protective antigen (PA) is the anthrax toxin protein recognized by capillary morphogenesis gene 2 (CMG2), a transmembrane cellular receptor. Upon activation, seven ligand-receptor units self-assemble into a heptameric ring-like complex that becomes endocytosed by the host cell. A critical step in the subsequent intoxication process is the formation and insertion of a pore into the endosome membrane by PA. The pore conversion requires a change in binding between PA and its receptor in the acidified endosome environment. Molecular dynamics simulations totaling ~136 ns on systems of over 92,000 atoms were performed. The simulations revealed how the PA-CMG2 complex, stable at neutral conditions, becomes transformed at low pH upon protonation of His-121 and Glu-122, two conserved amino acids of the receptor. The protonation disrupts a salt bridge important for the binding stability and leads to the detachment of PA domain II, which weakens the stability of the PA-CMG2 complex significantly, and subsequently releases a PA segment needed for pore formation. The simulations also explain the great strength of the PA-CMG2 complex, achieved through extraordinary coordination of a divalent cation.