Skolnick, Jeffrey

Publication Search Results

Now showing 1 - 10 of 71
  • Item
    Development of a Comprehensive Integrated Platform for Translational Innovation in Pain Opioid Abuse Disorder and Overdose
    (Georgia Institute of Technology, 2022) Skolnick, Jeffrey
    Video summary of research project "Development of a Comprehensive Integrated Platform for Translational Innovation in Pain Opioid Abuse Disorder and Overdose"
  • Item
    The Possible Origin of the Biochemical Function of Proteins and its Implications for the Origin of Life
    (Georgia Institute of Technology, 2020-03-10) Skolnick, Jeffrey
    Living systems have chiral molecules; e.g., native proteins almost entirely contain L-amino acids. How protein homochirality emerged from a background of equal numbers of L and D amino acids is among many questions about life’s origin. The origin of homochirality and its implications are explored in computer simulations examining the stability, structural and functional properties of an artificial library of compact proteins containing 1:1, termed demi-chiral, 3:1 and 1:3 ratios of D:L and purely L or D amino acids generated without functional selection. Demi-chiral proteins have shorter secondary structures, fewer internal hydrogen bonds, and are less stable than homochiral proteins. Selection for hydrogen bonding yields a preponderance of L or D amino acids. Demi-chiral proteins have native global folds, including similarity to early ribosomal proteins, similar small molecule ligand binding pocket geometries, and many constellations of L-chiral amino acids with a 1.0 Å RMSD to native enzyme active sites. For a representative subset containing 550 active site geometries matching 457 (2) four (three) E.C. digits, native active site amino acids were generated at random for 472/550 cases. This increases to 548/550 cases when similar residues are allowed. The most frequently generated sequences correspond to ancient enzymatic functions, e.g., glycolysis, replication, and nucleotide biosynthesis. Surprisingly, even without selection, demi-chiral proteins possess the requisite marginal biochemical function and structure of modern proteins, but were thermodynamically less stable. If demi-chiral proteins were present, they could engage in early metabolism, which created the feedback loop for transcription and cell formation.
  • Item
    Krylov subspace methods for computing hydrodynamic interactions in Brownian dynamics simulations
    (Georgia Institute of Technology, 2012-08) Ando, Tadashi ; Chow, Edmond ; Saad, Yousef ; Skolnick, Jeffrey
    Hydrodynamic interactions play an important role in the dynamics of macromolecules. The most common way to take into account hydrodynamic effects in molecular simulations is in the context of a Brownian dynamics simulation. However, the calculation of correlated Brownian noise vectors in these simulations is computationally very demanding and alternative methods are desirable. This paper studies methods based on Krylov subspaces for computing Brownian noise vectors. These methods are related to Chebyshev polynomial approximations, but do not require eigenvalue estimates. We show that only low accuracy is required in the Brownian noise vectors to accurately compute values of dynamic and static properties of polymer and monodisperse suspension models. With this level of accuracy, the computational time of Krylov subspace methods scales very nearly as O(N²) for the number of particles N up to 10 000, which was the limit tested. The performance of the Krylov subspace methods, especially the “block” version, is slightly better than that of the Chebyshev method, even without taking into account the additional cost of eigenvalue estimates required by the latter. Furthermore, at N = 10 000, the Krylov subspace method is 13 times faster than the exact Cholesky method. Thus, Krylov subspace methods are recommended for performing large-scale Brownian dynamics simulations with hydrodynamic interactions.
  • Item
    GOAP: A Generalized Orientation-Dependent, All-Atom Statistical Potential for Protein Structure Prediction
    (Georgia Institute of Technology, 2011-10) Zhou, Hongyi ; Skolnick, Jeffrey
    An accurate scoring function is a key component for successful protein structure prediction. To address this important unsolved problem, we develop a generalized orientation and distance-dependent all-atom statistical potential. The new statistical potential, generalized orientation-dependent all-atom potential (GOAP), depends on the relative orientation of the planes associated with each heavy atom in interacting pairs. GOAP is a generalization of previous orientation-dependent potentials that consider only representative atoms or blocks of side-chain or polar atoms. GOAP is decomposed into distance- and angle-dependent contributions. The DFIRE distance-scaled finite ideal gas reference state is employed for the distance-dependent component of GOAP. GOAP was tested on 11 commonly used decoy sets containing 278 targets, and recognized 226 native structures as best from the decoys, whereas DFIRE recognized 127 targets. The major improvement comes from decoy sets that have homology-modeled structures that are close to native (all within ∼4.0 Å) or from the ROSETTA ab initio decoy set. For these two kinds of decoys, orientation-independent DFIRE or only side-chain orientation-dependent RWplus performed poorly. Although the OPUS-PSP block-based orientation-dependent, side-chain atom contact potential performs much better (recognizing 196 targets) than DFIRE, RWplus, and dDFIRE, it is still ∼15% worse than GOAP. Thus, GOAP is a promising advance in knowledge-based, all-atom statistical potentials. GOAP is available for download at
  • Item
    Brownian dynamics simulation of macromolecule diffusion in a protocell
    (Georgia Institute of Technology, 2011) Ando, Tadashi ; Skolnick, Jeffrey
    The interiors of all living cells are highly crowded with macromolecules, which causes the thermodynamics and kinetics of biological reactions to differ considerably between in vivo and in vitro conditions. For example, the diffusion of green fluorescent protein (GFP) in E. coli is ~10-fold slower than in dilute conditions. In this study, we performed Brownian dynamics (BD) simulations of rigid macromolecules in a crowded environment mimicking the cytosol of E. coli to study the motions of macromolecules. The simulation systems contained 35 70S ribosomes, 750 glycolytic enzymes, 75 GFPs, and 392 tRNAs in a 100 nm × 100 nm × 100 nm simulation box, where the macromolecules were represented by rigid objects of one bead per amino acid or four beads per nucleotide models. Diffusion tensors of these molecules in dilute solutions were estimated by using a hydrodynamic theory to take into account the diffusion anisotropy of arbitrarily shaped objects in the BD simulations. BD simulations of the system where each macromolecule is represented by its Stokes radius were also performed for comparison. Excluded volume effects greatly reduce the mobility of molecules in crowded environments for both molecular-shaped and equivalent sphere systems. Additionally, there were no significant differences in the reduction of diffusivity over the entire range of molecular size between the two systems. However, the reduction in diffusion of GFP in these systems was still 4-5 times larger than for the in vivo experiment. We will discuss other plausible factors that might cause the large reduction in diffusion in vivo.
  • Item
    TASSER_WT: A protein structure prediction algorithm with accurate predicted contact restraints for difficult protein targets
    (Georgia Institute of Technology, 2010-11) Lee, Seung Yup ; Skolnick, Jeffrey
    To improve the prediction accuracy in the regime where template alignment quality is poor, an updated version of TASSER_2.0, namely TASSER_WT, was developed. TASSER_WT incorporates more accurate contact restraints from a new method, COMBCON. COMBCON uses confidence-weighted contacts from PROSPECTOR_3.5, the latest version, PROSPECTOR_4, and a new local structural fragment-based threading algorithm, STITCH, implemented in two variants depending on expected fragment prediction accuracy. TASSER_WT is tested on 622 Hard proteins, the most difficult targets (incorrect alignments and/or templates and incorrect side-chain contact restraints) in a comprehensive benchmark of 2591 nonhomologous, single domain proteins ≤200 residues that cover the PDB at 35% pairwise sequence identity. For 454 of 622 Hard targets, COMBCON provides contact restraints with higher accuracy and number of contacts per residue. As contact coverage with confidence weight ≥3 (Fwt≥3 cov) increases, the more improved are TASSER_WT models. When Fwt≥3 cov > 1.0 and > 0.4, the average root mean-square deviation of TASSER_WT (TASSER_2.0) models is 4.11 Å (6.72 Å) and 5.03 Å (6.40 Å), respectively. Regarding a structure prediction as successful when a model has a TM-score to the native structure ≥0.4, when Fwt≥3 cov > 1.0 and > 0.4, the success rate of TASSER_WT (TASSER_2.0) is 98.8% (76.2%) and 93.7% (81.1%), respectively.
  • Item
    A Threading-Based Method for the Prediction of DNA-Binding Proteins with Application to the Human Genome
    (Georgia Institute of Technology, 2009-11-13) Gao, Mu ; Skolnick, Jeffrey
    Diverse mechanisms for DNA-protein recognition have been elucidated in numerous atomic complex structures from various protein families. These structural data provide an invaluable knowledge base not only for understanding DNA-protein interactions, but also for developing specialized methods that predict the DNA-binding function from protein structure. While such methods are useful, a major limitation is that they require an experimental structure of the target as input. To overcome this obstacle, we develop a threading-based method, DNA-Binding-Domain-Threader (DBD-Threader), for the prediction of DNA-binding domains and associated DNA-binding protein residues. Our method, which uses a template library composed of DNA-protein complex structures, requires only the target protein’s sequence. In our approach, fold similarity and DNA-binding propensity are employed as two functional discriminating properties. In benchmark tests on 179 DNA-binding and 3,797 non-DNA-binding proteins, using templates whose sequence identity is less than 30% to the target, DBD-Threader achieves a sensitivity/precision of 56%/86%. This performance is considerably better than the standard sequence comparison method PSI-BLAST and is comparable to DBD-Hunter, which requires an experimental structure as input. Moreover, for over 70% of predicted DNA-binding domains, the backbone Root Mean Square Deviations (RMSDs) of the top-ranked structural models are within 6.5 Å of their experimental structures, with their associated DNA binding sites identified at satisfactory accuracy. Additionally, DBD-Threader correctly assigned the SCOP superfamily for most predicted domains. To demonstrate that DBD-Threader is useful for automatic function annotation on a large scale, DBD-Threader was applied to 18,631 protein sequences from the human genome; 1,654 proteins are predicted to have DNA-binding function. Comparison with existing Gene Ontology (GO) annotations suggests that ~30% of our predictions are new. Finally, we present some interesting predictions in detail. In particular, it is estimated that 20% of classic zinc finger domains play a functional role not related to direct DNA-binding.
  • Item
    FINDSITE LHM: A Threading-Based Approach to Ligand Homology Modeling
    (Georgia Institute of Technology, 2009-06-05) Brylinski, Michal ; Skolnick, Jeffrey
    Ligand virtual screening is a widely used tool to assist in new pharmaceutical discovery. In practice, virtual screening approaches have a number of limitations, and the development of new methodologies is required. Previously, we showed that remotely related proteins identified by threading often share a common binding site occupied by chemically similar ligands. Here, we demonstrate that across an evolutionarily related, but distant family of proteins, the ligands that bind to the common binding site contain a set of strongly conserved anchor functional groups as well as a variable region that accounts for their binding specificity. Furthermore, the sequence and structure conservation of residues contacting the anchor functional groups is significantly higher than those contacting ligand variable regions. Exploiting these insights, we developed FINDSITE LHM that employs structural information extracted from weakly related proteins to perform rapid ligand docking by homology modeling. In large scale benchmarking, using the predicted anchor-binding mode and the crystal structure of the receptor, FINDSITE LHM outperforms classical docking approaches with an average ligand RMSD from native of ~2.5 Å. For weakly homologous receptor protein models, using FINDSITE LHM, the fraction of recovered binding residues and specific contacts is 0.66 (0.55) and 0.49 (0.38) for highly confident (all) targets, respectively. Finally, in virtual screening for HIV-1 protease inhibitors, using similarity to the ligand anchor region yields significantly improved enrichment factors. Thus, the rather accurate, computationally inexpensive FINDSITE LHM algorithm should be a useful approach to assist in the discovery of novel biopharmaceuticals.
  • Item
    EFICAz²: enzyme function inference by a combined approach enhanced by machine learning
    (Georgia Institute of Technology, 2009-04-13) Arakaki, Adrian K. ; Huang, Ying ; Skolnick, Jeffrey
    Background: We previously developed EFICAz, an enzyme function inference approach that combines predictions from non-completely overlapping component methods. Two of the four components in the original EFICAz are based on the detection of functionally discriminating residues (FDRs). FDRs distinguish between members of an enzyme family that are homofunctional (classified under the EC number of interest) or heterofunctional (annotated with another EC number or lacking enzymatic activity). Each of the two FDR-based components is associated with one of two specific kinds of enzyme families. EFICAz exhibits high precision performance, except when the maximal test to training sequence identity (MTTSI) is lower than 30%. To improve EFICAz's performance in this regime, we: i) increased the number of predictive components and ii) took advantage of consensual information from the different components to make the final EC number assignment. Results: We have developed two new EFICAz components, analogous to the two FDR-based components, where the discrimination between homo- and heterofunctional members is based on the evaluation, via Support Vector Machine models, of all the aligned positions between the query sequence and the multiple sequence alignments associated with the enzyme families. Benchmark results indicate that: i) the new SVM-based components outperform their FDR-based counterparts, and ii) both SVM-based and FDR-based components generate unique predictions. We developed classification tree models to optimally combine the results from the six EFICAz components into a final EC number prediction. The new implementation of our approach, EFICAz², exhibits a highly improved prediction precision at MTTSI < 30% compared to the original EFICAz, with only a slight decrease in prediction recall. A comparative analysis of enzyme function annotation of the human proteome by EFICAz² and KEGG shows that: i) when both sources make EC number assignments for the same protein sequence, the assignments tend to be consistent and ii) EFICAz² generates considerably more unique assignments than KEGG. Conclusion: Performance benchmarks and the comparison with KEGG demonstrate that EFICAz² is a powerful and precise tool for enzyme function annotation, with multiple applications in genome analysis and metabolic pathway reconstruction. The EFICAz² web service is available at: