Title:
STORI: selectable taxon ortholog retrieval iteratively

dc.contributor.advisor Gaucher, Eric A.
dc.contributor.author Stern, Joshua Gallant
dc.contributor.committeeMember Hammer, Brian K.
dc.contributor.committeeMember Dunham, Christine M.
dc.contributor.committeeMember Jordan, Irving K.
dc.contributor.committeeMember Snell, Terry W.
dc.contributor.department Biology
dc.date.accessioned 2015-06-08T17:51:00Z
dc.date.available 2015-06-08T17:51:00Z
dc.date.created 2013-12
dc.date.issued 2013-08-07
dc.date.submitted December 2013
dc.date.updated 2015-06-08T17:51:00Z
dc.description.abstract Speciation and gene duplication are fundamental evolutionary processes that enable biological innovation. For over a decade, biologists have endeavored to distinguish orthology (homology caused by speciation) from paralogy (homology caused by duplication). Disentangling orthology and paralogy is useful to diverse fields such as phylogenetics, protein engineering, and genome content comparison. A common step in ortholog detection is the computation of Bidirectional Best Hits (BBH). However, we found this computation impractical for more than 24 Eukaryotic proteomes. Attempting to retrieve orthologs in less time than previous methods require, we developed a novel algorithm and implemented it as a suite of Perl scripts. This software, Selectable Taxon Ortholog Retrieval Iteratively (STORI), retrieves orthologous protein sequences for a set of user-defined proteomes and query sequences. While the time complexity of the BBH method is O(#taxa^2), we found that the average CPU time used by STORI may increase linearly with the number of taxa. To demonstrate one aspect of STORI’s usefulness, we used this software to infer the orthologous sequences of 26 ribosomal proteins (rProteins) from the large ribosomal subunit (LSU), for a set of 115 Bacterial and 94 Archaeal proteomes. Next, we used established tree-search methods to seek the most probable evolutionary explanation of these data. The current implementation of STORI runs on Red Hat Enterprise Linux 6.0 with installations of Moab 5.3.7, Perl 5 and several Perl modules. STORI is available at: <http://github.com/jgstern/STORI>.
dc.description.degree M.S.
dc.format.mimetype application/pdf
dc.identifier.uri http://hdl.handle.net/1853/53377
dc.language.iso en_US
dc.publisher Georgia Institute of Technology
dc.subject Ortholog prediction
dc.subject Protein sequence retrieval
dc.subject Modeling bacterial evolution
dc.subject Fusobacteria
dc.subject Molecular sequence data management
dc.subject Bayesian inference phylogenomics
dc.title STORI: selectable taxon ortholog retrieval iteratively
dc.type Text
dc.type.genre Thesis
dspace.entity.type Publication
local.contributor.corporatename College of Sciences
local.contributor.corporatename School of Biological Sciences
relation.isOrgUnitOfPublication 85042be6-2d68-4e07-b384-e1f908fae48a
relation.isOrgUnitOfPublication c8b3bd08-9989-40d3-afe3-e0ad8d5c72b5
thesis.degree.level Masters
Files
Original bundle
Name: STERN-THESIS-2013.pdf
Size: 6.63 MB
Format: Adobe Portable Document Format
Name: STORI_SmarTech.zip
Size: 39.3 MB
Format: Unknown data format
License bundle
Name: LICENSE.txt
Size: 3.86 KB
Format: Plain Text