Title:
Evolutionary rates and patterns for human transcription factor binding sites derived from repetitive DNA

dc.contributor.author Polavarapu, Nalini en_US
dc.contributor.author Mariño-Ramírez, Leonardo en_US
dc.contributor.author Landsman, David en_US
dc.contributor.author McDonald, John F. en_US
dc.contributor.author Jordan, I. King en_US
dc.contributor.corporatename Georgia Institute of Technology. School of Biology en_US
dc.contributor.corporatename National Center for Biotechnology Information (U.S.) en_US
dc.date.accessioned 2009-10-13T20:15:04Z
dc.date.available 2009-10-13T20:15:04Z
dc.date.issued 2008-05-17
dc.description © 2008 Polavarapu et al; licensee BioMed Central Ltd. The electronic version of this article is the complete one and can be found online at: http://www.biomedcentral.com/1471-2164/9/226 en
dc.description DOI:10.1186/1471-2164-9-226
dc.description.abstract Background The majority of human non-protein-coding DNA is made up of repetitive sequences, mainly transposable elements (TEs). It is becoming increasingly apparent that many of these repetitive DNA sequence elements encode gene regulatory functions. This fact has important evolutionary implications, since repetitive DNA is the most dynamic part of the genome. We set out to assess the evolutionary rate and pattern of experimentally characterized human transcription factor binding sites (TFBS) that are derived from repetitive versus non-repetitive DNA to test whether repeat-derived TFBS are in fact rapidly evolving. We also evaluated the position-specific patterns of variation among TFBS to look for signs of functional constraint on TFBS derived from repetitive and non-repetitive DNA.
Results We found numerous experimentally characterized TFBS in the human genome, 7–10% of all mapped sites, which are derived from repetitive DNA sequences including simple sequence repeats (SSRs) and TEs. TE-derived TFBS sequences are far less conserved between species than TFBS derived from SSRs and non-repetitive DNA. Despite their rapid evolution, several lines of evidence indicate that TE-derived TFBS are functionally constrained. First of all, ancient TE families, such as MIR and L2, are enriched for TFBS relative to younger families like Alu and L1. Secondly, functionally important positions in TE-derived TFBS, specifically those residues thought to physically interact with their cognate protein binding factors (TF), are more evolutionarily conserved than adjacent TFBS positions. Finally, TE-derived TFBS show position-specific patterns of sequence variation that are highly distinct from random patterns and similar to the variation seen for non-repeat derived sequences of the same TFBS.
Conclusion The abundance of experimentally characterized human TFBS that are derived from repetitive DNA speaks to the substantial regulatory effects that this class of sequence has on the human genome. The unique evolutionary properties of repeat-derived TFBS are perhaps even more intriguing. TE-derived TFBS in particular, while clearly functionally constrained, evolve extremely rapidly relative to non-repeat derived sites. Such rapidly evolving TFBS are likely to confer species-specific regulatory phenotypes, i.e. divergent expression patterns, on the human evolutionary lineage. This result has practical implications with respect to the widespread use of evolutionary conservation as a surrogate for functionally relevant non-coding DNA. Most TE-derived TFBS would be missed using the kinds of sequence conservation-based screens, such as phylogenetic footprinting, that are used to help characterize non-coding DNA. Thus, the very TFBS that are most likely to yield human-specific characteristics will be neglected by the comparative genomic techniques that are currently de rigueur for the identification of novel regulatory sites. en
dc.identifier.citation Nalini Polavarapu, Leonardo Mariño-Ramírez, David Landsman, John F. McDonald and I. King Jordan, "Evolutionary rates and patterns for human transcription factor binding sites derived from repetitive DNA," BMC Genomics 2008, 9:226 en
dc.identifier.doi 10.1186/1471-2164-9-226
dc.identifier.issn 1471-2164
dc.identifier.uri http://hdl.handle.net/1853/30450
dc.language.iso en_US en
dc.publisher Georgia Institute of Technology en
dc.publisher.original BioMed Central
dc.subject Computational and functional genomics
dc.subject Gene expression
dc.subject Human genome
dc.subject Transposable elements
dc.subject TE-derived regulatory sequences
dc.title Evolutionary rates and patterns for human transcription factor binding sites derived from repetitive DNA en
dc.type Text
dc.type.genre Article
dspace.entity.type Publication
local.contributor.author Jordan, I. King
local.contributor.author McDonald, John F.
local.contributor.corporatename College of Sciences
local.contributor.corporatename School of Biological Sciences
relation.isAuthorOfPublication 1c155699-6f2d-418d-83cd-9e1424896d4f
relation.isAuthorOfPublication 747c573d-7e00-47e6-bd0c-1532a3dfc720
relation.isOrgUnitOfPublication 85042be6-2d68-4e07-b384-e1f908fae48a
relation.isOrgUnitOfPublication c8b3bd08-9989-40d3-afe3-e0ad8d5c72b5
Files
Original bundle
Name:
2008-BMC_007.pdf
Size:
546.63 KB
Format:
Adobe Portable Document Format
License bundle
Name:
license.txt
Size:
1.86 KB
Format:
Item-specific license agreed upon to submission