Evolutionary conservation of sequence and secondary structures in CRISPR repeats
Open Access
- 18 April 2007
- journal article
- research article
- Published by Springer Nature in Genome Biology
- Vol. 8 (4), 1-7
- https://doi.org/10.1186/gb-2007-8-4-r61
Abstract
Background: Clustered regularly interspaced short palindromic repeats (CRISPRs) are a novel class of direct repeats, separated by unique spacer sequences of similar length, that are present in approximately 40% of bacterial and most archaeal genomes analyzed to date. More than 40 gene families, called CRISPR-associated sequences (CASs), appear in conjunction with these repeats and are thought to be involved in the propagation and functioning of CRISPRs. It has been recently shown that CRISPR provides acquired resistance against viruses in prokaryotes.

Results: Here we analyze CRISPR repeats identified in 195 microbial genomes and show that they can be organized into multiple clusters based on sequence similarity. Some of the clusters present stable, highly conserved RNA secondary structures, while others lack detectable structures. Stable secondary structures exhibit multiple compensatory base changes in the stem region, indicating evolutionary and functional conservation.

Conclusion: We show that the repeat-based classification corresponds to, and expands upon, a previously reported CAS gene-based classification, including specific relationships between CRISPR and CAS subtypes.