Semiautomated improvement of RNA alignments
- 5 September 2007
- journal article
- Published by Cold Spring Harbor Laboratory in RNA
- Vol. 13 (11), 1850-1859
- https://doi.org/10.1261/rna.215407
Abstract
We have developed a semiautomated RNA sequence editor (SARSE) that integrates tools for analyzing RNA alignments. The editor highlights different properties of the alignment by color, and its integrated analysis tools prevent the introduction of errors during alignment editing. SARSE readily connects to external tools to provide a flexible semiautomatic editing environment. A new method, Pcluster, is introduced for dividing the sequences of an RNA alignment into subgroups with secondary structure differences. Pcluster was used to evaluate 574 seed alignments obtained from the Rfam database, and we identified 71 alignments with significant prediction of inconsistent base pairs and 102 alignments with significant prediction of novel base pairs. Four RNA families were used to illustrate how SARSE can be used to manually or automatically correct the inconsistent base pairs detected by Pcluster: the mir-399 RNA, vertebrate telomerase RNA (vert-TR), bacterial transfer-messenger RNA (tmRNA), and the signal recognition particle (SRP) RNA. The general use of the method is illustrated by the ability to accommodate pseudoknots and handle even large and divergent RNA families. The open architecture of the SARSE editor makes it a flexible tool to improve all RNA alignments with relatively little human intervention. Online documentation and software are available at http://sarse.ku.dk.