SeqAn An efficient, generic C++ library for sequence analysis
Open Access
- 9 January 2008
- journal article
- software
- Published by Springer Nature in BMC Bioinformatics
- Vol. 9 (1), 1-9
- https://doi.org/10.1186/1471-2105-9-11
Abstract
The use of novel algorithmic techniques is pivotal to many important problems in life science. For example, the sequencing of the human genome [1] would not have been possible without advanced assembly algorithms. However, owing to the high speed of technological progress and the urgent need for bioinformatics tools, there is a widening gap between state-of-the-art algorithmic techniques and the actual algorithmic components of tools that are in widespread use. To remedy this trend we propose the use of SeqAn, a library of efficient data types and algorithms for sequence analysis in computational biology. SeqAn comprises implementations of existing, practical state-of-the-art algorithmic components to provide a sound basis for algorithm testing and development. In this paper we describe the design and content of SeqAn and demonstrate its use by giving two examples. In the first example we show an application of SeqAn as an experimental platform by comparing different exact string matching algorithms. The second example is a simple version of the well-known MUMmer tool rewritten in SeqAn. Results indicate that our implementation is very efficient and versatile to use. We anticipate that SeqAn greatly simplifies the rapid development of new bioinformatics tools by providing a collection of readily usable, well-designed algorithmic components which are fundamental for the field of sequence analysis. This not only facilitates the implementation of new algorithms, but also enables a sound analysis and comparison of existing algorithms.
This publication has 34 references indexed in Scilit:
- LAGAN and Multi-LAGAN: Efficient Tools for Large-Scale Multiple Alignment of Genomic DNA. Genome Research, 2003
- T-Coffee: a novel method for fast and accurate multiple sequence alignment. Journal of Molecular Biology, 2000
- A Whole-Genome Assembly of Drosophila. Science, 2000
- A fast bit-vector algorithm for approximate string matching based on dynamic programming. Journal of the ACM, 1999
- Object-oriented sequence analysis: SCL - a C++ class library. Bioinformatics, 1996
- The CGAL kernel: A basis for geometric computation. Lecture Notes in Computer Science, 1996
- Introduction to Algorithms. Journal of the Operational Research Society, 1991
- Basic local alignment search tool. Journal of Molecular Biology, 1990
- An improved algorithm for matching biological sequences. Journal of Molecular Biology, 1982
- A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology, 1970