The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote

Open Access

3 April 2013

journal article
research article
Published by Oxford University Press (OUP) in Nucleic Acids Research

Vol. 41 (10), e108
https://doi.org/10.1093/nar/gkt214

Abstract

Read alignment is an ongoing challenge for the analysis of data from sequencing technologies. This article proposes an elegantly simple multi-seed strategy, called seed-and-vote, for mapping reads to a reference genome. The new strategy chooses the mapped genomic location for the read directly from the seeds. It uses a relatively large number of short seeds (called subreads) extracted from each read and allows all the seeds to vote on the optimal location. When the read length is <160 bp, overlapping subreads are used. More conventional alignment algorithms are then used to fill in detailed mismatch and indel information between the subreads that make up the winning voting block. The strategy is fast because the overall genomic location has already been chosen before the detailed alignment is done. It is sensitive because no individual subread is required to map exactly, nor are individual subreads constrained to map close by other subreads. It is accurate because the final location must be supported by several different subreads. The strategy extends easily to find exon junctions, by locating reads that contain sets of subreads mapping to different exons of the same gene. It scales up efficiently for longer reads.
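
The voting step described in the abstract can be illustrated with a toy sketch. This is not the Subread implementation: the function names, the hash-table index, the seed length `k`, the number of subreads and the tie-breaking rule are all assumptions made for illustration. Each subread that matches the reference exactly votes for the read start position it implies, and the position with the most votes wins. The sketch only counts votes for one exact start position, so it tolerates mismatches but not indels between subreads, and it omits the later detailed alignment step.

```python
from collections import Counter, defaultdict


def build_index(reference, k):
    """Map each k-mer of the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def extract_subreads(read, k, n_subreads):
    """Return (offset, seed) pairs at evenly spaced offsets along the read.

    Seeds overlap whenever n_subreads * k exceeds the read length.
    """
    last = len(read) - k  # largest valid seed offset
    if last < 0:
        return []
    if n_subreads == 1 or last == 0:
        return [(0, read[:k])]
    offsets = sorted({round(i * last / (n_subreads - 1)) for i in range(n_subreads)})
    return [(off, read[off:off + k]) for off in offsets]


def seed_and_vote(read, index, k, n_subreads=10):
    """Return (best_start, votes), or None if no subread hits the index."""
    votes = Counter()
    for offset, seed in extract_subreads(read, k, n_subreads):
        # A hit at reference position pos implies the read starts at pos - offset.
        for pos in index.get(seed, ()):
            votes[pos - offset] += 1
    if not votes:
        return None
    # Most votes wins; ties go to the leftmost start (an arbitrary choice here).
    return max(votes.items(), key=lambda kv: (kv[1], -kv[0]))
```

A read with a single mismatch loses only the votes of the subreads that cover the mismatch, so the remaining subreads still agree on the location. This is the sense in which no individual subread needs to map exactly.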

Keywords

GENOME

Cited by 2620 articles