ALLPATHS: De novo assembly of whole-genome shotgun microreads
Open Access
- 13 March 2008
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 18 (5), 810-820
- https://doi.org/10.1101/gr.7337908
Abstract
New DNA sequencing technologies deliver data at dramatically lower costs but demand new analytical methods to take full advantage of the very short reads that they produce. We provide an initial, theoretical solution to the challenge of de novo assembly from whole-genome shotgun “microreads.” For 11 genomes of sizes up to 39 Mb, we generated high-quality assemblies from 80× coverage by paired 30-base simulated reads modeled after real Illumina-Solexa reads. The bacterial genomes of Campylobacter jejuni and Escherichia coli assemble optimally, yielding single perfect contigs, and larger genomes yield assemblies that are highly connected and accurate. Assemblies are presented in a graph form that retains intrinsic ambiguities such as those arising from polymorphism, thereby providing information that has been absent from previous genome assemblies. For both C. jejuni and E. coli, this assembly graph is a single edge encompassing the entire genome. Larger genomes produce more complicated graphs, but the vast majority of the bases in their assemblies are present in long edges that are nearly always perfect. We describe a general method for genome assembly that can be applied to all types of DNA sequence data, not only short read data, but also conventional sequence reads.
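To give a sense of the data volume implied by the figures in the abstract, the following is a minimal sketch of the usual coverage arithmetic. It assumes that coverage means total read bases divided by genome size, and that "39 Mb" is exactly 39,000,000 bases; the paper may define either differently. The function names `reads_needed` and `pairs_needed` are illustrative and are not part of ALLPATHS.

```python
def reads_needed(genome_size_bp: int, coverage: int, read_length_bp: int) -> int:
    """Smallest read count N with N * read_length >= coverage * genome_size.

    Assumes coverage = (total bases in reads) / (genome size), which is the
    common definition and not necessarily the one used in the paper.
    """
    total_bases = coverage * genome_size_bp
    # Integer ceiling division avoids floating-point rounding.
    return (total_bases + read_length_bp - 1) // read_length_bp


def pairs_needed(genome_size_bp: int, coverage: int, read_length_bp: int) -> int:
    """Number of read pairs, counting two reads per pair (rounded up)."""
    n_reads = reads_needed(genome_size_bp, coverage, read_length_bp)
    return (n_reads + 1) // 2


if __name__ == "__main__":
    # Values from the abstract: 80x coverage, 30-base reads.
    # The genome size is an assumed round figure for the largest genome.
    genome = 39_000_000
    print(reads_needed(genome, 80, 30))
    print(pairs_needed(genome, 80, 30))
```

Under these assumptions, the largest genome in the study corresponds to roughly 104 million 30-base reads, or about 52 million read pairs.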