High-quality draft assemblies of mammalian genomes from massively parallel sequence data
- 27 December 2010
- journal article
- research article
- Published in Proceedings of the National Academy of Sciences
- Vol. 108 (4), 1513-1518
- https://doi.org/10.1073/pnas.1017351108
Abstract
Massively parallel DNA sequencing technologies are revolutionizing genomics by making it possible to generate billions of relatively short (~100-base) sequence reads at very low cost. Whereas such data can be readily used for a wide range of biomedical applications, it has proven difficult to use them to generate high-quality de novo genome assemblies of large, repeat-rich vertebrate genomes. To date, the genome assemblies generated from such data have fallen far short of those obtained with the older (but much more expensive) capillary-based sequencing approach. Here, we report the development of an algorithm for genome assembly, ALLPATHS-LG, and its application to massively parallel DNA sequence data from the human and mouse genomes, generated on the Illumina platform. The resulting draft genome assemblies have good accuracy, short-range contiguity, long-range connectivity, and coverage of the genome. In particular, the base accuracy is high (≥99.95%) and the scaffold sizes (N50 size = 11.5 Mb for human and 7.2 Mb for mouse) approach those obtained with capillary-based sequencing. The combination of improved sequencing technology and improved computational methods should now make it possible to increase dramatically the de novo sequencing of large genomes. The ALLPATHS-LG program is available at http://www.broadinstitute.org/science/programs/genome-biology/crd.
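The abstract reports scaffold N50 sizes (11.5 Mb for human, 7.2 Mb for mouse). As a minimal sketch of how that statistic is commonly computed, assuming the usual definition (the largest length L such that scaffolds of length at least L together contain at least half of the assembled bases), the Python function below may help. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from ALLPATHS-LG or from the paper's data.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Assumed definition: sort lengths from longest to shortest and
    accumulate them; N50 is the length at which the running sum first
    reaches at least half of the total assembled length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50() requires at least one length")
    if any(length <= 0 for length in ordered):
        raise ValueError("lengths must be positive")

    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length
    # Unreachable: the full running sum always satisfies the condition.
    raise AssertionError("N50 not found")


# Hypothetical toy assembly (lengths in arbitrary units, not real data).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)  # total is 32; 8 + 8 = 16 reaches half, so N50 is 8
```

Because N50 depends only on the length distribution, it describes contiguity and says nothing about correctness, which is why the abstract reports base accuracy separately.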