Sequence errors described in GenBank: a means to determine the accuracy of DNA sequence interpretation

1 January 1989

journal article
research article
Published by Oxford University Press (OUP) in Nucleic Acids Research

Vol. 17 (10), 3951-3957
https://doi.org/10.1093/nar/17.10.3951

Abstract

The accuracy of nucleic acid sequence data interpretation was determined by assessing and quantifying the discrepancies reported in the GenBank database. This permitted the calculation of an Error Rate (ER) for nucleic acid sequence determination. If one assumes that most entries (TB, Total Bases) were independently verified, or that those without reported discrepancies were correct, the ER is 0.368 errors per 1000 bases. However, if one assumes that only those sequences with reported discrepancies (TBIQ, Total Bases from entries In Question) were verified and are thus correct, the ER is 2.887 errors per 1000 bases. This establishes the first set of limit boundaries of the ER for sequence interpretation and sequence errors within the GenBank database and provides the foundation for future assessments and the monitoring of sequence data accumulation. In addition, the ER measure provides a basis to evaluate the efficiency and merit of present and future automated nucleic acid sequencing technologies, which will have a direct impact upon the final outcome of the "Human Genome Initiative".
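
The two limits can be related with a little arithmetic. The sketch below is illustrative only and is not taken from the article: it assumes the ER is simply the number of reported errors divided by the number of bases, scaled to 1000 bases, and that both limits share the same error count in the numerator. The function name `error_rate_per_kb` and the variable names are hypothetical. Under those assumptions, the ratio of the two reported rates gives the share of all bases that belong to entries in question.

```python
def error_rate_per_kb(errors: int, bases: int) -> float:
    """Return errors per 1000 bases (assumed definition of the ER)."""
    if bases <= 0:
        raise ValueError("bases must be positive")
    return 1000.0 * errors / bases


# Limits reported in the abstract, in errors per 1000 bases.
ER_TB = 0.368    # denominator: Total Bases (TB)
ER_TBIQ = 2.887  # denominator: Total Bases from entries In Question (TBIQ)

# Assumption: both rates use the same error count E, so
#   ER_TB = 1000 * E / TB  and  ER_TBIQ = 1000 * E / TBIQ,
# which gives TBIQ / TB = ER_TB / ER_TBIQ.
implied_fraction_in_question = ER_TB / ER_TBIQ

if __name__ == "__main__":
    print(f"Implied TBIQ / TB: {implied_fraction_in_question:.4f}")
```

If the same-numerator assumption holds, roughly 13% of the bases surveyed sat in entries with at least one reported discrepancy. The underlying error and base counts are not given in the abstract, so none are reproduced here.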
