Las Vegas Algorithms for Gene Recognition: Suboptimal and Error-Tolerant Spliced Alignment
- 1 January 1997
- journal article
- Published by Mary Ann Liebert Inc in Journal of Computational Biology
- Vol. 4 (3), 297-309
- https://doi.org/10.1089/cmb.1997.4.297
Abstract
Recently, Gelfand, Mironov and Pevzner (1996) proposed a spliced alignment approach to gene recognition that provides 99% accurate recognition of human genes if a related mammalian protein is available. However, even 99% accurate gene predictions are insufficient for automated sequence annotation in large-scale sequencing projects and therefore have to be complemented by experimental gene verification. One hundred percent accurate gene predictions would lead to a substantial reduction of experimental work on gene identification. Our goal is to develop an algorithm that either predicts an exon assembly with accuracy sufficient for sequence annotation or warns a biologist that the accuracy of a prediction is insufficient and further experimental work is required. We study suboptimal and error-tolerant spliced alignment problems as the first steps towards such an algorithm, and report an algorithm which provides 100% accurate recognition of human genes in 37% of cases (if a related mammalian protein is available). In 52% of genes, the algorithm predicts at least one exon with 100% accuracy.
This publication has 21 references indexed in Scilit:
- Gene recognition via spliced sequence alignment. Proceedings of the National Academy of Sciences, 1996
- Evaluation of Gene Structure Prediction Programs. Genomics, 1996
- The gene identification problem: An overview for developers. Computers & Chemistry, 1996
- The turning point in genome research. Trends in Biochemical Sciences, 1995
- Prediction of Function in DNA Sequence Analysis. Journal of Computational Biology, 1995
- Recent Developments in Linear-Space Alignment Methods: A Survey. Journal of Computational Biology, 1994
- Identification of protein coding regions by database similarity search. Nature Genetics, 1993
- Prediction of the exon-intron structure by a dynamic programming approach. Biosystems, 1993
- Amino acid substitution matrices from an information theoretic perspective. Journal of Molecular Biology, 1991
- Computer prediction of the exon-intron structure of mammalian pre-mRNAs. Nucleic Acids Research, 1990