Leveraging the Mouse Genome for Gene Prediction in Human: From Whole-Genome Shotgun Reads to a Global Synteny Map
Open Access
- 1 January 2003
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 13 (1), 46-54
- https://doi.org/10.1101/gr.830003
Abstract
The availability of draft sequences for both the mouse and human genomes makes it possible, for the first time, to annotate whole mammalian genomes using comparative methods. TWINSCAN is a gene-prediction system that combines the methods of single-genome predictors like GENSCAN with information derived from genome comparison, thereby improving accuracy. Because TWINSCAN uses genomic sequence only, it is less biased toward highly and/or ubiquitously expressed genes than GENEWISE, GENOMESCAN, and other methods based on evidence derived from transcripts. We show that TWINSCAN improves gene prediction in human using intermediate products from various stages of the sequencing and analysis of the mouse genome, from low-redundancy, whole-genome shotgun reads to the draft assembly and the synteny map. TWINSCAN improves on the prior state of the art even when alignments from only 1X coverage of the mouse genome are available. Gene prediction accuracy improves steadily from 1X through 3X, more slowly from 3X to 4X, and relatively little thereafter. The assembly and the synteny map greatly speed the computations, however. Our human annotation using the mouse assembly is conservative, predicting only 25,622 genes, and appears to be one of the best de novo annotations of the human genome to date.