A Genome-Wide Survey of Human Pseudogenes
- 1 December 2003
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 13 (12), 2559-2567
- https://doi.org/10.1101/gr.1455503
Abstract
We screened all intergenic regions in the human genome to identify pseudogenes with a combination of homology searches and a functionality test using the ratio of silent to replacement nucleotide substitutions (KA/KS). We identified 19,724 regions of which 95% ± 3% are estimated to evolve neutrally and thus are likely to encode pseudogenes. Half of these have no detectable truncation in their pseudocoding regions and therefore are not identifiable by methods that require the presence of truncations to prove nonfunctionality. A comparative analysis with the mouse genome showed that 70% of these pseudogenes have a retrotranspositional origin (processed), and the rest arose by segmental duplication (nonprocessed). Although the spread of both types of pseudogenes correlates with chromosome size, nonprocessed pseudogenes appear to be enriched in regions with high gene density. It is likely that the human pseudogenes identified here represent only a small fraction of the total, which probably exceeds the number of genes.
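As a rough illustration of the figures quoted in the abstract, the sketch below converts "95% ± 3% of 19,724 regions" into approximate region counts and shows how a KA/KS ratio is formed from the two substitution rates. The function names and the example rate values are assumptions made here for illustration; this is not the authors' pipeline, and it does not reproduce their statistical test for neutrality.

```python
def neutral_region_estimate(total_regions, percent, margin):
    """Return (low, central, high) region counts implied by percent ± margin.

    Percentages are passed as whole numbers (e.g. 95 and 3) to keep the
    arithmetic easy to check by hand.
    """
    low = round(total_regions * (percent - margin) / 100)
    central = round(total_regions * percent / 100)
    high = round(total_regions * (percent + margin) / 100)
    return low, central, high


def ka_ks_ratio(ka, ks):
    """Ratio of the replacement rate (KA) to the silent rate (KS).

    A ratio close to 1 is what neutral evolution would be expected to
    produce; a ratio well below 1 suggests purifying selection. Any cutoff
    for "close to 1" would be an assumption and is deliberately left out.
    """
    if ks <= 0:
        raise ValueError("KS must be positive to form a ratio")
    return ka / ks


# Figures taken from the abstract: 19,724 regions, 95% ± 3% neutral.
low, central, high = neutral_region_estimate(19724, 95, 3)
print(low, central, high)

# Hypothetical rates, chosen only to show the call.
print(ka_ks_ratio(0.2, 0.4))
```

Under these assumptions the abstract's estimate corresponds to roughly 18,100 to 19,300 neutrally evolving regions out of 19,724.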