Differential Distribution of Simple Sequence Repeats in Eukaryotic Genome Sequences
Open Access
- 1 July 2001
- journal article
- research article
- Published by Oxford University Press (OUP) in Molecular Biology and Evolution
- Vol. 18 (7), 1161-1167
- https://doi.org/10.1093/oxfordjournals.molbev.a003903
Abstract
Complete chromosome/genome sequences available from humans, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, and Saccharomyces cerevisiae were analyzed for the occurrence of mono-, di-, tri-, and tetranucleotide repeats. In all of the genomes studied, dinucleotide repeat stretches tended to be longer than other repeats. Additionally, tetranucleotide repeats in humans and trinucleotide repeats in Drosophila also seemed to be longer. Although the trends for different repeats are similar between different chromosomes within a genome, the density of repeats may vary between different chromosomes of the same species. The abundance or rarity of various di- and trinucleotide repeats in different genomes cannot be explained by nucleotide composition of a sequence or potential of repeated motifs to form alternative DNA structures. This suggests that in addition to nucleotide composition of repeat motifs, characteristic DNA replication/repair/recombination machinery might play an important role in the genesis of repeats. Moreover, analysis of complete genome coding DNA sequences of Drosophila, C. elegans, and yeast indicated that expansions of codon repeats corresponding to small hydrophilic amino acids are tolerated more, while strong selection pressures probably eliminate codon repeats encoding hydrophobic and basic amino acids. The locations and sequences of all of the repeat loci detected in genome sequences and coding DNA sequences are available at http://www.ncl-india.org/ssr and could be useful for further studies.
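To make the kind of scan described in the abstract concrete, the following is a minimal sketch of how perfect mono- to tetranucleotide repeats might be located in a DNA string. It is not the authors' program: the function name `find_ssrs` and the minimum repeat-unit thresholds in `MIN_UNITS` are hypothetical choices made for illustration, and the abstract does not state which cutoffs were actually used.

```python
import re

# Hypothetical minimum number of repeat units for each motif length (1-4 bp).
# These thresholds are illustrative assumptions, not values from the paper.
MIN_UNITS = {1: 12, 2: 6, 3: 4, 4: 3}


def find_ssrs(seq, min_units=MIN_UNITS):
    """Return sorted (start, motif, units) tuples for perfect repeats in seq.

    start is the 0-based position of the repeat, motif is the repeated unit,
    and units is the number of complete copies of the motif.
    """
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_units.items()):
        # ([ACGT]{k}) captures one unit; \1{n-1,} demands at least n-1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are copies of a shorter unit (e.g. "AA", "CACA"),
            # so a poly-A run is reported once, as a mononucleotide repeat.
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)
```

Calling `find_ssrs` on each chromosome sequence and dividing the number of hits by the sequence length would give a rough repeat density that could be compared between chromosomes. Imperfect or interrupted repeats are ignored in this sketch.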