Analysis of distribution of bases in the coding sequences by a diagrammatic technique

Open Access

25 November 1991

journal article
research article
Published by Oxford University Press (OUP) in Nucleic Acids Research

Vol. 19 (22), 6313-6317
https://doi.org/10.1093/nar/19.22.6313

Abstract

The frequencies of occurrence of the four bases in the first, second and third codon positions, and in the total coding sequences, have been calculated from the codon usage table published in 1990 by Ikemura et al. The distributions of these frequencies are further analysed in detail using a graphic technique that we presented recently. Formulas are given that express the frequencies of the four bases in the first and second codon positions in terms of the frequencies of amino acids. The graphic analysis shows that, for 90 species, the purine bases are dominant in the first codon position, and in most cases G is the most dominant base. In the second codon position A is the most dominant base, while G is the least dominant. In the third codon position the G + C content varies from 0.1 to 0.9, while the A + C content stays approximately equal to 1/2 and the G content approximately equal to that of C. If the frequencies of bases A, C, G and U in the total coding sequences are denoted by a, c, g and u, respectively, the inequality a² + c² + g² + u² < 1/3 is found to hold for each of the 90 species, including human and E. coli.
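
The calculation described in the abstract can be illustrated with a minimal sketch, assuming the codon usage table is available as a mapping from codon to count. The function names and the toy counts below are hypothetical and are not taken from the published table or from the authors' method. Since a + c + g + u = 1, the sum of squares cannot fall below 1/4 (equal frequencies), so the inequality bounds how far the overall composition departs from uniformity.

```python
from collections import Counter

BASES = "ACGU"


def position_frequencies(codon_counts):
    """Return base frequencies at codon positions 1-3 and over all positions.

    codon_counts: dict mapping a three-letter RNA codon to its count
    (assumed input layout, not the format of the published table).
    """
    per_pos = [Counter() for _ in range(3)]
    for codon, n in codon_counts.items():
        for i, base in enumerate(codon):
            per_pos[i][base] += n
    total = sum(codon_counts.values())
    freqs = [{b: pos[b] / total for b in BASES} for pos in per_pos]
    # Each position contributes equally to the total coding sequence.
    overall = {b: sum(f[b] for f in freqs) / 3 for b in BASES}
    return freqs, overall


def sum_of_squares(freq):
    """Compute a^2 + c^2 + g^2 + u^2 for a base-frequency mapping."""
    return sum(v * v for v in freq.values())


# Hypothetical toy counts for illustration only, NOT real codon usage data.
toy_counts = {"GCU": 30, "AAG": 25, "GAA": 20, "CUG": 15, "UUC": 10}

freqs, overall = position_frequencies(toy_counts)
s = sum_of_squares(overall)
print("position 1:", freqs[0])
print("overall:", overall)
print("sum of squares < 1/3:", s < 1 / 3)
```

Applied to a real codon usage table, the same two functions would give the per-position frequencies and the quantity tested by the inequality for each species.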

This publication has 8 references indexed in Scilit:

Diagrammatic representation of the distribution of DNA bases and its applications
International Journal of Biological Macromolecules, 1991
Codon usage in plant genes
Nucleic Acids Research, 1989
Codon usage tabulated from the GenBank Genetic Sequence Data
Nucleic Acids Research, 1988
Translation framing code and frame-monitoring mechanism as suggested by the analysis of mRNA and 16 S rRNA nucleotide sequences
Journal of Molecular Biology, 1987
Codon usage tabulated from the GenBank genetic sequence data
Nucleic Acids Research, 1986
Codon usage and tRNA content in unicellular and multicellular organisms.
Molecular Biology and Evolution, 1985
Codon catalog usage is a genome strategy modulated for gene expressivity
Nucleic Acids Research, 1981
Codon catalog usage and the genome hypothesis
Nucleic Acids Research, 1980

Cited by 146 articles