Dipeptide frequencies in proteins and the CpG deficiency in vertebrate DNA
- 1 December 1972
- journal article
- Published by Springer Nature in Journal of Molecular Evolution
- Vol. 1 (4), 315-325
- https://doi.org/10.1007/bf01653960
Abstract
Analysis of vertebrate protein sequences totalling 4040 residues shows that amino acids with a high proportion of codons ending in C occur with significantly reduced frequency before amino acids whose codons start with G. This effect is not shown by “control” bacterial protein sequences. The consequent implication of shortage of XXC.GXX codon pairs in vertebrate messenger RNA is discussed in relation to the extreme rarity of the base doublet CpG in vertebrate DNA.
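The kind of comparison the abstract describes can be illustrated with a minimal sketch. It is not the paper's own statistical procedure: the 0.5 threshold for a "high proportion" of C-ending codons, the names `HIGH_C_END`, `G_START` and `observed_expected`, and the simple independence-based expected count are all assumptions made for illustration. Only the standard genetic code table is taken as given.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_fraction(predicate):
    """Per amino acid, the fraction of its codons that satisfy `predicate`."""
    total, hits = {}, {}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":  # skip stop codons
            continue
        total[aa] = total.get(aa, 0) + 1
        hits[aa] = hits.get(aa, 0) + (1 if predicate(codon) else 0)
    return {aa: hits[aa] / total[aa] for aa in total}


FRAC_END_C = codon_fraction(lambda c: c.endswith("C"))
FRAC_START_G = codon_fraction(lambda c: c.startswith("G"))

# Assumption: "high proportion" is read as >= 0.5 (hypothetical threshold).
HIGH_C_END = {aa for aa, f in FRAC_END_C.items() if f >= 0.5}
# Amino acids all of whose codons start with G.
G_START = {aa for aa, f in FRAC_START_G.items() if f == 1.0}


def observed_expected(sequences, first_set, second_set):
    """Count dipeptides (a, b) with a in first_set and b in second_set.

    The expected count assumes neighbouring residues are independent:
    (pairs with a in first_set) * (pairs with b in second_set) / (all pairs).
    """
    n_pairs = n_first = n_second = observed = 0
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            in_first, in_second = a in first_set, b in second_set
            n_pairs += 1
            n_first += in_first
            n_second += in_second
            observed += in_first and in_second
    expected = n_first * n_second / n_pairs if n_pairs else 0.0
    return observed, expected


# Made-up toy sequence, not data from the paper.
obs, exp = observed_expected(["HAHA"], HIGH_C_END, G_START)
```

An observed count well below the expected count across a large set of vertebrate sequences, and not across bacterial sequences, would correspond to the effect reported in the abstract. A real analysis would also need a significance test, which this sketch leaves out.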