Estimates of the Effect of Natural Selection on Protein-Coding Content

Research article, Open Access
8 October 2009

Published by Oxford University Press (OUP) in Molecular Biology and Evolution, Vol. 27 (3), 726–734
https://doi.org/10.1093/molbev/msp232

Abstract

Analysis of natural selection is key to understanding many core biological processes, including the emergence of competition, cooperation, and complexity, and has important applications in the targeted development of vaccines. Selection is hard to observe directly but can be inferred from molecular sequence variation. For protein-coding nucleotide sequences, the ratio of nonsynonymous to synonymous substitutions (ω) distinguishes neutrally evolving sequences (ω = 1) from those subjected to purifying (ω < 1) or positive Darwinian (ω > 1) selection. We show that current models used to estimate ω are substantially biased by naturally occurring sequence compositions. We present a novel model that weights substitutions by conditional nucleotide frequencies and which escapes these artifacts. Applying it to the genomes of pathogens causing malaria, leprosy, tuberculosis, and Lyme disease gave significant discrepancies in estimates with ∼10–30% of genes affected. Our work has substantial implications for how vaccine targets are chosen and for studying the molecular basis of adaptive evolution.
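The interpretation of ω described in the abstract can be illustrated with a minimal sketch. This is not the model presented in the paper, which weights substitutions by conditional nucleotide frequencies within a likelihood framework. It only shows how a point estimate of ω maps onto the three selection regimes. The function names `omega` and `classify` and the `tol` parameter are hypothetical, introduced here for illustration, and the rate values in the usage lines are made up.

```python
def omega(dn: float, ds: float) -> float:
    """Return the ratio of nonsynonymous (dN) to synonymous (dS) rates."""
    if ds <= 0:
        # The ratio is undefined when no synonymous change is observed.
        raise ValueError("dS must be positive to form the ratio")
    return dn / ds


def classify(omega_value: float, tol: float = 0.0) -> str:
    """Map an omega estimate to a selection regime.

    `tol` is an assumed tolerance band around 1; a real analysis would
    use a statistical test rather than a fixed threshold.
    """
    if omega_value < 1 - tol:
        return "purifying"
    if omega_value > 1 + tol:
        return "positive"
    return "neutral"


# Illustrative values only, not taken from the paper.
print(classify(omega(0.1, 0.5)))  # omega = 0.2
print(classify(omega(0.6, 0.3)))  # omega = 2.0
```

In practice ω is estimated from aligned coding sequences by maximum likelihood, and a departure from ω = 1 is assessed with a significance test rather than read directly off the point estimate.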
