Nonlinear Dynamics of Nonsynonymous (dN) and Synonymous (dS) Substitution Rates Affects Inference of Selection

Open Access

1 January 2009

Research article published by Oxford University Press (OUP) in Genome Biology and Evolution

Vol. 1, pp. 308-319
https://doi.org/10.1093/gbe/evp030

Abstract

Selection modulates gene sequence evolution in different ways by constraining potential changes of amino acid sequences (purifying selection) or by favoring new and adaptive genetic variants (positive selection). The number of nonsynonymous differences in a pair of protein-coding sequences can be used to quantify the mode and strength of selection. To control for regional variation in substitution rates, the proportionate number of nonsynonymous differences (d_N) is divided by the proportionate number of synonymous differences (d_S). The resulting ratio (d_N/d_S) is a widely used indicator for functional divergence to identify particular genes that underwent positive selection. With the ever-growing amount of genome data, summary statistics like mean d_N/d_S allow gathering information on the mode of evolution for entire species. Both applications hinge on the assumption that d_S and mean d_S (∼branch length) are neutral and adequately control for variation in substitution rates across genes and across organisms, respectively. We here explore the validity of this assumption using empirical data based on whole-genome protein sequence alignments between human and 15 other vertebrate species and several simulation approaches. We find that d_N/d_S does not appropriately reflect the action of selection as it is strongly influenced by its denominator (d_S). Particularly for closely related taxa, such as human and chimpanzee, d_N/d_S can be misleading and is not an unadulterated indicator of selection. Instead, we suggest that inconsistencies in the behavior of d_N/d_S are to be expected and highlight the idea that this behavior may be inherent to taking the ratio of two randomly distributed variables that are nonlinearly correlated. New null hypotheses will be needed to adequately handle these nonlinear dynamics.
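
The dependence of d_N/d_S on its denominator can be illustrated with a small simulation. The sketch below is not the authors' simulation approach; it is a minimal toy model under stated assumptions: each site differs independently with a fixed probability, the nonsynonymous difference probability is a constant fraction (`omega`, a hypothetical parameter name) of the synonymous one, and the site counts are arbitrary. It also ignores the nonlinear correlation between d_N and d_S that the abstract highlights. Even so, it suggests how the ratio becomes noisy, and sometimes undefined, when divergence is low and few synonymous differences are observed.

```python
import random
import statistics


def dn_ds(n_diffs, n_sites, s_diffs, s_sites):
    """Return (dN, dS, dN/dS) from raw difference counts.

    dN and dS are proportions of differing sites. The ratio is None
    when no synonymous differences are observed (dS == 0).
    """
    d_n = n_diffs / n_sites
    d_s = s_diffs / s_sites
    ratio = d_n / d_s if d_s > 0 else None
    return d_n, d_s, ratio


def simulate_ratios(p_syn, omega, n_sites=700, s_sites=300, genes=1000, seed=1):
    """Simulate dN/dS for many genes that share the same true ratio.

    Assumptions (illustrative only): every site differs independently,
    synonymous sites with probability p_syn and nonsynonymous sites
    with probability p_syn * omega. Returns the list of defined ratios
    and the number of genes whose ratio was undefined (dS == 0).
    """
    rng = random.Random(seed)
    ratios = []
    undefined = 0
    for _ in range(genes):
        n_diffs = sum(rng.random() < p_syn * omega for _ in range(n_sites))
        s_diffs = sum(rng.random() < p_syn for _ in range(s_sites))
        _, _, ratio = dn_ds(n_diffs, n_sites, s_diffs, s_sites)
        if ratio is None:
            undefined += 1
        else:
            ratios.append(ratio)
    return ratios, undefined


if __name__ == "__main__":
    # Same true ratio (omega = 0.2) at low and high divergence.
    for label, p_syn in (("low divergence", 0.01), ("high divergence", 0.3)):
        ratios, undefined = simulate_ratios(p_syn=p_syn, omega=0.2)
        print(label,
              "| mean:", round(statistics.mean(ratios), 3),
              "| stdev:", round(statistics.stdev(ratios), 3),
              "| undefined:", undefined)
```

In this toy setting the true ratio is identical in both runs, so any difference in the spread of the estimates comes from sampling noise in the counts, chiefly in the denominator, rather than from selection.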

