Why high-error-rate random mutagenesis libraries are enriched in functional and improved proteins

Abstract
Recently, several groups have used error-prone polymerase chain reactions to construct mutant libraries containing up to 27 nucleotide mutations per gene on average, and reported a striking observation: although retention of protein function initially declines exponentially with the number of mutations, as has previously been observed, orders of magnitude more proteins remain viable at the highest mutation rates than this trend would predict. Mutant proteins with improved or novel activity were isolated disproportionately from these heavily mutated libraries, leading to the suggestion that distant regions of sequence space are enriched in useful cooperative mutations and that optimal mutagenesis should target these regions. If true, these claims have profound implications for laboratory evolution and for evolutionary theory. Here, we demonstrate that properties of the polymerase chain reaction can explain these results and, consequently, that average protein viability does in fact decrease exponentially with mutational distance at all error rates. We show that high-error-rate mutagenesis may be useful in certain cases, though for reasons very different from those originally proposed, and that optimal mutation rates are inherently protocol-dependent. Our results allow optimal mutation rates to be found given the mutagenesis conditions and a protein of known mutational robustness.
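
To illustrate how a library-wide viable fraction can exceed the exponential trend even when each sequence's viability falls exponentially with its own mutation count, the following minimal Python sketch compares two libraries with the same mean number of mutations. Everything in it is an assumption made for illustration and is not taken from the abstract: the parameter nu (the probability that a single mutation preserves function), the model in which each final molecule has been copied a binomially distributed number of times, and the default values of cycles and p_copied. The abstract does not say which properties of the polymerase chain reaction are responsible, so this shows one plausible mechanism, not the authors' model.

```python
import math


def poisson_library_fraction(mean_mut, nu):
    """Viable fraction if mutation counts M are Poisson with the given mean.

    Assumes a sequence with m mutations is functional with probability nu**m,
    so the library average is E[nu**M] = exp(-mean_mut * (1 - nu)).
    """
    return math.exp(-mean_mut * (1.0 - nu))


def pcr_like_library_fraction(mean_mut, nu, cycles=30, p_copied=0.5):
    """Viable fraction under a hypothetical PCR-like copying model.

    Assumption: each final molecule was copied K ~ Binomial(cycles, p_copied)
    times, and every copy adds Poisson(x) mutations, with x chosen so that
    the library-wide mean number of mutations equals mean_mut.
    """
    mean_copies = cycles * p_copied
    x = mean_mut / mean_copies                   # mutations per copying event
    per_copy = math.exp(-x * (1.0 - nu))         # E[nu**(mutations in one copy)]
    # Probability generating function of a binomial: E[s**K] = (1 - p + p*s)**n
    return (1.0 - p_copied + p_copied * per_copy) ** cycles


if __name__ == "__main__":
    nu = 0.5  # hypothetical per-mutation probability of retaining function
    for mean_mut in (1, 5, 10, 20, 27):
        narrow = poisson_library_fraction(mean_mut, nu)
        broad = pcr_like_library_fraction(mean_mut, nu)
        print(f"mean={mean_mut:>2}  poisson={narrow:.3e}  "
              f"pcr_like={broad:.3e}  ratio={broad / narrow:.1f}")
```

Because nu**m is convex in m, any broadening of the mutation-count distribution at a fixed mean raises the library average. The excess grows with the mean mutation rate, while per-sequence viability remains exactly exponential in mutational distance.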