Rapid screening for improved solubility of small human proteins produced as fusion proteins in Escherichia coli

Abstract
A prerequisite for structural genomics and related projects is a standardized process for gene overexpression and protein solubility screening that can be automated for higher throughput. We have tested a methodology to rapidly subclone a large number of human genes and screen these for expression and protein solubility in Escherichia coli. The methodology, which can be partly automated, was used to compare the effect of six different N-terminal fusion proteins and an N-terminal 6*His tag. As a realistic test set we selected 32 potentially interesting human proteins with unknown structures and sizes suitable for NMR studies. The genes were transferred from cDNA to expression vectors using subcloning by recombination. The subcloning yield was 100% for the 27 (of 32) genes for which a PCR fragment of correct size could be obtained. Of these 27 genes, 26 (96%) could be overexpressed at detectable levels and 23 (85%) were detected in the soluble fraction with at least one fusion tag. We find large differences in the effects of fusion protein or tag on expression and solubility. In short, four of seven fusions perform very well, and much better than the 6*His tag, but individual differences motivate the inclusion of several fusions in expression and solubility screening. We also conclude that our methodology and expression vectors can be used for screening of genes for structural studies, and that it should be possible to obtain a large fraction of all NMR-sized and nonmembrane human proteins as soluble fusion proteins in E. coli.
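
The success rates quoted in the abstract can be checked with a short calculation. The sketch below is illustrative only: the counts are those stated above, while the variable names, the `percent` helper, and the choice to round to whole percentages are assumptions made for this example. The 84% figure for PCR success is derived here from 27 of 32 and is not quoted in the abstract.

```python
# Counts as reported in the abstract.
selected = 32    # human genes chosen for the test set
pcr_ok = 27      # genes that gave a PCR fragment of the correct size
subcloned = 27   # genes successfully subcloned (100% of those with a fragment)
expressed = 26   # genes overexpressed at detectable levels
soluble = 23     # genes detected in the soluble fraction with at least one fusion tag


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer (hypothetical helper)."""
    return round(100 * part / whole)


# Each stage is expressed relative to the denominator used in the abstract.
funnel = {
    "PCR fragment obtained (of 32 selected)": percent(pcr_ok, selected),
    "Subcloned (of 27 with PCR fragment)": percent(subcloned, pcr_ok),
    "Overexpressed (of 27 subcloned)": percent(expressed, subcloned),
    "Soluble (of 27 subcloned)": percent(soluble, subcloned),
}

for stage, value in funnel.items():
    print(f"{stage}: {value}%")
```

Note that the 96% and 85% figures are relative to the 27 subcloned genes, not to the 32 genes originally selected.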