Combined Expression Trait Correlations and Expression Quantitative Trait Locus Mapping

Abstract
Coordinated regulation of gene expression levels across a series of experimental conditions provides valuable information about the functions of correlated transcripts. The consideration of gene expression correlation over a time or tissue dimension has proved valuable in predicting gene function. Here, we consider correlations over a genetic dimension. In addition to identifying coregulated genes, the genetic dimension also supplies us with information about the genomic locations of putative regulatory loci. We calculated correlations among approximately 45,000 expression traits derived from 60 individuals in an F2 sample segregating for obesity and diabetes. By combining the correlation results with linkage mapping information, we were able to identify regulatory networks, make functional predictions for uncharacterized genes, and characterize novel members of known pathways. We found evidence of coordinate regulation of 174 G protein–coupled receptor protein signaling pathway expression traits. Of the 174 traits, 50 had their major LOD peak within 10 cM of a locus on Chromosome 2, and 81 others had a secondary peak in this region. We also characterized a Riken cDNA clone that showed strong correlation with stearoyl-CoA desaturase 1 expression. Experimental validation confirmed that this clone is involved in the regulation of lipid metabolism. We conclude that trait correlation combined with linkage mapping can reveal regulatory networks that would otherwise be missed if we studied only mRNA traits with statistically significant linkages in this small cross. The combined analysis is more sensitive than linkage mapping alone.

To annotate gene function and identify potential members of regulatory networks, the authors explored the correlation of expression profiles across a genetic dimension, namely the genotypes segregating in a panel of 60 F2 mice derived from a cross used to study diabetes in obese mice. They first identified 6,016 seed transcripts whose expression is linked to a particular region of the genome. They then searched for transcripts whose expression is highly correlated with the seed transcripts and tested for enrichment of common biological functions among the lists of correlated transcripts. They found and explored the properties of 1,341 sets of transcripts that share a particular “gene ontology” term. Thirty-eight seeds in the G protein–coupled receptor protein signaling pathway were correlated with 174 transcripts, all of which are also annotated as G protein–coupled receptor protein signaling pathway and 131 of which share a regulatory locus on Chromosome 2. The authors note that many of these findings would have been missed by a simple expression quantitative trait locus analysis without the correlation step. The approach was also used to identify a common set of genes involved in lipid metabolism.
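The seed-then-correlate-then-enrich procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation. The Pearson coefficient, the absolute-correlation cutoff of 0.7, the hypergeometric enrichment test, and all function and trait names below are assumptions chosen for illustration; the text does not specify which statistics or thresholds were used.

```python
from math import comb, sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = sqrt(sxx * syy)
    # A trait with no variance cannot be correlated; report 0.0 instead of dividing by zero.
    return sxy / denom if denom > 0 else 0.0


def correlated_traits(seed, traits, threshold=0.7):
    """Names of traits whose |r| with the seed trait meets the cutoff.

    `traits` maps a trait name to its expression values across individuals.
    The cutoff and the use of |r| are hypothetical choices.
    """
    seed_values = traits[seed]
    return [
        name
        for name, values in traits.items()
        if name != seed and abs(pearson(seed_values, values)) >= threshold
    ]


def enrichment_p(hits_with_term, n_hits, n_term, n_total):
    """Upper-tail hypergeometric probability P(X >= hits_with_term).

    n_hits:  size of the correlated list
    n_term:  traits annotated with the ontology term
    n_total: all traits considered
    """
    upper = min(n_hits, n_term)
    favourable = sum(
        comb(n_term, k) * comb(n_total - n_term, n_hits - k)
        for k in range(hits_with_term, upper + 1)
    )
    return favourable / comb(n_total, n_hits)


# Toy data: four traits measured in six individuals (values are invented).
toy_traits = {
    "seed": [1, 2, 3, 4, 5, 6],
    "a": [2, 4, 6, 8, 10, 12],
    "b": [6, 5, 4, 3, 2, 1],
    "c": [1, -1, 1, -1, 1, -1],
}
toy_hits = correlated_traits("seed", toy_traits)
```

In a full analysis, a list such as `toy_hits` would be built for each seed and `enrichment_p` evaluated for each ontology term, followed by some correction for multiple testing, which is omitted here.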