Integrating large-scale functional genomic data to dissect the complexity of yeast regulatory networks

Abstract
Eric Schadt and colleagues report the construction of yeast regulatory networks from multiple sources of large-scale functional genomic data, and show that a network constructed from the integration of genotypic, transcription factor binding site, and protein–protein interaction data is the most predictive.

A key goal of biology is to construct networks that predict complex system behavior. We combine multiple types of molecular data, including genotypic, expression, transcription factor binding site (TFBS), and protein–protein interaction (PPI) data previously generated in a number of yeast experiments, to reconstruct causal gene networks. We compare networks based on different types of data using metrics devised to assess a network's predictive power, and show that a network reconstructed by integrating genotypic, TFBS, and PPI data is the most predictive. We use this network to predict the causal regulators responsible for hot spots of gene expression activity in a segregating yeast population, and show that it can elucidate the mechanisms by which those regulators give rise to larger-scale changes in gene expression activity. We then prospectively validate these predictions, providing direct experimental evidence that predictive networks can be constructed by integrating multiple, appropriate data types.