The model organism as a system: integrating 'omics' data sets

Abstract
Many 'omics' data sets are becoming available for various model organisms, and together they describe many aspects of the cell at a given time and/or under a given condition. They can be broadly classified as components data, which describe the specific molecular contents of the cell; interactions data, which detail the connectivity between cellular components; or functional-states data, which reveal the overall behaviour, or phenotype, of the cell or system in response to genetic and/or environmental perturbations. Although each of these genome-scale data types can be powerful on its own, researchers are gaining valuable additional insights into cellular phenomena by integrating 'omics' data sets.
The computational tools that have been developed for integrating 'omics' data generally tackle three specific tasks: first, identifying the network scaffold by delineating the connections that exist between cellular components; second, decomposing the network scaffold into its constituent parts in an attempt to understand the overall network structure; and third, developing cellular or system models to simulate and predict the network behaviour that gives rise to particular cellular phenotypes.
In addition to developing methods, many researchers are using 'omics' integration to drive studies aimed at delineating systems-wide behaviour. For example, many efforts have been devoted to using genome-scale data integration to map completely the cellular pathways that are responsible for the observed cellular responses to environmental perturbations or developmental events. In some cases, these studies have also led to the development of biomarkers, that is, patterns of cellular-component expression that are associated with medical disorders such as various cancers. Researchers are also using 'omics' integration to address fundamental evolutionary questions that were previously beyond the scope and scale of standard techniques. Specifically, 'omics' data-integration techniques have been used to examine cellular differences that are associated with speciation, and to study selective pressures that are likely to have arisen from cellular-network structure.
So far, the integration of 'omics' data has primarily affected basic research. Increasingly, however, this strategy is taking on significant roles in clinically relevant areas, as shown by its stimulation of the fields of toxicogenomics and nutrigenomics, which apply genome-scale technologies and integrative analyses to problems in toxicology and nutrition, respectively. Although many challenges related to data quality and accessibility remain, researchers continue to work towards the ultimate goals of applying these strategies to drug development and personalized medicine.