Data curation + process curation=data integration + science
- 11 July 2008
- journal article
- Published by Oxford University Press (OUP) in Briefings in Bioinformatics
- Vol. 9 (6), 506-517
- https://doi.org/10.1093/bib/bbn034
Abstract
In bioinformatics, we are familiar with the idea of curated data as a prerequisite for data integration. We neglect, often to our cost, the curation and cataloguing of the processes that we use to integrate and analyse our data. Programmatic access to services, for data and processes, means that compositions of services can be made that represent the in silico experiments or processes that bioinformaticians perform. Data integration through workflows depends on being able to know what services exist and where to find those services. The large number of services and the operations they perform, their arbitrary naming and lack of documentation, however, mean that they can be difficult to use. The workflows themselves are composite processes that could be pooled and reused but only if they too can be found and understood. Thus appropriate curation, including semantic mark-up, would enable processes to be found, maintained and consequently used more easily. This broader view on semantic annotation is vital for full data integration that is necessary for the modern scientific analyses in biology. This article will brief the community on the current state of the art and the current challenges for process curation, both within and without the Life Sciences.