Integrating text mining into the MGI biocuration workflow
Open Access
- 1 January 2009
- journal article
- research article
- Published by Oxford University Press (OUP) in Database: The Journal of Biological Databases and Curation
- Vol. 2009, bap019
- https://doi.org/10.1093/database/bap019
Abstract
A major challenge for functional and comparative genomics resource development is the extraction of data from the biomedical literature. Although text mining for biological data is an active research field, few applications have been integrated into production literature curation systems such as those of the model organism databases (MODs). Not only are most available biological natural language (bioNLP) and information retrieval and extraction solutions difficult to adapt to existing MOD curation workflows, but many also have high error rates or are unable to process documents available in those formats preferred by scientific journals.