Inherited disorder phenotypes: controlled annotation and statistical analysis for knowledge mining from gene lists
Open Access
- 1 December 2005
- journal article
- research article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 6 (S4), S18
- https://doi.org/10.1186/1471-2105-6-s4-s18
Abstract
Background: Analysis of inherited diseases and their associated phenotypes is of great importance for gaining knowledge of the underlying genetic interactions and could ultimately give clinically useful insights into disease processes, including complex diseases influenced by multiple genetic loci. Nevertheless, few computational contributions have been proposed for this purpose to date, mainly because of the lack of controlled clinical information that is easily accessible and structured for computational genome-wise analyses. To enable phenotype analyses of genes related to inherited disorders, we implemented new modules within GFINDer (http://www.bioinformatics.polimi.it/GFINDer/), a Web system we previously developed that dynamically aggregates functional annotations of user-uploaded gene lists and supports their statistical analysis and mining.
Results: New GFINDer modules allow annotating large numbers of user-classified biomolecular sequence identifiers with morbidity and clinical information, classifying them according to genetic disease phenotypes and their locations of occurrence, and statistically analyzing the obtained classifications. To achieve this, we exploited, normalized, and structured the information present in textual form in the Clinical Synopsis sections of the Online Mendelian Inheritance in Man (OMIM) databank. This information describes numerous signs and symptoms accompanying many genetic diseases and is divided into phenotype location categories, either by organ system or by type of finding.
Conclusion: By supporting phenotype analyses of inherited diseases and biomolecular functional evaluations, GFINDer facilitates a genomic approach to understanding the fundamental biological processes and complex cellular mechanisms underlying patho-physiological phenotypes.
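The abstract does not state which statistical tests the modules apply to the phenotype classifications. As a minimal sketch under that caveat, the example below assumes a one-sided hypergeometric test for over-representation of a phenotype location category within a user-defined gene class. The gene identifiers, the category assignments, and the function names are hypothetical and serve only to illustrate the idea; they are not taken from GFINDer or OMIM.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the whole list, K: genes annotated with the category,
    n: genes in the selected class, k: annotated genes in that class.
    """
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def category_enrichment(selected, annotations):
    """Compute an over-representation p-value per phenotype category.

    `annotations` maps each gene in the full list to its set of categories;
    `selected` is the subset of genes forming the class under study.
    """
    N = len(annotations)
    n = len(selected)
    categories = set().union(*annotations.values())
    p_values = {}
    for category in categories:
        annotated = {g for g, cats in annotations.items() if category in cats}
        k = len(annotated & selected)
        p_values[category] = hypergeom_upper_tail(k, len(annotated), n, N)
    return p_values


# Hypothetical toy data: gene identifier -> phenotype location categories.
annotations = {
    "G1": {"Neurologic", "Skeletal"},
    "G2": {"Neurologic"},
    "G3": {"Neurologic", "Cardiovascular"},
    "G4": {"Skeletal"},
    "G5": {"Cardiovascular"},
    "G6": {"Skeletal"},
    "G7": {"Cardiovascular"},
    "G8": {"Skeletal", "Cardiovascular"},
}
selected = {"G1", "G2", "G3"}  # hypothetical user-defined gene class

results = category_enrichment(selected, annotations)
for category, p in sorted(results.items(), key=lambda item: item[1]):
    print(f"{category}: p = {p:.4f}")
```

In this toy list all three selected genes carry the "Neurologic" category, so that category receives the smallest p-value. A real analysis over many categories would also need a correction for multiple testing, which the sketch omits.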