Transcript-based redefinition of grouped oligonucleotide probe sets using AceView: High-resolution annotation for microarrays
Open Access
- 29 March 2007
- journal article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 8 (1), 108
- https://doi.org/10.1186/1471-2105-8-108
Abstract
Background: Extracting biological information from high-density Affymetrix arrays is a multi-step process that begins with the accurate annotation of microarray probes. Shortfalls in the original Affymetrix probe annotation have been described; however, few studies have provided rigorous solutions for routine data analysis.
Results: Using AceView, a comprehensive human transcript database, we have reannotated the probes by matching them to RNA transcripts instead of genes. Based on this transcript-level annotation, a new probe set definition was created in which every probe in a probe set maps to a common set of AceView gene transcripts. In addition, using artificial data sets we identified that a minimal probe set size of 4 is necessary for reliable statistical summarization. We further demonstrate that applying the new probe set definition can detect specific transcript variants contributing to differential expression and it also improves cross-platform concordance.
Conclusion: We conclude that our transcript-level reannotation and redefinition of probe sets complement the original Affymetrix design. Redefinitions introduce probe sets whose sizes may not support reliable statistical summarization; therefore, we advocate using our transcript-level mapping redefinition in a secondary analysis step rather than as a replacement. Knowing which specific transcripts are differentially expressed is important to properly design probe/primer pairs for validation purposes. For convenience, we have created custom chip-description-files (CDFs) and annotation files for our new probe set definitions that are compatible with Bioconductor, Affymetrix Expression Console or third party software.
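The grouping idea described in the Results can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the probe-to-transcript matches are already available as a dictionary, it interprets "a common set of transcripts" as an identical set, and the function name, probe IDs and transcript labels are hypothetical. The size threshold of 4 is the one reported in the abstract.

```python
from collections import defaultdict

# Minimal probe set size reported in the abstract as necessary for
# reliable statistical summarization.
MIN_PROBE_SET_SIZE = 4


def redefine_probe_sets(probe_to_transcripts, min_size=MIN_PROBE_SET_SIZE):
    """Group probes that match exactly the same set of transcripts.

    probe_to_transcripts: dict mapping a probe ID to an iterable of
    transcript IDs (hypothetical input format, assumed for this sketch).
    Returns two dicts keyed by the shared transcript set: probe sets that
    reach min_size, and undersized probe sets that may not support
    reliable summarization.
    """
    groups = defaultdict(list)
    for probe_id, transcripts in probe_to_transcripts.items():
        key = frozenset(transcripts)
        if not key:
            continue  # probe matched no transcript; leave it ungrouped
        groups[key].append(probe_id)

    reliable, undersized = {}, {}
    for key, probes in groups.items():
        target = reliable if len(probes) >= min_size else undersized
        target[key] = sorted(probes)
    return reliable, undersized


if __name__ == "__main__":
    # Hypothetical probe and transcript identifiers, for illustration only.
    mapping = {
        "p1": ["geneA.a", "geneA.b"],
        "p2": ["geneA.a", "geneA.b"],
        "p3": ["geneA.b", "geneA.a"],
        "p4": ["geneA.a", "geneA.b"],
        "p5": ["geneA.a"],
        "p6": ["geneA.a"],
        "p7": [],
    }
    reliable, undersized = redefine_probe_sets(mapping)
    for transcripts, probes in reliable.items():
        print("reliable:", sorted(transcripts), probes)
    for transcripts, probes in undersized.items():
        print("undersized:", sorted(transcripts), probes)
```

Keeping the undersized groups in a separate dictionary, instead of discarding them, mirrors the abstract's recommendation to treat the redefinition as a secondary analysis step in which small probe sets are interpreted with caution.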