Hybridization interactions between probesets in short oligo microarrays lead to spurious correlations
Open Access
- 2 June 2006
- journal article
- research article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 7 (1), 1-14
- https://doi.org/10.1186/1471-2105-7-276
Abstract
Microarrays measure the binding of nucleotide sequences to a set of sequence-specific probes. This information is combined with annotation specifying the relationship between probes and targets, and is used to make inferences about transcript expression and, ultimately, gene expression. In some situations a probe is capable of hybridizing to more than one transcript; in others, multiple probes can target a single sequence. These 'multiply targeted' probes can result in non-independence between measured expression levels. An analysis of these relationships for Affymetrix arrays considered both the extent and the influence of exact matches between probe and transcript sequences. For the popular HGU133A array, approximately half of the probesets were found to interact in this way. Both real and simulated expression datasets were used to examine how these effects influenced the expression signal. Multiple targeting was found not only to increase signal strength for the affected probesets but, as its major effect, to significantly increase their correlation, even when only a single probe from a probeset was involved. By building a network of probe-probeset-transcript relationships, it is possible to identify families of interacting probesets. More than 10% of the families contain members annotated to different genes or even different Unigene clusters. Within a family, a mixture of genuine biological and artefactual correlations can occur. Multiple targeting is not only prevalent, but also significant. The ability of probesets to hybridize to more than one gene product can lead to false positives when analysing gene expression. Comprehensive annotation describing multiple targeting is required when interpreting array data.
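The idea of grouping probesets into families through a probe-probeset-transcript network can be illustrated with a minimal sketch. It assumes that exact probe-to-transcript matches have already been computed, and it treats a family as a connected component of the graph linking probesets to the transcripts their probes match. The function name `probeset_families`, the input dictionaries, the identifiers in the example, and the use of a union-find structure are all illustrative assumptions, not the procedure used in the article.

```python
from collections import defaultdict


def probeset_families(probe_hits, probe_to_probeset):
    """Group probesets that are linked through shared transcripts.

    probe_hits        -- dict: probe id -> set of transcript ids matched exactly
    probe_to_probeset -- dict: probe id -> probeset id
    (Both inputs are hypothetical structures chosen for this sketch.)
    """
    parent = {}

    def find(node):
        # Register unseen nodes, then walk to the root with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Link every probeset to each transcript hit by any of its probes.
    for probe, transcripts in probe_hits.items():
        probeset_node = ("probeset", probe_to_probeset[probe])
        find(probeset_node)  # keep probesets with no hits as singletons
        for transcript in transcripts:
            union(probeset_node, ("transcript", transcript))

    # Collect the probesets in each connected component.
    families = defaultdict(set)
    for kind, name in list(parent):
        if kind == "probeset":
            families[find((kind, name))].add(name)
    return sorted((sorted(members) for members in families.values()),
                  key=lambda members: members[0])


# Toy data: A_at and B_at both have a probe matching transcript T2.
probe_to_probeset = {"p1": "A_at", "p2": "A_at", "p3": "B_at", "p4": "C_at"}
probe_hits = {"p1": {"T1"}, "p2": {"T1", "T2"}, "p3": {"T2"}, "p4": {"T3"}}

families = probeset_families(probe_hits, probe_to_probeset)
print(families)  # [['A_at', 'B_at'], ['C_at']]
```

In this toy case a single shared probe match is enough to place `A_at` and `B_at` in the same family, which mirrors the abstract's observation that even one multiply targeted probe can couple two probesets. Families with more than one member are the ones whose correlations would warrant closer inspection.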