Optimization of the BLASTN substitution matrix for prediction of non-specific DNA microarray hybridization
Open Access
- 4 December 2009
- journal article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 38 (4), e27
- https://doi.org/10.1093/nar/gkp1116
Abstract
DNA microarray measurements are susceptible to error caused by non-specific hybridization between a probe and a target (cross-hybridization), or between two targets (bulk-hybridization). Search algorithms such as BLASTN can quickly identify potentially hybridizing sequences. We set out to improve BLASTN accuracy by modifying the substitution matrix and gap penalties. We generated gene expression microarray data for samples in which 1% or 10% of the target mass was an exogenous spike of known sequence. We found that the 10% spike induced 2-fold intensity changes in 3% of the probes, two-thirds of which were decreases in intensity likely caused by bulk-hybridization. These changes were correlated with similarity between the spike and probe sequences. Interestingly, even very weak similarities tended to induce a change in probe intensity with the 10% spike. Using these data, we optimized the BLASTN substitution matrix to more accurately identify probes susceptible to non-specific hybridization with the spike. Relative to the default substitution matrix, the optimized matrix features a decreased score for A–T base pairs relative to G–C base pairs, resulting in a 5–15% increase in area under the ROC curve for identifying affected probes. This optimized matrix may be useful in the design of microarray probes, and in other BLASTN-based searches for hybridization partners.
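The evaluation idea in the abstract can be illustrated with a minimal Python sketch: score aligned positions so that A–T matches count less than G–C matches, then measure how well the scores separate affected from unaffected probes by area under the ROC curve. Everything here is an assumption made for illustration: the score values in `UNIFORM` and `GC_WEIGHTED`, the function names, and the simplification to ungapped, equal-length alignments. None of it reproduces the published matrix, the gap penalties, or BLASTN itself.

```python
# Hypothetical score sets for illustration only; NOT the values from the paper.
UNIFORM = {"AT_match": 1, "GC_match": 1, "mismatch": -3}
GC_WEIGHTED = {"AT_match": 1, "GC_match": 2, "mismatch": -3}


def ungapped_score(a, b, params):
    """Sum per-position scores for two equal-length, already aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    total = 0
    for x, y in zip(a.upper(), b.upper()):
        if x != y:
            total += params["mismatch"]
        elif x in "AT":
            total += params["AT_match"]   # matched A or T position
        else:
            total += params["GC_match"]   # matched G or C position
    return total


def roc_auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (affected, unaffected) pairs in which the affected probe has the
    higher score; ties count as half a win."""
    pos = [s for s, flag in zip(scores, labels) if flag]
    neg = [s for s, flag in zip(scores, labels) if not flag]
    if not pos or not neg:
        raise ValueError("need at least one affected and one unaffected probe")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy inputs (made up): aligned probe/spike segments and whether the
    # probe intensity changed when the spike was added.
    pairs = [("GCGCGC", "GCGCGC"), ("ATATAT", "ATATAT"), ("ACGTAC", "ACGAAC")]
    affected = [True, False, False]
    for name, params in (("uniform", UNIFORM), ("GC-weighted", GC_WEIGHTED)):
        scores = [ungapped_score(a, b, params) for a, b in pairs]
        print(name, scores, roc_auc(scores, affected))
```

Comparing the AUC obtained under two score sets on the same labelled probes mirrors the comparison described in the abstract, although the actual study optimized the matrix against measured intensity changes rather than toy labels.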