MiDReG: A method of mining developmentally regulated genes using Boolean implications

Abstract
We present a method, termed mining developmentally regulated genes (MiDReG), to predict genes whose expression is either activated or repressed as precursor cells differentiate. MiDReG does not require gene expression data from intermediate stages of development. Instead, it is based on the gene expression patterns at the initial and terminal stages of the differentiation pathway, coupled with “if-then” rules (Boolean implications) mined from large-scale microarray databases. MiDReG uses two gene expression-based seed conditions that mark the initial and the terminal stages of a given differentiation pathway and combines the statistically inferred Boolean implications from these seed conditions to identify the relevant genes. The method was validated by applying it to B-cell development. The algorithm predicted 62 genes that are expressed after the KIT+ progenitor cell stage and remain expressed through CD19+ and AICDA+ germinal center B cells. qRT-PCR of 14 of these genes on sorted B-cell progenitors confirmed that the expression of 10 genes is indeed stably established during B-cell differentiation. A review of the published literature on knockout mice revealed that, of the predicted genes, 63.4% are associated with defects in B-cell differentiation and function when knocked out, 22% have a role in B cells according to other experiments, and the remaining 14.6% are uncharacterized. Therefore, our method identified novel gene candidates for future examination of their role in B-cell development. These data demonstrate the power of MiDReG in predicting functionally important intermediate genes in a given developmental pathway that is defined by a mutually exclusive gene expression pattern.
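To make the idea of combining Boolean implications from two seed conditions concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that expression has already been discretized into per-sample high/low calls, and it replaces the statistical test for an implication with a simple violation-rate cutoff (the hypothetical `max_error` parameter). The function names, the toy data, and the gene names `GENE_X`, `GENE_Y`, and `GENE_Z` are invented for illustration. The specific rule pair used here ("start seed high implies gene low" and "end seed high implies gene high") is an assumed reading of "expressed after the KIT+ stage and remaining expressed through CD19+ cells".

```python
from typing import Dict, List

# Per-sample boolean calls for each gene: True = "high", False = "low".
Calls = Dict[str, List[bool]]


def implies(calls: Calls, a: str, a_high: bool, b: str, b_high: bool,
            max_error: float = 0.1) -> bool:
    """Approximate check of the rule "a is a_high => b is b_high".

    Simplifying assumption: the rule holds when the fraction of samples
    that satisfy the antecedent but violate the consequent is at most
    max_error. This is a stand-in for a proper statistical test.
    """
    antecedent = [i for i, call in enumerate(calls[a]) if call == a_high]
    if not antecedent:
        return False  # no supporting samples, so no evidence for the rule
    violations = sum(1 for i in antecedent if calls[b][i] != b_high)
    return violations / len(antecedent) <= max_error


def candidate_genes(calls: Calls, start_seed: str, end_seed: str,
                    max_error: float = 0.1) -> List[str]:
    """Genes that are low when start_seed is high and high when end_seed is high."""
    found = []
    for gene in calls:
        if gene in (start_seed, end_seed):
            continue
        low_at_start = implies(calls, start_seed, True, gene, False, max_error)
        high_at_end = implies(calls, end_seed, True, gene, True, max_error)
        if low_at_start and high_at_end:
            found.append(gene)
    return sorted(found)


# Invented toy data: 8 samples, with KIT and CD19 mutually exclusive.
toy_calls: Calls = {
    "KIT":    [True,  True,  True,  False, False, False, False, False],
    "CD19":   [False, False, False, False, False, True,  True,  True],
    "GENE_X": [False, False, False, False, True,  True,  True,  True],
    "GENE_Y": [True,  True,  False, False, True,  False, True,  False],
    "GENE_Z": [False, False, False, False, False, False, False, False],
}

candidates = candidate_genes(toy_calls, "KIT", "CD19")
print(candidates)  # prints ['GENE_X']
```

In this toy example only `GENE_X` satisfies both rules: it is low in every KIT-high sample and high in every CD19-high sample, including a sample where neither seed is high, which is how an intermediate stage can be captured without being sampled explicitly.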