Adaptive classifiers for dicentric chromosomes

Abstract
Classification of dicentric chromosomes in a practical automatic screening system comprises three stages. The first stage generates plausible centromere candidates for each chromosome in an automatically segmented metaphase and uses contextual knowledge to build distributions of "probably true" and "probably false" centromeres, thereby adapting to the conditions within that particular metaphase. The second-stage classifier uses these distributions to reclassify the candidates as centromeres or non-centromeres. Likely dicentrics are then found by counting the centromeres assigned to each chromosome. A third classifier attempts to reject false positives among the likely dicentric chromosomes by comparing the feature values of a chromosome's proposed centromeres and rejecting chromosomes for which these values do not satisfy certain similarity criteria. The second-stage classifier may be a simple box classifier or may use one of a variety of parametric Bayesian methods. The performance of these alternatives was tested both on reference data sets comprising about 600 metaphases and on larger data sets when embedded in a practical, fully automatic dicentric pre-screening system. When the operating parameters were set so that both classifiers found a similar number of true positives, the Bayesian classifier produced about half as many false positive errors as the box classifier, with the final false positive rate being about one candidate dicentric chromosome in every four cells.
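The second and third stages can be illustrated with a minimal sketch. It is not the implementation evaluated here. The two-component feature vectors, the function names, the equal priors, the tolerance values, and the choice of independent per-feature Gaussians as the parametric Bayesian model are all assumptions made for illustration. The box classifier is assumed to accept a candidate when every feature lies inside the range spanned by the "probably true" samples of the same metaphase.

```python
import math
from statistics import mean, pstdev

# Hypothetical per-metaphase samples produced by the first stage.
# Each tuple is an assumed two-feature vector for one centromere candidate.
PROBABLY_TRUE = [(0.80, 0.70), (0.90, 0.75), (0.85, 0.80)]
PROBABLY_FALSE = [(0.20, 0.30), (0.30, 0.25), (0.25, 0.35)]


def box_classify(x, true_samples, margin=0.0):
    """Accept x if every feature lies within the range of the 'probably true' samples."""
    for j, value in enumerate(x):
        column = [s[j] for s in true_samples]
        if not (min(column) - margin <= value <= max(column) + margin):
            return False
    return True


def fit_gaussian(samples):
    """Per-feature mean and standard deviation (independent features, an assumption)."""
    # A small floor on the standard deviation avoids division by zero.
    return [(mean(col), max(pstdev(col), 1e-6)) for col in zip(*samples)]


def log_likelihood(x, params):
    """Gaussian log-likelihood, dropping the constant shared by both classes."""
    return sum(-math.log(sd) - 0.5 * ((v - mu) / sd) ** 2
               for v, (mu, sd) in zip(x, params))


def bayes_classify(x, true_samples, false_samples, prior_true=0.5):
    """Accept x if the 'centromere' class has the larger posterior."""
    score_true = log_likelihood(x, fit_gaussian(true_samples)) + math.log(prior_true)
    score_false = log_likelihood(x, fit_gaussian(false_samples)) + math.log(1.0 - prior_true)
    return score_true > score_false


def accepted_centromeres(candidates, true_samples, false_samples):
    """Second stage: keep the candidates reclassified as centromeres."""
    return [c for c in candidates if bayes_classify(c, true_samples, false_samples)]


def similar_enough(c1, c2, tolerances):
    """Third stage (assumed form): feature values must agree within per-feature tolerances."""
    return all(abs(a - b) <= t for a, b, t in zip(c1, c2, tolerances))


def is_dicentric(candidates, true_samples, false_samples, tolerances):
    """A chromosome is reported only if it has exactly two mutually similar centromeres."""
    accepted = accepted_centromeres(candidates, true_samples, false_samples)
    return len(accepted) == 2 and similar_enough(accepted[0], accepted[1], tolerances)


# Candidates for one hypothetical chromosome.
chromosome = [(0.82, 0.74), (0.88, 0.78), (0.22, 0.31)]
print(is_dicentric(chromosome, PROBABLY_TRUE, PROBABLY_FALSE, tolerances=(0.1, 0.1)))
```

Because both classifiers are fitted to samples drawn from the metaphase being analysed, their decision boundaries move with the conditions in that metaphase, which is the sense in which the classification is adaptive.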