A program to implement a search method for identification of clinical subgroups
- 30 December 1995
- journal article
- research article
- Published by Wiley in Statistics in Medicine
- Vol. 14 (24), 2645-2659
- https://doi.org/10.1002/sim.4780142405
Abstract
A procedure for identifying subgroups that are homogeneous with respect to an outcome variable is described. The method, search partition analysis (SPAN), is formulated in terms of a numeric outcome variable y and a set of predictors, explanatory variables or risk factors x = x1, x2, …, xp. The objective is to split observations into two groups by a binary partition, specified using Boolean expressions of the predictors, x, such that y is as homogeneous as possible in the resultant groups; uniformly ‘low’ in one and uniformly ‘high’ in the other. Subgroups within each of the two groups can be identified from the Boolean expressions. SPAN implements a search for the ‘best’ partition from among a class of regular Boolean expressions. Features of the method are described, including how to measure partition homogeneity, complexity penalizing, search strategies and subgroup definition and representation. The approach is illustrated with analyses of predictors of low birth weight and predictors of impaired glucose tolerance for screening purposes.
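The idea of searching Boolean partitions for homogeneity in y can be illustrated with a small sketch. This is not the SPAN program: the abstract does not give its homogeneity measure, penalty, or search strategy, so the sketch assumes binary (0/1) predictors, restricts the search to simple conjunctions rather than general regular Boolean expressions, scores a partition by the reduction in within-group sum of squares, and subtracts a hypothetical per-term `penalty`. The names `within_ss` and `best_conjunction` are invented for illustration.

```python
from itertools import combinations


def within_ss(values):
    """Sum of squared deviations from the mean (0.0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_conjunction(rows, y, max_terms=2, penalty=0.0):
    """Exhaustively search conjunctions of binary predictors.

    rows: list of dicts mapping predictor name -> 0 or 1
    y:    list of numeric outcomes, one per row
    Returns (score, terms), where terms is the tuple of predictors whose
    conjunction gives the largest penalized reduction in within-group
    sum of squares. Assumed scoring rule, not the one used by SPAN.
    """
    names = sorted(rows[0])
    total = within_ss(y)
    best = (float("-inf"), ())
    for k in range(1, max_terms + 1):
        for terms in combinations(names, k):
            member = [all(r[t] for t in terms) for r in rows]
            inside = [yi for m, yi in zip(member, y) if m]
            outside = [yi for m, yi in zip(member, y) if not m]
            if not inside or not outside:
                continue  # not a genuine two-group partition
            score = total - within_ss(inside) - within_ss(outside) - penalty * k
            if score > best[0]:
                best = (score, terms)
    return best


# Toy data: y is 'high' only when both a and b are present.
rows = [
    {"a": 1, "b": 1, "c": 0},
    {"a": 1, "b": 1, "c": 1},
    {"a": 1, "b": 0, "c": 1},
    {"a": 0, "b": 1, "c": 0},
    {"a": 0, "b": 0, "c": 1},
    {"a": 0, "b": 0, "c": 0},
]
y = [10, 10, 1, 1, 1, 1]

score, terms = best_conjunction(rows, y)
print(" AND ".join(terms), score)
```

On this toy data the conjunction of a and b separates the two high values from the four low ones exactly. Raising `penalty` makes the search prefer shorter expressions, which is one simple way to trade fit against complexity.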