A program to implement a search method for identification of clinical subgroups

30 December 1995

journal article
research article
Published by Wiley in Statistics in Medicine

Vol. 14 (24), 2645-2659
https://doi.org/10.1002/sim.4780142405

Abstract

A procedure for identifying subgroups that are homogeneous with respect to an outcome variable is described. The method, search partition analysis (SPAN), is formulated in terms of a numeric outcome variable y and a set of predictors, explanatory variables or risk factors x = x₁, x₂, …, xₚ. The objective is to split observations into two groups by a binary partition, specified using Boolean expressions of the predictors, x, such that y is as homogeneous as possible in the resultant groups: uniformly ‘low’ in one and uniformly ‘high’ in the other. Subgroups within each of the two groups can be identified from the Boolean expressions. SPAN implements a search for the ‘best’ partition from among a class of regular Boolean expressions. Features of the method are described, including how to measure partition homogeneity, complexity penalizing, search strategies and subgroup definition and representation. The approach is illustrated with analyses of predictors of low birth weight and predictors of impaired glucose tolerance for screening purposes.
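
The general idea of searching Boolean partitions can be illustrated with a minimal Python sketch. It is not the SPAN program itself, and the abstract does not specify the homogeneity measure, the penalty or the search strategy. The sketch assumes binary predictors, uses the proportion of variation in y explained by the split as a stand-in homogeneity measure, applies a simple per-term complexity penalty, and searches exhaustively over pure conjunctions and pure disjunctions only. The function names, the `penalty` parameter and the toy data are all hypothetical.

```python
from itertools import combinations


def within_ss(values):
    """Sum of squared deviations from the mean (0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def partition_score(y, in_group):
    """Proportion of the variation in y explained by a two-way split.

    Assumed stand-in for a homogeneity measure; not taken from the paper.
    """
    total = within_ss(y)
    if total == 0:
        return 0.0
    inside = [v for v, g in zip(y, in_group) if g]
    outside = [v for v, g in zip(y, in_group) if not g]
    return 1.0 - (within_ss(inside) + within_ss(outside)) / total


def best_partition(rows, y, names, max_terms=2, penalty=0.0):
    """Score every AND / OR combination of up to max_terms binary predictors.

    Each extra term costs `penalty` (a hypothetical complexity penalty).
    Returns (penalized score, Boolean expression as a string).
    """
    best = (float("-inf"), None)
    for k in range(1, max_terms + 1):
        for combo in combinations(names, k):
            for op_name, op in (("AND", all), ("OR", any)):
                if k == 1 and op_name == "OR":
                    continue  # a single predictor is the same under AND and OR
                in_group = [op(row[n] for n in combo) for row in rows]
                score = partition_score(y, in_group) - penalty * (k - 1)
                if score > best[0]:
                    best = (score, f" {op_name} ".join(combo))
    return best


# Toy data invented for illustration only (not from the article).
rows = [
    {"smoker": 1, "hypertension": 1},
    {"smoker": 1, "hypertension": 0},
    {"smoker": 0, "hypertension": 1},
    {"smoker": 0, "hypertension": 0},
    {"smoker": 0, "hypertension": 0},
    {"smoker": 1, "hypertension": 0},
    {"smoker": 0, "hypertension": 1},
    {"smoker": 0, "hypertension": 0},
]
y = [2.0, 2.1, 2.2, 3.4, 3.5, 2.0, 2.1, 3.6]  # e.g. birth weight in kg

score, rule = best_partition(rows, y, ["smoker", "hypertension"], penalty=0.05)
print(rule, round(score, 3))
```

On this toy data the disjunction of the two predictors separates the 'low' from the 'high' outcomes better than either predictor alone, even after the penalty for the extra term. A realistic search would need a richer class of expressions and a non-exhaustive strategy, since the number of candidate expressions grows rapidly with the number of predictors.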

