Overfitting revisited: an information-theoretic approach to simplifying discrimination trees
- 1 July 1994
- journal article
- research article
- Published by Taylor & Francis in Journal of Experimental & Theoretical Artificial Intelligence
- Vol. 6 (3), 289-302
- https://doi.org/10.1080/09528139408953790
Abstract
This paper describes a method of simplifying inductively generated discrimination trees using a measure of tree quality based on the principle of information economy, which takes into account both the size of the tree and the size of the outcome data after (notional) encoding by that tree. Results of testing this method on a selection of data sets show that it has some practical advantages over previously used techniques for tree-pruning. Some of the theoretical implications of the present method are also discussed.
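The trade-off named in the abstract (size of the tree plus size of the outcome data once encoded by that tree) can be illustrated with a small sketch. This is not the paper's actual measure, which the abstract does not specify. It is a generic two-part description-length cost under stated assumptions: every node costs a fixed `node_bits` (a hypothetical constant), a leaf's outcomes are charged at their empirical entropy, and the nested-dict tree layout and the function names are invented for illustration.

```python
import math
from collections import Counter


def data_bits(labels):
    """Bits needed to encode the class labels at a leaf, using the leaf's
    own empirical class distribution (count times entropy)."""
    n = len(labels)
    counts = Counter(labels)
    return -sum(c * math.log2(c / n) for c in counts.values())


def total_cost(tree, node_bits=8.0):
    """Two-part cost: bits for the tree structure plus bits for the outcomes.

    A leaf is a list of class labels; an internal node is a dict of the form
    {"children": [...]}. node_bits is an assumed flat cost per node.
    """
    if isinstance(tree, list):
        return node_bits + data_bits(tree)
    return node_bits + sum(total_cost(c, node_bits) for c in tree["children"])


def leaf_labels(tree):
    """Collect every label stored beneath a node."""
    if isinstance(tree, list):
        return list(tree)
    out = []
    for child in tree["children"]:
        out.extend(leaf_labels(child))
    return out


def simplify(tree, node_bits=8.0):
    """Bottom-up pruning: replace a subtree by a single leaf whenever the
    leaf's total cost is no greater than the subtree's total cost."""
    if isinstance(tree, list):
        return tree
    pruned = {"children": [simplify(c, node_bits) for c in tree["children"]]}
    collapsed = leaf_labels(pruned)
    if total_cost(collapsed, node_bits) <= total_cost(pruned, node_bits):
        return collapsed
    return pruned


# A split that separates the classes cleanly pays for itself and is kept.
useful = {"children": [["a"] * 16, ["b"] * 16]}
# A split whose children are as mixed as the parent only adds tree bits.
noisy = {"children": [["a", "b"] * 4, ["a", "b"] * 4]}

kept = simplify(useful)
merged = simplify(noisy)
```

Here `kept` remains an internal node, because collapsing it would cost more in outcome bits than the split costs in tree bits, while `merged` becomes a single leaf holding all 16 labels. Raising `node_bits` makes the pruning more aggressive.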
This publication has 4 references indexed in Scilit:
- The Upstart Algorithm: A Method for Constructing and Training Feedforward Neural Networks. Neural Computation, 1990
- Inferring decision trees using the minimum description length principle. Information and Computation, 1989
- Simplifying decision trees. International Journal of Man-Machine Studies, 1987
- Language acquisition, data compression and generalization. Language & Communication, 1982