Efficient progressive sampling
- 1 August 1999
- proceedings article
- Published by Association for Computing Machinery (ACM)
Abstract
Having access to massive amounts of data does not necessarily imply that induction algorithms must use them all. Samples often provide the same accuracy with far less computational cost. However, the correct sample size is rarely obvious. We analyze methods for progressive sampling: starting with small samples and progressively increasing them as long as model accuracy improves. We show that a simple, geometric sampling schedule is efficient in an asymptotic sense. We then explore the notion...
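The progressive sampling loop described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the parameters `n0`, `a`, and `min_gain`, the fixed-threshold stopping rule, and the synthetic learning curve `toy_accuracy` are all assumptions introduced here, since the abstract specifies only that sample sizes grow geometrically while accuracy keeps improving.

```python
import math


def geometric_schedule(n0, a, n_max):
    """Return sample sizes n0, a*n0, a^2*n0, ... with the last size capped at n_max.

    n0 (initial size), a (growth factor) and n_max (full data size) are
    illustrative parameter names, not taken from the abstract.
    """
    if n0 < 1 or a <= 1 or n_max < n0:
        raise ValueError("need n0 >= 1, a > 1 and n_max >= n0")
    sizes = []
    n = n0
    while n < n_max:
        sizes.append(n)
        n = int(math.ceil(n * a))
    sizes.append(n_max)  # the final step always uses all available data
    return sizes


def progressive_sample(evaluate, sizes, min_gain=0.001):
    """Evaluate a model on growing samples until accuracy stops improving.

    `evaluate(n)` is a hypothetical callback that trains on n examples and
    returns an accuracy. The stopping rule (gain below `min_gain`) is a
    simple assumption made for this sketch.
    """
    history = []
    best = None
    for n in sizes:
        acc = evaluate(n)
        history.append((n, acc))
        if best is not None and acc - best < min_gain:
            break  # accuracy has plateaued; larger samples are not tried
        best = acc if best is None else max(best, acc)
    return history


def toy_accuracy(n):
    """Synthetic learning curve that rises and flattens out below 0.9."""
    return 0.9 - 0.5 / math.sqrt(n)


sizes = geometric_schedule(100, 2, 100_000)
history = progressive_sample(toy_accuracy, sizes, min_gain=0.005)
for n, acc in history:
    print(f"n={n:>6}  accuracy={acc:.4f}")
```

Because the sizes form a geometric series, the total number of examples processed across all steps is at most a/(a-1) times the size of the last sample tried, so the earlier, smaller runs add only a constant-factor overhead.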
This publication has 5 references indexed in Scilit:
- A Survey of Methods for Scaling Up Inductive Algorithms. Data Mining and Knowledge Discovery, 1999
- Rigorous learning curve bounds from statistical mechanics. Machine Learning, 1997
- The statistical mechanics of learning a rule. Reviews of Modern Physics, 1993
- Depth-first iterative-deepening. Artificial Intelligence, 1985
- A theory of the learnable. Communications of the ACM, 1984