Efficient progressive sampling

Abstract
Having access to massive amounts of data does not necessarily imply that induction algorithms must use them all. Samples often provide the same accuracy with far less computational cost. However, the correct sample size is rarely obvious. We analyze methods for progressive sampling, that is, starting with small samples and progressively increasing them as long as model accuracy improves. We show that a simple, geometric sampling schedule is efficient in an asymptotic sense. We then explore the notion...
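
As an illustration of the idea described in the abstract, the following is a minimal Python sketch of progressive sampling with a geometric schedule, in which sample sizes grow as n0, a*n0, a^2*n0, and so on. The names geometric_schedule, progressive_sample, accuracy_at, and min_gain are hypothetical and do not come from the paper, and the stopping rule (stop once the accuracy gain falls below a fixed threshold) is a simplifying assumption rather than the convergence test the authors analyze.

```python
from typing import Callable, Iterable, Iterator, List, Tuple


def geometric_schedule(n0: int, a: float, n_max: int) -> Iterator[int]:
    """Yield sample sizes n0, a*n0, a^2*n0, ... capped at n_max."""
    if n0 < 1 or n_max < 1 or a <= 1.0:
        raise ValueError("need n0 >= 1, n_max >= 1 and a > 1")
    i = 0
    while True:
        n = min(int(n0 * a ** i), n_max)
        yield n
        if n == n_max:
            return  # the full data set has been reached
        i += 1


def progressive_sample(
    accuracy_at: Callable[[int], float],
    schedule: Iterable[int],
    min_gain: float = 0.001,
) -> List[Tuple[int, float]]:
    """Evaluate growing samples until accuracy stops improving.

    accuracy_at(n) is a hypothetical callback that trains a model on a
    sample of size n and returns its estimated accuracy.
    Assumption: "stops improving" means a gain below min_gain.
    """
    history: List[Tuple[int, float]] = []
    best = float("-inf")
    for n in schedule:
        acc = accuracy_at(n)
        history.append((n, acc))
        if acc - best < min_gain:
            break  # no meaningful improvement over the previous sample
        best = acc
    return history


if __name__ == "__main__":
    # Toy learning curve that plateaus at 0.9 (illustrative only).
    def toy_curve(n: int) -> float:
        return min(0.9, n / 500)

    for size, acc in progressive_sample(toy_curve, geometric_schedule(100, 2, 10_000)):
        print(size, acc)
```

Because each sample is a constant factor larger than the last, the total work over all steps is dominated by the final, largest sample, which is the intuition behind calling a geometric schedule efficient in an asymptotic sense.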
