Abstract
In log-linear capture-recapture approaches to estimating population size, the method of model selection may have a major effect on the estimate. The estimate may also be very sensitive to null or very sparse cells, even when multiple sources are used. The authors evaluated 1) various approaches to the issue of model uncertainty and 2) a small-sample correction for three or more sources recently proposed by Hook and Regal. Using the known totals in 20 different populations studied by five separate groups of investigators as the standard, the authors compared the estimates derived from 1) three information criteria: Akaike's Information Criterion (AIC) and two alternative formulations of the Bayesian Information Criterion (BIC), one proposed by Draper (“two pi”) and one by Schwarz (“not two pi”); 2) two related methods of weighting the estimates associated with different models; 3) the independent model; and 4) the saturated model. For each method, the authors also compared the estimates derived with and without the proposed small-sample correction. At least in these data sets, the use of AIC appeared on balance to be preferable. The BIC formulation suggested by Draper appeared slightly preferable to that suggested by Schwarz. Adjustment for model uncertainty appeared to improve results slightly. The proposed small-sample correction appeared to diminish relative log bias, but only when sparse cells were present; otherwise, its use tended to increase relative log bias. Use of the saturated model (with or without the small-sample correction) appeared to be optimal if the associated interval is not uselessly large and if one can plausibly exclude an all-source interaction. All other approaches led to an estimate that was too low by about one standard deviation. Am J Epidemiol 1997; 145: 1138–44.
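As a rough illustration of the quantities compared above, the following Python sketch covers the three-source case. It rests on several assumptions not stated in the abstract: that the information criteria take the forms AIC = G² − 2·df, BIC (Schwarz) = G² − ln(N_obs)·df, and BIC (Draper) = G² − ln(N_obs/2π)·df; that model weights are proportional to exp(−IC/2); and that the saturated model for three sources excludes only the three-way interaction, which gives the standard closed-form estimate of the unobserved cell. All function names and the cell-key convention are hypothetical, and the small-sample correction is not implemented here.

```python
import math


def saturated_unobserved(cells):
    """Estimate the count missed by all three sources under the saturated model.

    `cells` maps capture patterns such as "110" (seen by sources 1 and 2 only)
    to observed counts. Assumes no three-way (all-source) interaction.
    """
    numerator = cells["111"] * cells["100"] * cells["010"] * cells["001"]
    denominator = cells["110"] * cells["101"] * cells["011"]
    if denominator == 0:
        # A null pairwise-overlap cell makes the estimate undefined; this is
        # the sparse-cell sensitivity the abstract refers to.
        raise ValueError("a zero overlap cell makes the estimate undefined")
    return numerator / denominator


def information_criteria(g2, df, n_obs):
    """Return the three criteria for one model (assumed forms, see lead-in)."""
    return {
        "aic": g2 - 2 * df,
        "bic_schwarz": g2 - math.log(n_obs) * df,
        "bic_draper": g2 - math.log(n_obs / (2 * math.pi)) * df,
    }


def weighted_estimate(estimates, criteria):
    """Average model-specific estimates with weights proportional to exp(-IC/2)."""
    best = min(criteria)  # subtract the minimum for numerical stability
    raw = [math.exp(-(ic - best) / 2) for ic in criteria]
    total = sum(raw)
    return sum(w * est for w, est in zip(raw, estimates)) / total
```

The total population estimate under the saturated model would then be the observed count plus the value returned by `saturated_unobserved`. In `weighted_estimate`, models with equal criterion values receive equal weight, so the result reduces to a simple average in that case.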