The effects of model selection on confidence intervals for the size of a closed population

1 May 1991

journal article
research article
Published by Wiley in Statistics in Medicine

Vol. 10 (5), 717-721
https://doi.org/10.1002/sim.4780100506

Abstract

Estimates of the rates of some genetic and congenital disorders in the literature are often based on log-linear methods that model possible interactions among sources. Often the analyst chooses the simplest model consistent with the data to estimate the size of a closed population, and then calculates confidence intervals on the assumption that this simple model is correct. However, even when the data appear to fit such a model excellently, the resulting confidence intervals may well be misleading, in that they can fail to provide adequate coverage probability. We illustrate this with a simulation for a hypothetical population based on data reported in the literature from three sources. The simulated nominal 95 per cent confidence intervals contained the modelled population size only 30 per cent of the time. Use of the simpler model's interval would be justified only if external considerations justify assuming that plausible interactions among sources are absent.
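The kind of simulation described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the population size, capture parameters, and interaction strength are hypothetical, the function names (`cell_probs`, `fit_independence`, `coverage`) are invented for the example, and a Wald-type interval based on a common asymptotic variance approximation is swapped in for brevity in place of whatever interval construction the article uses. The sketch fits the three-source independence model by Poisson log-linear regression, keeps only the replicates in which that model passes a deviance goodness-of-fit test (mimicking an analyst who selects the simplest adequate model), and reports how often the nominal 95 per cent interval contains the true population size.

```python
import numpy as np

# All 8 capture histories (a, b, c); row 0 is (0, 0, 0), the unobserved cell.
H = np.array([[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)])


def cell_probs(main, ab):
    """Cell probabilities from a log-linear model with main effects and one
    A-by-B interaction (hypothetical parameter values, chosen by the caller)."""
    eta = H @ main + ab * H[:, 0] * H[:, 1]
    w = np.exp(eta)
    return w / w.sum()


def fit_independence(counts):
    """Fit the independence (main-effects) Poisson log-linear model to the
    7 observed cells by Newton's method. Returns the estimated missing-cell
    count, the variance of the intercept, and the deviance."""
    X = np.column_stack([np.ones(7), H[1:]])
    y = counts.astype(float)
    # Least-squares start on the log scale; 0.5 guards against log(0).
    beta = np.linalg.lstsq(X, np.log(y + 0.5), rcond=None)[0]
    for _ in range(50):
        mu = np.exp(X @ beta)
        info = X.T @ (mu[:, None] * X)          # Fisher information
        step = np.linalg.solve(info, X.T @ (y - mu))
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    mu = np.exp(X @ beta)
    cov = np.linalg.inv(X.T @ (mu[:, None] * X))
    ratio = np.where(y > 0, y / mu, 1.0)        # a zero count contributes 0
    deviance = 2.0 * np.sum(y * np.log(ratio) - (y - mu))
    return np.exp(beta[0]), cov[0, 0], deviance


def coverage(N, probs, reps, rng, crit=7.815):
    """Coverage of the nominal 95% Wald interval for N among replicates in
    which the independence model is NOT rejected. crit is the 0.95 quantile
    of chi-square with 3 df (7 cells minus 4 parameters)."""
    hits = kept = 0
    for _ in range(reps):
        counts = rng.multinomial(N, probs)[1:]  # drop the unobserved cell
        m0, v, dev = fit_independence(counts)
        if dev > crit:
            continue                            # analyst would pick a richer model
        kept += 1
        n_hat = counts.sum() + m0
        # Assumed asymptotic approximation: var(N_hat) ~ m0 + m0^2 * var(intercept)
        se = np.sqrt(m0 + m0 ** 2 * v)
        hits += int(abs(n_hat - N) <= 1.96 * se)
    return hits / kept, kept / reps


if __name__ == "__main__":
    rng = np.random.default_rng(1991)
    main = np.array([-1.0, -1.0, -1.0])         # hypothetical main effects
    for label, ab in [("independent sources", 0.0), ("A-B interaction", 0.8)]:
        cov, frac = coverage(500, cell_probs(main, ab), 400, rng)
        print(f"{label}: coverage={cov:.2f} among {frac:.0%} of replicates kept")
```

Under these assumptions, the interval should behave roughly as advertised when the sources really are independent, and should cover noticeably less often when an interaction is present but the independence model still passes the fit test. The exact figures depend on the hypothetical parameters and are not comparable to the 30 per cent reported in the abstract.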
