Estimation of the incidence of stroke using a capture-recapture model including covariates

Abstract
Background: Capture-recapture is often used to assess completeness of a register. However, the usual two-source model relies on assumptions of independence of sources and equality of capture probability which are rarely satisfied in epidemiology. An alternative is to include covariates in capture-recapture models.

Methods: We use capture-recapture models including covariates to estimate incidence of stroke in South London. We estimate ascertainment-adjusted age-standardized incidence rates, and calculate confidence intervals for incidence which allow for the uncertainty in estimation of the total number of cases.

Results: The crude capture-recapture model (including no covariates) underestimated the number of non-fatal strokes. Demographic and stroke severity variables were associated with the probability of capture. Including covariates led to more plausible results for fatal and non-fatal strokes, and suggested that the stroke register was 88% complete. Adjusting for under-ascertainment increased the estimated incidence from 1.31 (95% CI: 1.21–1.42) to 1.49 (95% CI: 0.38–2.60) per 1000 people.

Conclusions: Incidence and age-standardized incidence can be calculated using data from an incomplete register. However, sparse strata can lead to wide confidence intervals for adjusted rates. Cost-effectiveness of routine registers might be increased by using the combination of sources and covariates which most accurately estimates the total number of cases, rather than by aiming for 100% completeness.
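
For orientation, the crude two-source model mentioned above can be sketched as follows. This is a minimal illustration, not the covariate model used in the paper: it applies Chapman's nearly unbiased form of the two-source estimator, which assumes independent sources and equal capture probabilities. The function names and the counts in the usage example are hypothetical and are not taken from the stroke register.

```python
import math


def two_source_estimate(n1, n2, m):
    """Chapman's form of the two-source capture-recapture estimator.

    n1, n2: cases found by source 1 and source 2; m: cases found by both.
    Returns (estimated total number of cases, estimated variance).
    Assumes independent sources and equal capture probabilities.
    """
    if m < 0 or m > min(n1, n2):
        raise ValueError("m must lie between 0 and min(n1, n2)")
    n_hat = (n1 + 1) * (n2 + 1) / (m + 1) - 1
    var = ((n1 + 1) * (n2 + 1) * (n1 - m) * (n2 - m)) / ((m + 1) ** 2 * (m + 2))
    return n_hat, var


def completeness(n1, n2, m):
    """Share of the estimated total that was observed by at least one source."""
    observed = n1 + n2 - m
    n_hat, _ = two_source_estimate(n1, n2, m)
    return observed / n_hat


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    n1, n2, m = 80, 60, 48
    n_hat, var = two_source_estimate(n1, n2, m)
    half_width = 1.96 * math.sqrt(var)  # normal-approximation 95% interval
    print(f"estimated total: {n_hat:.1f} +/- {half_width:.1f}")
    print(f"completeness: {completeness(n1, n2, m):.1%}")
```

A covariate-adjusted analysis would in effect relax the equal-capture assumption by letting capture probabilities vary with patient characteristics; the sketch does not attempt this.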