Modeling Reporting Delays and Reporting Corrections in Cancer Registry Data
- 1 March 2005
- journal article
- Published by Taylor & Francis in Journal of the American Statistical Association
- Vol. 100 (469), 61-70
- https://doi.org/10.1198/016214504000001899
Abstract
The Surveillance, Epidemiology, and End Results (SEER) program of the National Cancer Institute is an authoritative source of cancer incidence statistics in the United States. The SEER program is a consortium of population-based cancer registries from different areas of the country. Each registry is charged with collecting data on all cancers that occur within its geographic area. As with any disease registry, there is a delay between the time that the disease (cancer) is first diagnosed and the time that it is reported to the registry. The SEER program has allowed for reporting delays of up to 19 months before releasing data for public use. Nevertheless, additional cases are discovered after the 19-month delay, and these cases are added in subsequent releases of the data. Further, any errors discovered are corrected in subsequent releases. Such reporting delays and corrections typically lead to underestimation of the cancer incidence rates in recent diagnosis years, making it difficult to monitor trends. In this article we study models that account for reporting delays and corrections in predicting eventual cancer counts for a diagnosis year from the preliminary counts. Previous models of this type have been studied, especially as applied to AIDS registries. We offer several additions to existing models. First, we explicitly model the reporting corrections. Second, we model the delay distribution with very general models, combining aspects of previous nonparametric-like models (i.e., models that have a separate parameter for each delay time) with more parametric models. Third, we allow random reporting-year effects in the model. Practical issues of model selection and how the data are classified are also discussed, particularly how the definition of a reporting correction may change depending on how subpopulations are defined. An example with SEER melanoma data is studied in detail.
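The basic idea of predicting an eventual count from a preliminary one can be illustrated with a minimal sketch. This is not the model studied in the article: it ignores reporting corrections and random reporting-year effects, and it assumes the delay distribution is already known. The function name `delay_adjusted_count` and the probabilities in `HYPOTHETICAL_DELAY_PROBS` are invented for illustration and are not SEER estimates.

```python
from math import fsum


def delay_adjusted_count(preliminary_count, delay_probs, delays_observed):
    """Scale a preliminary count up to an estimate of the eventual count.

    delay_probs[d] is the assumed probability that a case is first reported
    in the d-th data release after diagnosis (d = 0 is the first release).
    delays_observed is the number of releases that have occurred so far for
    the diagnosis year in question.
    """
    if not 1 <= delays_observed <= len(delay_probs):
        raise ValueError("delays_observed must be between 1 and len(delay_probs)")
    # Fraction of eventual cases expected to have been reported so far.
    reported_fraction = fsum(delay_probs[:delays_observed])
    if not 0.0 < reported_fraction <= 1.0 + 1e-9:
        raise ValueError("reported fraction must lie in (0, 1]")
    return preliminary_count / reported_fraction


# Hypothetical delay distribution (illustrative values only, an assumption).
HYPOTHETICAL_DELAY_PROBS = [0.75, 0.125, 0.0625, 0.0625]

# A diagnosis year with 750 cases in its first release.
estimate = delay_adjusted_count(750, HYPOTHETICAL_DELAY_PROBS, delays_observed=1)
print(estimate)
```

Under these assumed probabilities, only three quarters of the eventual cases appear in the first release, so the preliminary count is scaled up accordingly. A model of the kind described in the abstract would instead estimate the delay distribution from successive data releases and would also account for cases removed by corrections.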
This publication has 2 references indexed in Scilit:
- Representativeness of the Surveillance, Epidemiology, and End Results Program Data: Recent Trends in Cancer Mortality Rates. JNCI Journal of the National Cancer Institute, 1992
- Reporting Delays and the Incidence of AIDS. Journal of the American Statistical Association, 1990