Determining the value of additional surrogate exposure data for improving the estimate of an odds ratio
- 15 December 1995
- journal article
- Published by Wiley in Statistics in Medicine
- Vol. 14 (23), 2581-2598
- https://doi.org/10.1002/sim.4780142307
Abstract
We consider the design of both cohort and case‐control studies in which an initial (‘stage 1’) sample of complete data on an error‐free disease indicator (D), a correct (‘gold‐standard’) dichotomous exposure measurement (X) and an error‐prone exposure measurement (Z) is available. We calculate the amount of additional information on the odds ratio relating D to X that one can obtain from a second (‘stage 2’) sample of measurements only on D and Z. If one allows for differential measurement error in Z, there is often little advantage in having more than four times as much data in stage 2 as in stage 1. With the assumption that a non‐differential measurement error model is reasonable, larger amounts of stage 2 data can be useful. Simulations indicate that stage 1 samples of modest size (50 cases in case‐control studies and 50 failures in cohort studies) yield sufficiently reliable estimates of needed parameters to assist in determining an appropriate size for the stage 2 sample. These ideas apply in settings either where the amount of stage 1 data is limited and fixed by external constraints or where one has gathered stage 1 data in advance to avoid collecting superfluous stage 2 data.
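To make the role of the error‐prone measurement Z concrete, the following minimal sketch shows how non‐differential misclassification of a dichotomous exposure pulls the odds ratio computed from Z toward 1, which is why stage 1 data on X are needed to correct it. This is not the paper's method: the function names, exposure prevalences, sensitivity and specificity are all hypothetical values chosen for illustration.

```python
def odds_ratio(p_case, p_control):
    """Odds ratio comparing exposure prevalence in cases with that in controls."""
    return (p_case / (1 - p_case)) / (p_control / (1 - p_control))


def observed_prevalence(p_true, sensitivity, specificity):
    """P(Z = 1) when the true exposure X is misclassified.

    sensitivity = P(Z = 1 | X = 1), specificity = P(Z = 0 | X = 0).
    """
    return sensitivity * p_true + (1 - specificity) * (1 - p_true)


# Hypothetical true exposure prevalences (assumptions, not from the paper).
p_x_cases, p_x_controls = 0.40, 0.20

# Hypothetical non-differential error: the same sensitivity and
# specificity apply to cases and controls.
sens, spec = 0.85, 0.90

or_true = odds_ratio(p_x_cases, p_x_controls)
or_observed = odds_ratio(
    observed_prevalence(p_x_cases, sens, spec),
    observed_prevalence(p_x_controls, sens, spec),
)

print(f"odds ratio using X: {or_true:.2f}")      # 2.67
print(f"odds ratio using Z: {or_observed:.2f}")  # 2.00
```

Under differential error, the sensitivity and specificity would differ between cases and controls, and the odds ratio based on Z could be biased in either direction.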