Centring in regression analyses: a strategy to prevent errors in statistical inference
- 1 August 2004
- journal article
- research article
- Published by Wiley in International Journal of Methods in Psychiatric Research
- Vol. 13 (3), 141-151
- https://doi.org/10.1002/mpr.170
Abstract
Regression analyses are perhaps the most widely used statistical tools in medical research. Centring in regression analyses seldom appears to be covered in training and is not commonly reported in research papers. Centring is the process of selecting a reference value for each predictor and coding the data based on that reference value so that each regression coefficient that is estimated and tested is relevant to the research question. Using non‐centred data in regression analysis, which refers to the common practice of entering predictors in their original score format, often leads to inconsistent and misleading results. There is very little cost to unnecessary centring, but the costs of not centring when it is necessary can be major. Thus, it would be better always to centre in regression analyses. We propose a simple default centring strategy: (1) code all binary independent variables ±1/2; (2) code all ordinal independent variables as deviations from their median; (3) code all ‘dummy variables’ for categorical independent variables having m possible responses as 1−1/m and −1/m instead of 1 and 0; (4) compute interaction terms from centred predictors. Using this default strategy when there is no compelling evidence to centre protects against most errors in statistical inference, and its routine use sensitizes users to centring issues. Copyright © 2004 Whurr Publishers Ltd.
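The four-step default strategy in the abstract can be sketched in a few lines of Python. This is only an illustrative reading of the abstract, not the authors' code: the function names, the example data, and the choice to omit one reference level when building dummy variables are all assumptions introduced here.

```python
from statistics import median


def centre_binary(values, high):
    """Step 1: code a binary predictor as +1/2 for `high` and -1/2 otherwise."""
    return [0.5 if v == high else -0.5 for v in values]


def centre_ordinal(values):
    """Step 2: code an ordinal predictor as deviations from its median."""
    med = median(values)
    return [v - med for v in values]


def centre_dummies(values, levels, reference):
    """Step 3: code dummy variables as 1 - 1/m and -1/m instead of 1 and 0.

    `levels` lists all m possible responses. Omitting one `reference`
    level (m - 1 dummies in total) is an assumption of this sketch.
    """
    m = len(levels)
    return {
        level: [1 - 1 / m if v == level else -1 / m for v in values]
        for level in levels
        if level != reference
    }


def interaction(a, b):
    """Step 4: compute an interaction term from already-centred predictors."""
    return [x * y for x, y in zip(a, b)]


# Hypothetical example data, invented for illustration only.
treatment = ["drug", "placebo", "drug", "placebo"]
severity = [1, 2, 4, 5]
site = ["A", "B", "C", "D"]

treatment_c = centre_binary(treatment, high="drug")
severity_c = centre_ordinal(severity)
site_c = centre_dummies(site, levels=["A", "B", "C", "D"], reference="D")
treatment_x_severity = interaction(treatment_c, severity_c)
```

With m = 2, step 3 yields +1/2 and −1/2, so the binary coding in step 1 is the two-level special case of the dummy coding. The centred columns would then be entered into whatever regression routine is in use, in place of the raw scores.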