Estimation and Inference of Heterogeneous Treatment Effects using Random Forests
- 6 June 2018
- journal article
- research article
- Published by Informa UK Limited in Journal of the American Statistical Association
- Vol. 113 (523), 1228-1242
- https://doi.org/10.1080/01621459.2017.1319839
Abstract
Many scientific and engineering challenges, ranging from personalized medicine to customized marketing recommendations, require an understanding of treatment effect heterogeneity. In this article, we develop a nonparametric causal forest for estimating heterogeneous treatment effects that extends Breiman’s widely used random forest algorithm. In the potential outcomes framework with unconfoundedness, we show that causal forests are pointwise consistent for the true treatment effect and have an asymptotically Gaussian and centered sampling distribution. We also discuss a practical method for constructing asymptotic confidence intervals for the true treatment effect that are centered at the causal forest estimates. Our theoretical results rely on a generic Gaussian theory for a large family of random forest algorithms. To our knowledge, this is the first set of results that allows any type of random forest, including classification and regression forests, to be used for provably valid statistical inference. In experiments, we find causal forests to be substantially more powerful than classical methods based on nearest-neighbor matching, especially in the presence of irrelevant covariates.
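To make the estimand concrete: in the potential outcomes framework, the heterogeneous treatment effect at a covariate value x is tau(x) = E[Y(1) - Y(0) | X = x]. The sketch below is not the causal forest from the article. It is a minimal illustration of the classical nearest-neighbor matching baseline that the abstract compares against, run on simulated data. The function name `knn_matching_cate`, the data-generating process, and the choice k = 20 are all assumptions made for illustration only.

```python
import numpy as np


def knn_matching_cate(X, y, w, x0, k=10):
    """Hypothetical helper: k-NN matching estimate of tau(x0).

    Returns the mean outcome of the k treated units nearest to x0
    minus the mean outcome of the k control units nearest to x0.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w).astype(bool)
    x0 = np.asarray(x0, dtype=float)

    # Euclidean distance from every sample to the query point.
    dist = np.linalg.norm(X - x0, axis=1)

    arm_means = []
    for arm in (True, False):  # treated first, then control
        idx = np.flatnonzero(w == arm)
        if idx.size < k:
            raise ValueError("fewer than k units in one treatment arm")
        nearest = idx[np.argsort(dist[idx])[:k]]
        arm_means.append(y[nearest].mean())
    return arm_means[0] - arm_means[1]


# Simulated randomized experiment (assumed setup, not from the article):
# the true effect is 2 when the first covariate exceeds 0.5, and 0 otherwise.
rng = np.random.default_rng(0)
n = 4000
X = rng.uniform(size=(n, 2))
w = rng.integers(0, 2, size=n)            # random treatment assignment
tau = np.where(X[:, 0] > 0.5, 2.0, 0.0)   # true heterogeneous effect
y = X[:, 1] + w * tau + 0.1 * rng.standard_normal(n)

est_high = knn_matching_cate(X, y, w, x0=[0.9, 0.5], k=20)
est_low = knn_matching_cate(X, y, w, x0=[0.1, 0.5], k=20)
print("estimate where true effect is 2:", est_high)
print("estimate where true effect is 0:", est_low)
```

Because treatment is randomized in this simulation, unconfoundedness holds by construction. Adding irrelevant covariates to `X` would push the nearest neighbors farther from the query point in the dimensions that matter, which is the setting where the abstract reports that causal forests gain the most over matching.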