Big Data, Machine Learning, and the Credibility Revolution in Empirical Legal Studies

Joint with Ryan Copus and Hannah Laqueur.

2019. Law as Data: Computation and the Future of Legal Analysis. Edited by Michael A. Livermore and Daniel N. Rockmore. Santa Fe, NM: SFI Press.

The credibility revolution changed empirical research. Since the revolution, researchers have put more effort into designing their studies around sources of random or as-if random variation, rather than statistically modeling the world to make causal inferences. Most of us have come to agree that, in the words of Donald Rubin, “design trumps analysis.” But there is now surging interest in a particularly powerful tool of quantitative analysis: machine learning. This chapter addresses the place of machine learning in a post-credibility-revolution landscape. We make four main points. First, design still trumps analysis. Second, even design-committed researchers should not ignore machine learning: it can be used in service of design-based studies to make causal estimates less variable, less biased, and more heterogeneous. Third, there are important policy-relevant prediction problems for which machine learning is particularly valuable, yet even for prediction questions, a focus on design remains essential. Fourth, the predictive power of machine learning can be leveraged for descriptive research. Where applicable, we illustrate these points using examples drawn from real-world research.
