Joint with Ryan Copus.
Measurements of inconsistency among decision makers are important for assessing the quality of governance and for evaluating policy reforms. Yet high-quality measures of inconsistency in real-world institutions are rarely available. The core problem is that mean outcome differences between decision makers (disparity statistics) systematically understate inconsistency. Our contribution is twofold. First, we show theoretically how this downward bias can be eliminated by targeting a specific kind of heterogeneous treatment effect. Second, we demonstrate how machine learning can be used to implement the theory optimally on observational data. We leverage an original dataset of civil appeals in the Ninth Circuit to provide one of the first high-quality measures of decision-making inconsistency in the court.