Comparing different treatments or production processes is the goal of a significant fraction of applied research. Unsurprisingly, two-sample problems play a central role in statistics through natural questions such as “Is the new treatment significantly better than the old?” However, this question is only partially answered by some of the usual statistical tools for the task. More importantly, practitioners are often unaware of the real meaning behind these statistical procedures. We analyze these difficulties from the point of view of the order between distributions, the stochastic order, showing evidence of the limitations of the usual approaches and paying special attention to the classical comparison of means under the normal model. We discuss the infeasibility of statistically proving stochastic dominance, but show that it is possible, instead, to gather statistical evidence to conclude that slightly relaxed versions of stochastic dominance hold.
"Models for the Assessment of Treatment Improvement: The Ideal and the Feasible." Statist. Sci. 32 (3) 469 - 485, August 2017. https://doi.org/10.1214/17-STS616