Statistical proof? The problem of irreproducibility

Author: Susan Holmes
Published electronically: October 4, 2017
Abstract: Data currently generated in the fields of ecology, medicine, climatology, and neuroscience often contain tens of thousands of measured variables. If special care is not taken, the complexity associated with statistical analysis of such data can lead to publication of results that prove to be irreproducible.

Modern statistics has had to revisit the classical hypothesis testing paradigm to accommodate high-throughput settings. A first step is to correct for multiplicity in the number of variables that could be selected as significant, using multiple-hypothesis correction to ensure control of the false discovery rate (FDR) (Benjamini and Hochberg, 1995). FDR adjustments do not, however, solve the problem of double dipping the data, and recent work develops a field known as post-selection inference that enables inference when the same data are used both to choose and to evaluate models.
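As an illustration of the FDR control mentioned above, the following is a minimal sketch of the Benjamini-Hochberg step-up procedure. The function name, the example p-values, and the level q = 0.05 are assumptions chosen for illustration and do not come from the article; the guarantee of FDR control at level q holds for independent tests (and under certain forms of positive dependence).

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per hypothesis: True where it is rejected
    by the Benjamini-Hochberg step-up procedure at FDR level q."""
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) such that p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from ten tests (illustrative only).
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(p, q=0.05)
naive = [value <= 0.05 for value in p]
print(sum(rejected), "rejections with BH;", sum(naive), "with an unadjusted 0.05 cutoff")
```

In this made-up example the unadjusted cutoff flags five variables while the adjusted procedure retains two, which shows how ignoring multiplicity inflates the list of apparent discoveries.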

Even so, the complexity of software and the flexibility of choices in tuning parameters can bias the output toward an inflated number of significant results; neuroscientists recently revisited the problem and found that many fMRI studies have resulted in false positives.

Unfortunately, all formal correction methods are tailored for specific settings and do not take into account the flexibility available to today's statisticians. A constructive way forward is to be transparent about the analyses performed, separate the exploratory and confirmatory phases of the analyses, and provide open access code; this will enhance both reproducibility and replicability.
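One simple way to separate exploratory and confirmatory phases is to split the observations, select on one half, and test on the other. The sketch below simulates this on pure noise; the function names, the sample sizes, and the use of a one-sample t statistic are assumptions made for illustration, not the article's method.

```python
import math
import random
import statistics


def t_stat(values):
    """One-sample t statistic for the null hypothesis of mean zero."""
    n = len(values)
    return statistics.mean(values) / (statistics.stdev(values) / math.sqrt(n))


def split_select_then_test(data, rng):
    """data: list of variables, each a list of observations of equal length.

    Randomly split the observations into an exploratory and a confirmatory
    half, select the variable with the largest |t| on the exploratory half,
    and return (selected index, exploratory t, confirmatory t).
    """
    n = len(data[0])
    rows = list(range(n))
    rng.shuffle(rows)
    explore, confirm = rows[: n // 2], rows[n // 2:]

    def t_on(variable, subset):
        return t_stat([variable[i] for i in subset])

    best = max(range(len(data)), key=lambda j: abs(t_on(data[j], explore)))
    return best, t_on(data[best], explore), t_on(data[best], confirm)


# Simulated data with no real signal: 1000 variables, 40 observations each.
rng = random.Random(0)
noise = [[rng.gauss(0, 1) for _ in range(40)] for _ in range(1000)]
best, t_explore, t_confirm = split_select_then_test(noise, rng)
print(f"variable {best}: exploratory t = {t_explore:.2f}, confirmatory t = {t_confirm:.2f}")
```

The exploratory statistic is large purely because it was chosen as the maximum over many noise variables, so reporting it as evidence would be double dipping. The confirmatory statistic is computed on observations that played no part in the selection, so it can be judged against its usual reference distribution.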

Susan Holmes
Affiliation: Statistics Department, Sequoia Hall, Stanford, California 94305

Additional Notes: This work was supported by a Stanford Gabilan fellowship.
