Multiverse analysis is a scientific method that specifies and then runs a set of plausible alternative models or statistical tests for a single hypothesis. It is a method to address the issue that the "scientific process confronts researchers with a multiplicity of seemingly minor, yet nontrivial, decision points, each of which may introduce variability in research outcomes". A problem also known as

Researcher degrees of freedom Researcher degrees of freedom is a concept referring to the inherent flexibility involved in the process of designing and conducting a scientific experiment, and in analyzing its results. The term reflects the fact that researchers can choose betw ...

or as the garden of forking paths. It is a method arising in response to the credibility and

replication crisis The replication crisis, also known as the reproducibility or replicability crisis, refers to the growing number of published scientific results that other researchers have been unable to reproduce or verify. Because the reproducibility of empir ...

taking place in science, because it can diagnose the fragility or robustness of a study's findings. Multiverse analyses have been used in the fields of psychology and neuroscience. It is also a form of

meta-analysis Meta-analysis is a method of synthesis of quantitative data from multiple independent studies addressing a common research question. An important part of this method involves computing a combined effect size across all of the studies. As such, th ...

allowing researchers to provide evidence on how different model specifications impact results for the same hypothesis, and thus can point scientists toward where they might need better theory or causal models.

Details

Multiverse analysis most often produces a large number of results that tend to go in all directions. This means that most studies do not offer consensus or specific rejection of an hypothesis. Its strongest utilities thus far are instead to provide evidence against conclusions based on findings from single studies or to provide evidence about which model specifications are more or less likely to cause larger or more robust effect sizes (or not). Evidence against single studies or statistical models, is useful in identifying potential

false positive A false positive is an error in binary classification in which a test result incorrectly indicates the presence of a condition (such as a disease when the disease is not present), while a false negative is the opposite error, where the test resu ...

results. For example, a now infamous study concluded that female gender named hurricanes are more deadly than male gender named hurricanes. In a follow up study, researchers ran thousands of models using the same hurricane data, but making various plausible adjustments to the regression model. By plotting a density curve of all regression coefficients, they showed that the coefficient of the original study was an extreme outlier. In a study of birth order effects, researchers visualized a multiverse of plausible models using a specification curve which allows researchers to visually inspect a plot of all model outcomes against various model specifications. They could show that their findings supported previous research of birth order on intellect, but provided evidence against an effect on life satisfaction and various personality traits.

References

{{Reflist Multiverse