Estimating statistical power, posterior probability and publication bias of psychological research using the observed replication rate

In this paper, we show how Bayes' theorem can be used to better understand the implications of the 36% reproducibility rate of published psychological findings reported by the Open Science Collaboration. We demonstrate a method to assess publication bias and show that the observed reproducibility rate was not consistent with an unbiased literature. We estimate a plausible range for the prior probability of this body of research, suggesting expected statistical power in the original studies of 48–75%, producing (positive) findings that were expected to be true 41–62% of the time. Publication bias was large, assuming a literature with 90% positive findings, indicating that negative evidence was expected to have been observed 55–98 times before one negative result was published. These findings imply that even when studied associations are truly null, we expect the literature to be dominated by statistically significant findings.


Solving equation 6 to find from
The system of equations defined by equation 6 in the main text (replicated below with subscripts o for quantities related to original studies and r for replication studies) needs to be solved for unique values of the reproducibility rate ($R$), the assumed type-2 error rate in the replication studies ($\beta_r$), as well as the type-1 error rates of the original ($\alpha_o$) and replication studies ($\alpha_r$). This task can be simplified by using a computerized equation solver and then cross-checking the math of the suggested solution. Syntax for solving the equations using a web-based equation solver ( www.wolframalpha.com ) together with R code for cross-checking the math is given below.
$$
\begin{aligned}
P &= \frac{\theta(1-\beta_o)}{\theta(1-\beta_o)+\alpha_o(1-\theta)} \\
R &= P(1-\beta_r)+\alpha_r(1-P)
\end{aligned}
\tag{S1}
$$

Since the chosen equation solver was somewhat limited in its choice of symbols, equation S1 was rewritten in plain text as follows, with alpha and beta denoting the error rates of the original studies and a and b those of the replication studies:

P=(theta*(1-beta))/(theta*(1-beta)+alpha*(1-theta)); R=P*(1-b)+a*(1-P);

To find the solution for the lower bound of the range in which the true value must fall (discussed in detail in the main text), we assumed identical power in the original and replication studies and added the following constraints:

alpha=.05; R=.36; a=.025; b=beta; 0<theta<1;

To finish the command, we added the instruction to solve for beta and P:

solve beta and P

This produced the following solution for the lower bound of the range in which the true value must fall (see the main text):

Solving the equation for the upper bound of this range produced the following solution:

Solving the equation for the lower end of the narrower likely range (see the discussion in the main text) produced the following solution (note that this solution is valid for the whole range, but the solver erroneously produced a constant solution for the specific case of ):

Solving the equation for the upper end of the likely range produced the following solution (note that, as above, we got an erroneous constant solution for ):
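As a cross-check of the solver syntax above, the lower-bound system can also be solved by hand, which is sketched here in Python (not the R code the text refers to). The sketch assumes only the constraints stated for the lower bound (alpha=.05, R=.36, a=.025, b=beta); the function names `posterior`, `reproducibility` and `solve_beta_and_P` are hypothetical and do not come from the original analysis. Under b = beta, the second equation gives P = (R - a)/(1 - a - beta), and substituting this into the first equation yields a quadratic in beta with discriminant R^2 + 4c, where c = alpha(1 - theta)(R - a)/theta.

```python
import math


def posterior(theta, beta, alpha):
    """First line of Eq (S1): probability P that a positive original finding is true."""
    true_pos = theta * (1 - beta)
    false_pos = alpha * (1 - theta)
    return true_pos / (true_pos + false_pos)


def reproducibility(P, b, a):
    """Second line of Eq (S1): expected rate R of positive replications."""
    return P * (1 - b) + a * (1 - P)


def solve_beta_and_P(theta, alpha=0.05, R=0.36, a=0.025):
    """Solve Eq (S1) for (beta, P) at a given theta, assuming b = beta and R > a."""
    if not 0 < theta < 1:
        raise ValueError("theta must lie strictly between 0 and 1")
    # With b = beta, the second equation gives P = (R - a) / (1 - a - beta).
    # Substituting into the first equation gives the quadratic
    #   beta^2 - (2 - R) * beta + (1 - R) - c = 0
    c = alpha * (1 - theta) * (R - a) / theta
    # The discriminant simplifies to R^2 + 4c; the larger root exceeds 1,
    # so only the smaller root is a candidate type-2 error rate.
    beta = ((2 - R) - math.sqrt(R**2 + 4 * c)) / 2
    if beta < 0:
        raise ValueError("no solution with beta in [0, 1] for this theta")
    P = (R - a) / (1 - a - beta)
    return beta, P


if __name__ == "__main__":
    # Illustrative prior probabilities only; these are not values from the paper.
    for theta in (0.1, 0.3, 0.5):
        beta, P = solve_beta_and_P(theta)
        # Substitute back into both equations to cross-check the solution.
        assert math.isclose(posterior(theta, beta, 0.05), P)
        assert math.isclose(reproducibility(P, beta, 0.025), 0.36)
        print(f"theta={theta:.2f}  beta={beta:.4f}  P={P:.4f}")
```

Substituting each solution back into both equations, as the loop does, is the same kind of check the text describes for the solver output. The other three bounds use different constraints, which are described in the main text and are not reproduced here.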

Summary of estimates for all distributions presented in figure 1
Figure S1: Expected statistical power for the four conditions. The top two panels describe the outer bounds and the bottom two describe the limits of the likely interval. Estimates are presented for the naive analytical solution assuming zero variance in power (zero), for Beta distributions with shape parameter s (s50-s3i; s = 50, 1, 1/2, 1/3), and for bimodal distributions with location means at the 10th/90th percentiles of the distribution with s=1 (s1b1090) and s=2 (s2b1090) and at the 5th/95th percentiles with s=1 (s1b0595).
Figure S2: Expected posterior probability for the four conditions. The top two panels describe the outer bounds and the bottom two describe the limits of the likely interval.

Figure S3: Expected positive evidence for the four conditions. The top two panels describe the outer bounds and the bottom two describe the limits of the likely interval. Estimates are presented for the naive analytical solution assuming zero variance in power (zero), for Beta distributions with shape parameter s (s50-s3i; s = 50, 1, 1/2, 1/3), and for bimodal distributions with location means at the 10th/90th percentiles of the distribution with s=1 (s1b1090) and s=2 (s2b1090) and at the 5th/95th percentiles with s=1 (s1b0595).

Figure S4: Expected publication bias for the four conditions. The top two panels describe the outer bounds and the bottom two describe the limits of the likely interval. Estimates are presented for the naive analytical solution assuming zero variance in power (zero), for Beta distributions with shape parameter s (s50-s3i; s = 50, 1, 1/2, 1/3), and for bimodal distributions with location means at the 10th/90th percentiles of the distribution with s=1 (s1b1090) and s=2 (s2b1090) and at the 5th/95th percentiles with s=1 (s1b0595).