usefully_useless

usefully_useless t1_ix4gguz wrote

This exactly. Assuming the 20 factors are independent and that the true effect of each factor is zero (i.e. none of them actually does anything), then at a 5% significance level the probability of finding statistical significance in at least one of the factors (at least one false positive) is 1 − 0.95^20, or about 64%.
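If you want to check that number yourself, here's a rough sketch in Python. The function names are my own, and the simulation leans on one assumption worth flagging: when the null hypothesis is true and the test statistic is continuous, the p-value is uniformly distributed on [0, 1], so I just draw uniform random numbers instead of running real tests on real data.

```python
import random


def familywise_error_rate(num_tests: int, alpha: float) -> float:
    """Chance of at least one false positive across independent tests
    when every null hypothesis is true: 1 - (1 - alpha)^num_tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def simulate_familywise_error_rate(
    num_tests: int, alpha: float, num_trials: int = 100_000, seed: int = 0
) -> float:
    """Monte Carlo estimate of the same quantity.

    Assumption: under a true null, each p-value is Uniform(0, 1),
    so a uniform draw below alpha counts as a false positive.
    """
    rng = random.Random(seed)
    trials_with_false_positive = 0
    for _ in range(num_trials):
        # One "study": num_tests independent factors, none with a real effect.
        if any(rng.random() < alpha for _ in range(num_tests)):
            trials_with_false_positive += 1
    return trials_with_false_positive / num_trials


if __name__ == "__main__":
    exact = familywise_error_rate(num_tests=20, alpha=0.05)
    estimate = simulate_familywise_error_rate(num_tests=20, alpha=0.05)
    print(f"closed form: {exact:.3f}")
    print(f"simulated:   {estimate:.3f}")
```

The closed form and the simulation should land close to each other. Both depend on the independence assumption; correlated factors would give a different (generally lower) figure.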

There’s a reason that we’re facing a replication crisis, and that reason is the prevalence of p-hacking. (There’s an argument that the overwhelming preference for positive results in academic journals and the publication requirements most departments have for tenure are indirectly responsible for this problem as well, but that’s a different discussion.)
