Comments

Fwahm t1_j26wkxm wrote

It's not to improve accuracy (unless the initial experiment was only accurate enough to be suggestive of its result); it's to remove the possibility of procedural errors, unseen factors, fluke events, or even dishonesty from causing the results of the original experiment to not support its claimed conclusions.

6

Ok_Elk_4333 OP t1_j26wqyj wrote

Thank you for your answer. I don't understand fluke events, though; my question still stands regarding fluke events from a purely mathematical perspective. The other reasons I get.

0

Fwahm t1_j26yiqb wrote

When it comes to statistical measurements, the standard accepted error margin for a new result to be considered a legitimate discovery is very small, but it's still possible for the result to be outside that range by sheer chance.

For example, imagine an experiment that examined cancer rates in connection to smoking and concluded that there was only a 1 in 1 million chance that smoking did not increase the chances of getting cancer and that all of the apparent connections were just a coincidence. That's a very, very low chance of the two being unrelated, but it's still possible, and 1-in-a-million chances happen every day.

If a second experiment is done using an unrelated dataset and it also finds the same thing at the same odds, that greatly reduces the chance that the first dataset supported its conclusion by sheer fluke. It's still not completely impossible, but the chance of both experiments being flukes is vastly lower than the chance of just one of them being one.
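
A quick back-of-the-envelope sketch of that multiplication (the one-in-a-million figure is just the hypothetical number from the example above):

```python
# Hypothetical number from the example above: each study on its own has a
# one-in-a-million chance of showing this link purely by fluke.
p_fluke_single = 1e-6

# If the two studies use unrelated datasets, their flukes are independent,
# so the probability that *both* are flukes multiplies.
p_fluke_both = p_fluke_single * p_fluke_single

print(f"One study a fluke:   {p_fluke_single:.0e}")  # 1e-06
print(f"Both studies flukes: {p_fluke_both:.0e}")    # 1e-12
```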

4

SurprisedPotato t1_j26zgwq wrote

Suppose you are doing research on jelly beans and their effect on acne.

Suppose also there's actually no effect.

A group of scientists does a study, and finds no effect. Since there's no effect, they don't publish their study.

Around the world, maybe many scientists are doing research on the link (if any) between jelly beans and acne. Maybe it's the color? One group studies purple jelly beans, finds no link, and doesn't publish. Another studies red jelly beans, finds no link, and doesn't publish.

Then one day, just by chance, a group found a relationship that was significant at the 5% level.

This was inevitable, since so many groups of scientists are independently studying the phenomenon, in ignorance of what others are doing.

So now there's a published paper linking green jelly beans to acne.

Even more scientists start doing similar research. What other colours have an effect? Do green jelly worms also "cause" acne?

Since there's a lot of research now, more articles get published: red jelly worms are linked to acne with a p-value of 0.02; green chiffon cake is linked to acne with a p-value of 0.03. Nobody publishes the results that show no link.

Eventually, the literature shows a strong relationship between confectionery and acne, especially green, especially with gelatin. Food scientists, dermatologists, and regulators rely on this information to provide professional advice and to draft laws. Science journalists inform the public of this new threat to teen health. Soon "everyone knows" how dangerous green food colouring is...

... But actually no link exists.

If people took the time to replicate the studies, and published the failed replications, this wouldn't happen.

Making the initial paper insist on a stricter level of proof doesn't fix this, because the root problem is that negative results aren't being published, so the literature shows a biased set of results. It would be better to publish the results of every study, so people could see whether that 5% result stands alone, suggesting a real link between two things, or is just one of a whole series of similar studies, most of which showed no relationship at all.
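
A minimal simulation of the scenario above, assuming 20 independent groups all testing a non-existent link at the usual 5% threshold (both numbers are made up for illustration). Under the null hypothesis each group's p-value is uniform, so the chance that at least one group finds a publishable "link" is 1 - 0.95^20, about 64%:

```python
import random

ALPHA = 0.05      # significance threshold
N_GROUPS = 20     # independent groups testing a link that doesn't exist
N_WORLDS = 10_000 # simulated repetitions of the whole scenario

worlds_with_a_published_link = 0
for _ in range(N_WORLDS):
    # Under the null hypothesis, each group's p-value is uniform on [0, 1).
    p_values = [random.random() for _ in range(N_GROUPS)]
    # Only "significant" results get published.
    if any(p < ALPHA for p in p_values):
        worlds_with_a_published_link += 1

print(worlds_with_a_published_link / N_WORLDS)  # ~0.64, i.e. 1 - 0.95**20
```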

2

Triabolical_ t1_j27ivjd wrote

To oversimplify...

In studies, you are looking for what is known as statistical significance, which is basically shorthand for it being very likely that the effect you are seeing is real rather than just an unlikely chance result.

If you are looking at the effects of a drug, perhaps the effect that you are seeing is just random chance - the people who took the drug just randomly got lucky and the people who didn't take the drug got unlucky.

So you do replication to rule out that chance. If you do two independent drug trials and they both show the same effect, the chance that it is due to random fluctuations is much smaller.
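
A rough sketch of that point, using simulated data for a drug with no real effect and a simple z-test standing in for whatever analysis a real trial would use (sample size, threshold, and run counts are all made-up illustration values):

```python
import random
import statistics

def fake_trial(n=50, threshold=1.96):
    """One simulated trial of a drug with NO real effect (illustrative only)."""
    drug = [random.gauss(0, 1) for _ in range(n)]
    placebo = [random.gauss(0, 1) for _ in range(n)]
    diff = statistics.mean(drug) - statistics.mean(placebo)
    stderr = (2 / n) ** 0.5  # standard error of the mean difference (sd = 1)
    # "Significant" at roughly the two-sided 5% level, purely by chance.
    return abs(diff / stderr) > threshold

RUNS = 20_000
one_trial = sum(fake_trial() for _ in range(RUNS)) / RUNS
both_trials = sum(fake_trial() and fake_trial() for _ in range(RUNS)) / RUNS

print(f"False positives, single trial:           {one_trial:.3f}")    # ~0.05
print(f"False positives, two independent trials: {both_trials:.3f}")  # ~0.0025
```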

1

Jkei t1_j28bm15 wrote

Batch effects, for one. Something could be wrong with a particular batch of some reagent, causing nonspecific effects in your assay. You then generate measurements that, sure enough, reach statistical significance. If enough of your publication hinges on that bad data, it could even cause a retraction.
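
A toy illustration of how that can happen, with invented numbers: if control and treated samples are measured with different reagent batches and one batch reads high, the batch shift alone produces an apparent "effect".

```python
import random
import statistics

random.seed(1)

# Toy sketch of a batch effect: control samples were run with reagent batch A,
# treated samples with batch B, and batch B happens to read high. All numbers
# here are invented for illustration.
BATCH_B_OFFSET = 0.8  # systematic shift introduced by the bad reagent batch

control = [random.gauss(10, 1) for _ in range(30)]                    # batch A
treated = [random.gauss(10, 1) + BATCH_B_OFFSET for _ in range(30)]   # batch B

# The treatment does nothing, but the batch shift alone creates an apparent
# group difference that a naive comparison would happily call "significant".
apparent_effect = statistics.mean(treated) - statistics.mean(control)
print(f"apparent treatment effect: {apparent_effect:.2f}")
# roughly the batch offset, not any real treatment effect
```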

1

Moskau50 t1_j26x408 wrote

How do you quantify an "x% chance of inaccuracy"? There may be confounding factors that the original research team was unaware of, factors that only come up when a different team, working in a different lab with different conditions and similar-but-not-identical equipment, tries to replicate the study and finds something different.

A quick and easy one to think of is water quality. The amount of dissolved minerals in the local tap or well water varies a lot around the world, so doing the same benchtop Chemistry 101 experiments with tap water from different places will give slightly different results. Of course, labs nowadays have purified water systems, so water quality itself isn't a concern, but other factors like it can play a role.

1

grumblingduke t1_j270cni wrote

Science isn't a perfect process; it is done by people, and people make mistakes. Things go wrong, people miss things, or they just get really unlucky.

Replication helps control for that.

You get a different set of people to do the same experiment in the same way, and you should get the same result. If you do, that's a good sign. If not, that's a problem and something that should be looked into.

Ideally, you then get a different set of people to do the same experiment in a slightly different way. And then a different experiment that measures the same thing, and so on. Lots of replication, all aimed at controlling for things people didn't think about or didn't spot.

Now you could get just one team to do this, over a long period of time, and include it all in a single study, but that is kind of inefficient. Better to do each study separately, then you can publish them individually and other people can have a chance to look at it as well. Plus it is generally a good idea to get a second person or team to work on something.

In a perfect science world you never stop experimenting on something. You never treat it as fully settled, you keep testing your idea until you disprove it, keep trying to find new ways to poke at it and experiment on it.

1

mmmmmmBacon12345 t1_j275g8t wrote

The threshold only says there's less than an X% chance this occurred due to random events, assuming we didn't muck something up.

Replication is critical for showing that something weird actually is happening and there isn't a quirk in the setup.

If you're testing, say, a sweet electrically powered EmDrive that could be used on spaceships, and your test measures something well above the noise floor of the system, then the drive works! Right? Mmmm, but what if those big ol' power cables happen to be interacting with the Earth's magnetic field? Whoops! The drive is garbage, and the setup had a parameter that wasn't accounted for!

Or maybe you measure some neutrinos traveling faster than the speed of light! But you forgot to account for the time to sync the clock on the surface with the one underground, resulting in all of the time measurements on just one end being offset.

Just because you can do something once doesn't mean you did what you intended to do. There are plenty of experiments that produced the desired result because that's what people wanted to happen, but that were unintentionally set up incorrectly or had quirks which made the desired result look real even though it was caused by something unrelated.
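
A tiny sketch of that last point with invented numbers: a constant calibration offset (like the unsynchronised clock) passes any statistical threshold with flying colours, because significance tests only guard against random noise, not against a biased setup.

```python
import random
import statistics

random.seed(0)

# The true effect is zero, but a constant setup error (a calibration offset,
# an unsynchronised clock, etc.) shifts every measurement the same way.
# Invented numbers throughout; any constant bias would behave the same.
OFFSET = 0.5  # systematic error from the setup
measured = [0.0 + OFFSET + random.gauss(0, 0.1) for _ in range(100)]

mean = statistics.mean(measured)
stderr = statistics.stdev(measured) / len(measured) ** 0.5
print(f"measured effect: {mean:.2f}  (true effect: 0.00)")
print(f"standard errors above zero: {mean / stderr:.0f}")
# Far past any reasonable threshold -- yet entirely an artifact of the setup.
```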

1