Some notes on statistics

I won’t go into detail here, but I want to point out what the word “significant” means when it comes to statistics.

Say you have 10 animals. You feed 5 of them a supplement, and you leave the other 5 alone.

If all 5 animals that had the supplement survive while the other 5 all die, then you can be pretty sure that the supplement was a good thing. You don’t know exactly how much it helped, whether slightly or very much, but you can be fairly confident that it helped.

Or suppose that 3 of the animals that didn’t get the supplement died, but only 2 of the animals that did get it died. Because the numbers are so close together, it’s hard to tell whether the supplement helped or not. That’s because the variance (the error bars, the uncertainty around your statistics) is too big with such a small sample size.
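To put rough numbers on those two scenarios, here is a minimal sketch using Fisher’s exact test, one common way to analyze a small two-by-two table like this. The choice of test and the function name `fisher_exact_two_sided` are my own assumptions for illustration, not something a particular study would necessarily use. It asks: if the supplement did nothing, how often would chance alone produce a split at least this lopsided?

```python
from fractions import Fraction
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table
    [[a, b], [c, d]] = [[supplement survived, supplement died],
                        [control survived,    control died]].
    (Hypothetical helper written for this example.)"""
    group1 = a + b            # animals that got the supplement
    survivors = a + c         # survivors across both groups
    total = a + b + c + d

    def prob(x):
        # Chance that exactly x of the survivors land in the supplement
        # group if the supplement makes no difference (hypergeometric).
        return Fraction(
            comb(survivors, x) * comb(total - survivors, group1 - x),
            comb(total, group1),
        )

    lowest = max(0, group1 - (total - survivors))
    highest = min(group1, survivors)
    observed = prob(a)
    # Add up every outcome that is no more likely than the one we saw.
    return float(sum(prob(x) for x in range(lowest, highest + 1)
                     if prob(x) <= observed))

print(fisher_exact_two_sided(5, 0, 0, 5))  # all 5 vs none survive
print(fisher_exact_two_sided(3, 2, 2, 3))  # 3 vs 2 survive
```

The first case gives a p-value of about 0.008, which is well under the usual 0.05 cutoff. The second gives 1.0: a split that close is exactly what you would expect from chance alone.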

If the effect isn’t big, then you need a much larger sample size to determine whether there is an effect at all. The larger sample size gives you more precise results with less variance / error bars / uncertainty, so you can tell whether or not it’s a good idea to feed that supplement to the animals.
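Here is a rough sketch of how the error bars shrink as the sample grows. It assumes, purely for illustration, that 60% of supplemented animals and 40% of the others survive at every sample size, and it uses a simple normal approximation for the 95% interval. That approximation is crude for tiny groups, so treat the small-sample numbers as ballpark figures only. The function name `diff_confidence_interval` is made up for this example.

```python
from math import sqrt

def diff_confidence_interval(survived_a, survived_b, n_per_group, z=1.96):
    """Approximate 95% interval for the difference in survival rates
    between two equal-sized groups (normal approximation)."""
    p_a = survived_a / n_per_group
    p_b = survived_b / n_per_group
    diff = p_a - p_b
    # Standard error of the difference between two proportions.
    se = sqrt(p_a * (1 - p_a) / n_per_group + p_b * (1 - p_b) / n_per_group)
    return diff - z * se, diff + z * se

for n in (5, 50, 500):
    # Same 60% vs 40% survival at every sample size (assumed figures).
    low, high = diff_confidence_interval(3 * n // 5, 2 * n // 5, n)
    print(f"n = {n:>3} per group: difference between {low:+.2f} and {high:+.2f}")
```

With 5 animals per group the interval stretches well below zero, so you can’t rule out that the supplement does nothing, or even harm. With 500 per group the same 20-point gap sits clearly above zero.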

Too often you’ll hear about a study that says “Do X and Y will happen” or something like that. It’s not easy to determine whether they are telling the truth or not, especially when the company touting the results stands to make more and more money as people believe what they say. It’s important to read the study and figure out whether they had significant results or not. It’s also important to read what they actually did in the study, to make sure they’re not comparing apples to oranges.

In the end it’s up to us to figure out what is really true and what is just made-up nonsense. Understanding sample sizes and significance can go a long way toward that end.

You’ll note that much of the time you’ll hear “the results were not significant” when people report on a study. More often than not, that’s the reality: we just don’t know if there was a real effect or not, and further research with larger sample sizes is needed.