Questions and misunderstandings

There are some common questions and misunderstandings in statistics. There is also controversy within statistics about how to deal with certain questions. It is therefore entirely possible to hold views other than those presented here.

Note: Some of the questions are quite technical. You don't have to read and understand them all; they are aimed mainly at people who have "heard something" somewhere and are unsure whether they need to do it.


Are there any other measures than mean and standard deviation?

Yes, many. Here is a brief overview:

  • Variance: The square of the standard deviation.
  • Median: The value that splits the observations so that half lie above it and half below. The median is insensitive to outliers, but it essentially takes only values that occur on the scale (or the midpoint between two of them when the number of observations is even). If the scale is coarse, the median is therefore also a very coarse measure.
  • Quartile: The value below which a quarter of the observations lie (1st quartile) or below which three quarters of the observations lie (3rd quartile).
  • Percentile: The value below which a given percentage of the observations lie. The 12th percentile is the value that 12% of the observations equal or fall below.
  • Mode / mode value: The value that appears most frequently.
  • Range: The distance from the lowest to the highest observed value. The range is often reported to describe the sample. It is too imprecise for substantive conclusions, since it depends on only two observed values.
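
As a rough illustration, the measures listed above can be computed with Python's standard `statistics` module. The data values are invented for this sketch, and other software may use a slightly different interpolation rule for quartiles and percentiles, so small differences are to be expected.

```python
import statistics

# Hypothetical sample, invented for illustration only.
data = [2, 3, 3, 4, 5, 5, 5, 6, 9]

mean = statistics.mean(data)
sd = statistics.stdev(data)                    # sample standard deviation
variance = statistics.variance(data)           # sample variance = sd squared
median = statistics.median(data)
q1, q2, q3 = statistics.quantiles(data, n=4)   # quartiles; q2 is the median
p12 = statistics.quantiles(data, n=100)[11]    # 12th percentile (index 11)
mode = statistics.mode(data)                   # most frequent value
value_range = max(data) - min(data)            # depends on two values only

print("mean:", mean, "sd:", sd, "variance:", variance)
print("median:", median, "quartiles:", q1, q2, q3)
print("12th percentile:", p12, "mode:", mode, "range:", value_range)
```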

Is it allowed to calculate mean values of school grades (or similar scales)?

Yes. The mean is simply a statistical value, i.e. the result of a mathematical calculation. There is no mathematical law that allows or prohibits this calculation under certain circumstances. At most, I have to be careful when interpreting the mean if my sample contains outliers, since the mean is sensitive to outliers.
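
The sensitivity to outliers can be seen in a small sketch. The grades below are invented (assuming a 1 to 6 grading scale); a single extreme grade shifts the mean noticeably while the median stays where it was.

```python
from statistics import mean, median

# Hypothetical grades on a 1-6 scale, invented for illustration only.
grades = [1, 2, 2, 2, 3, 3]
with_outlier = grades + [6]   # one additional, very poor grade

mean_before, mean_after = mean(grades), mean(with_outlier)
median_before, median_after = median(grades), median(with_outlier)

# The mean moves toward the outlier; the median does not move here.
print("mean:  ", round(mean_before, 2), "->", round(mean_after, 2))
print("median:", median_before, "->", median_after)
```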


Do you have to use parametric tests (t-tests etc.) for normally distributed data?

No. Nonparametric tests always work and in most cases are as good as or better than parametric tests. So there is really no reason to use a parametric test when a nonparametric test exists that does the same thing.

For complex analyses, however, there are no nonparametric tests; parametric tests have to be used there.


Should one give medians instead of mean values for nonparametric tests?

No. The median has the disadvantage that it can essentially only take values that actually occur on the scale. If the scale is coarse, then the median is also coarse and therefore unsuitable for describing differences. In addition, nonparametric tests compare ranks rather than medians, so the mean rank would be the measure that best corresponds to the test.

If the scale is fine enough, then you can give medians (regardless of which test you are doing). With coarse scales, the mean value is more precise.
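
A minimal sketch of this point, using two invented groups rated on a coarse 1 to 5 scale: both groups have the same median, while the mean and the mean rank (the quantity that rank-based tests work with) differ. The helper `mean_ranks` is hypothetical and written only for this example; tied values receive the average of the ranks they occupy.

```python
from statistics import mean, median

# Hypothetical ratings on a coarse 1-5 scale, invented for illustration.
group_a = [2, 3, 3, 3, 3, 3, 4]
group_b = [3, 3, 3, 3, 4, 4, 5]

def mean_ranks(a, b):
    """Return the mean rank of each group within the pooled sample."""
    pooled = sorted(a + b)
    # Collect the 1-based positions occupied by each distinct value.
    positions = {}
    for pos, value in enumerate(pooled, start=1):
        positions.setdefault(value, []).append(pos)
    # Tied values share the average of their positions (midrank).
    midrank = {v: sum(p) / len(p) for v, p in positions.items()}
    return mean(midrank[v] for v in a), mean(midrank[v] for v in b)

rank_a, rank_b = mean_ranks(group_a, group_b)

print("medians:   ", median(group_a), median(group_b))   # identical
print("means:     ", round(mean(group_a), 2), round(mean(group_b), 2))
print("mean ranks:", round(rank_a, 2), round(rank_b, 2))
```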


How do I check whether the sample is normally distributed?

That's very difficult, because tests for normality, like all statistical tests, have more power when the sample is large. That means: with small samples I will seldom be able to detect a deviation from the normal distribution, whereas with large samples I often will, even when the deviation is small. No test can prove that the data really come from a normal distribution.
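
The dependence on sample size can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are drawn from an exponential distribution (clearly not normal), the Jarque-Bera statistic stands in for "a test for normal distribution", and the asymptotic chi-square cut-off of 5.991 (2 degrees of freedom, 5% level) is used even though it is only approximate for small samples. The function names and the numbers of trials are chosen for this example.

```python
import random
from statistics import fmean

CHI2_CRIT_DF2 = 5.991   # approx. 95% quantile of chi-square with 2 df

def jarque_bera(sample):
    """Jarque-Bera statistic based on sample skewness and kurtosis."""
    n = len(sample)
    m = fmean(sample)
    m2 = fmean([(x - m) ** 2 for x in sample])
    m3 = fmean([(x - m) ** 3 for x in sample])
    m4 = fmean([(x - m) ** 4 for x in sample])
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)

def rejection_rate(n, trials, rng):
    """Share of simulated non-normal samples flagged as non-normal."""
    rejected = 0
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        if jarque_bera(sample) > CHI2_CRIT_DF2:
            rejected += 1
    return rejected / trials

rng = random.Random(42)   # fixed seed so the sketch is reproducible
rate_small = rejection_rate(n=10, trials=500, rng=rng)
rate_large = rejection_rate(n=200, trials=500, rng=rng)

print("deviation detected with n=10: ", rate_small)
print("deviation detected with n=200:", rate_large)
```

The population is equally non-normal in both runs; only the sample size changes how often the deviation is detected.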


Can a significance be reliable even with a small sample?

Yes - if the sample was really drawn at random. The significance indicates whether the difference is reliable, and a large difference in a small sample can be just as reliable as a small difference in a large sample.
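
One way to see this is an exact permutation test, sketched below with invented numbers. The function `exact_permutation_p` is a hypothetical helper written for this example: it enumerates every way of splitting the pooled values into two groups of the original sizes and counts how often the difference in means is at least as large as the observed one.

```python
from itertools import combinations
from statistics import fmean

def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(fmean(group_a) - fmean(group_b))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(fmean(a) - fmean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Hypothetical data: only four observations per group, large difference.
small_a = [1, 2, 3, 4]
small_b = [10, 11, 12, 13]
p_small = exact_permutation_p(small_a, small_b)

print("p-value:", round(p_small, 4))   # 2 of the 70 possible splits
```

With four observations per group there are 70 possible splits, and only the observed one and its mirror image are this extreme, so the p-value is 2/70, below the usual 5% level despite the tiny sample.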


Do I have to calculate the required sample size in advance?

Usually not. Calculating the required sample size usually means that I have to estimate certain quantities in advance, such as the expected size of the differences or the standard deviation of the values. Often one can simply assume that practically relevant differences will also be found with a customary sample size (roughly 20-50 people).
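
For readers who do want to see what such a calculation involves, here is a minimal sketch for comparing two group means. It uses the common normal-approximation formula n = 2 * ((z_alpha + z_power) * sd / difference)^2 per group; the function name `n_per_group`, the 5% two-sided significance level, and the 80% power are assumptions made for this example. Both the expected difference and the standard deviation have to be guessed in advance.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(difference, sd, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided two-group test.

    Normal approximation; a t-based calculation gives slightly more.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / difference) ** 2)

# Assumed inputs: difference expressed in units of the standard deviation.
n_large_effect = n_per_group(difference=0.8, sd=1.0)
n_medium_effect = n_per_group(difference=0.5, sd=1.0)

print("per group, difference of 0.8 sd:", n_large_effect)    # 25
print("per group, difference of 0.5 sd:", n_medium_effect)   # 63
```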