Nonparametric tests

Signs and ranks

Up to this point, we have been testing whether our data meet the assumptions of the particular test we are using. However, there is no guarantee that those assumptions will be met. If we find that they have been violated, we have a few options.

Option 1: Ignore violations of assumptions

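This usually only works when we are comparing means and the normality assumption is violated. If our sample size (n) is large, we can ignore slight non-normality, because the sampling distribution of the mean is approximately normal for large n (the central limit theorem). We can’t really do this for unequal variances, especially if our samples are small or our sample sizes are uneven.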

Option 2: Transform your data

We can use different transformations on the data and then retest the assumptions.

Important rules:

1. While analyses should be done using transformed data, data should be presented and reported untransformed.

2. All data used in a particular study needs to be transformed using the same technique.

There are a wide variety of transformations we can do.

  • Natural log transformation: ln(y + 1)
  • Log 10 transformation: log10(y + 1)
  • Square-root transformation: sqrt(y + 0.5)
  • Arcsine transformation (for proportions): arcsin(sqrt(p))
  • Reciprocal transformation: 1 / y
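As a rough sketch of how these transformations might be applied and the normality assumption retested in R: the data below are simulated purely for illustration, and the Shapiro-Wilk test (`shapiro.test`) is an assumed choice of normality check, not one named in the text above.

```r
# Right-skewed example data (simulated for illustration; not from the text)
set.seed(1)
y <- rlnorm(50, meanlog = 1, sdlog = 1)

# Apply each transformation from the list above
transformed <- list(
  raw        = y,
  ln         = log(y + 1),
  log10      = log10(y + 1),
  sqrt       = sqrt(y + 0.5),
  reciprocal = 1 / y
)

# Retest normality on each version (Shapiro-Wilk is an assumed choice here)
p_values <- sapply(transformed, function(v) shapiro.test(v)$p.value)
round(p_values, 4)

# The arcsine transformation applies to proportions between 0 and 1
p <- c(0.05, 0.20, 0.50, 0.80, 0.95)
arcsine_p <- asin(sqrt(p))
arcsine_p
```

A larger p-value suggests less evidence against normality for that version of the data, but a plot (for example a histogram or Q-Q plot) is still worth checking alongside it.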

Option 3: Use nonparametric tests

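If transformation does not work, we can use nonparametric tests instead. These tests do not assume that the data follow a particular distribution (such as the normal distribution), and most are based on the signs or ranks of the data rather than the raw values. Many parametric tests have a nonparametric analogue. The trade-off is that nonparametric tests tend to have lower power than parametric tests when the parametric assumptions are actually met.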

Sign test: Replaces the one-sample t-test or paired t-test. Each observation (or paired difference) above the hypothesized median is assigned a “+” and each one below it is assigned a “-”. The counts of each sign are then compared using a binomial test with p = 0.5. Assumes the sample is a random sample from the population.
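Base R has no dedicated sign test function, so one way to sketch it is with `binom.test` on the counts of signs. The paired measurements below (`before`, `after`) are made up for illustration.

```r
# Hypothetical paired measurements (made up for illustration)
before <- c(12.1, 14.3, 11.8, 15.2, 13.7, 12.9, 14.8, 13.1, 12.4, 15.0)
after  <- c(13.0, 14.1, 12.9, 16.4, 14.2, 13.8, 15.9, 13.0, 13.6, 15.7)

d <- after - before
d <- d[d != 0]            # zero differences carry no sign and are dropped

n_plus <- sum(d > 0)      # number of "+" signs
n      <- length(d)       # number of non-zero differences

# Under H0 (median difference = 0), "+" and "-" are equally likely
sign_test <- binom.test(n_plus, n, p = 0.5)
sign_test$p.value
```

Because only the direction of each difference is used, not its size, the sign test discards a lot of information, which is one reason its power is low.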

Mann-Whitney U-test (also called the Wilcoxon rank-sum test): Replaces the two-sample t-test. This test uses the ranks (smallest to largest) of the data instead of the raw values. Assumes both samples are random samples from their populations and that the distributions of the two groups have the same shape.

# Compare sepal width between two independent groups (two iris species)
df <- droplevels(subset(iris, Species %in% c("setosa", "versicolor")))
wilcox.test(Sepal.Width ~ Species, data = df)
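To make the role of ranks concrete, here is a minimal sketch that computes the U statistic by hand and compares it with the `W` statistic reported by `wilcox.test`. The two groups `a` and `b` are hypothetical values chosen to have no ties.

```r
# Two small hypothetical groups (made up for illustration, no ties)
a <- c(3.1, 4.7, 2.8, 5.0, 3.9)
b <- c(5.6, 6.2, 4.9, 7.1, 5.8)

# Rank all observations together, smallest to largest
r <- rank(c(a, b))
rank_sum_a <- sum(r[seq_along(a)])

# U for group a: its rank sum minus the smallest rank sum it could have
n_a <- length(a)
U_a <- rank_sum_a - n_a * (n_a + 1) / 2

# wilcox.test() reports the same quantity as W
W <- unname(wilcox.test(a, b)$statistic)
c(U_a = U_a, W = W)
```

A U close to 0 (or close to its maximum, the product of the two sample sizes) means the two groups barely overlap in rank, which is evidence against the null hypothesis.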

Kruskal-Wallis test: Replaces one-way ANOVA. Like the Mann-Whitney U-test, this test is based on ranks, and it extends the same idea to more than two groups. Assumes all samples are random samples from their populations and that the distributions of the groups have the same shape.

# Compare petal width across the three iris species
df <- iris
kruskal.test(Petal.Width ~ Species, data = df)
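As a sketch of what the test does under the hood, the Kruskal-Wallis statistic H can be computed directly from the ranks using the standard textbook formula with a correction for ties. The variable names here are illustrative.

```r
# Rank all petal widths together, ignoring species
r <- rank(iris$Petal.Width)
N <- length(r)

# Rank sum and sample size for each species
R   <- tapply(r, iris$Species, sum)
n_i <- tapply(r, iris$Species, length)

# Uncorrected H statistic
H <- 12 / (N * (N + 1)) * sum(R^2 / n_i) - 3 * (N + 1)

# Correct for tied values (petal widths contain many ties)
ties <- as.numeric(table(iris$Petal.Width))
H_corrected <- H / (1 - sum(ties^3 - ties) / (N^3 - N))

# Compare with the statistic reported by kruskal.test()
kw <- unname(kruskal.test(Petal.Width ~ Species, data = iris)$statistic)
c(by_hand = H_corrected, kruskal_test = kw)
```

The two values should agree, since H depends only on how the ranks are distributed among the groups.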