Comparing means of more than two groups
Analysis of variance (ANOVA) is a parametric test used to compare the means of more than two groups sampled from normally distributed populations. It relies on the following assumptions:
- Samples are taken randomly
- Measurements from each population are normally distributed
- The variances are equal between all populations
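The test statistic is the F-ratio, F = MSgroups / MSerror, where:
MSgroups: mean square of groups (variation among the group means)
MSerror: mean square of error (variation within groups)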
Calculating the ANOVA test statistic
Step 1: Partition the sum of squares
Calculate the grand mean: multiply each group's mean by its sample size, sum these products over all groups, and divide by N, the total number of observations.
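The grand mean calculation can be sketched in a few lines of Python. The measurements and the names `groups`, `sizes`, and `means` are made up for illustration; they do not come from the original notes.

```python
# Hypothetical measurements for three groups (made-up values for illustration).
groups = {
    "A": [4, 5, 6],
    "B": [6, 7, 8, 9],
    "C": [9, 10, 11],
}

# Per-group sample sizes and means.
sizes = {name: len(values) for name, values in groups.items()}
means = {name: sum(values) / len(values) for name, values in groups.items()}

# N: total number of observations across all groups.
n_total = sum(sizes.values())

# Grand mean: sum of (group mean * group size), divided by N.
grand_mean = sum(means[name] * sizes[name] for name in groups) / n_total

# Equivalent shortcut: the plain mean of all observations pooled together.
pooled_mean = sum(x for values in groups.values() for x in values) / n_total

print(grand_mean, pooled_mean)
```

Weighting by sample size matters when the groups are unequal in size: an unweighted average of the group means would only match the pooled mean if every group had the same number of observations.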
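Sum of squares of the groups (SSgroups): for each group, multiply its sample size by the squared difference between the group mean and the grand mean, then sum over all groups.
Sum of squares of the error (SSerror): for each observation, square its difference from its own group mean, then sum over all observations in all groups.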
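As a rough sketch of the partition, the snippet below computes both sums of squares for a small made-up data set and checks that they add up to the total sum of squares around the grand mean. The data and the helper `mean` are illustrative assumptions.

```python
# Hypothetical data: three groups of unequal size (made-up values).
groups = [[4, 5, 6], [6, 7, 8, 9], [9, 10, 11]]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


all_values = [x for g in groups for x in g]
grand_mean = mean(all_values)

# SSgroups: group size times squared deviation of the group mean from the grand mean.
ss_groups = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

# SSerror: squared deviation of each observation from its own group mean.
ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)

# SStotal: squared deviation of each observation from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

print(ss_groups, ss_error, ss_total)
```

The identity SStotal = SSgroups + SSerror is a handy sanity check when doing the calculation by hand.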
Step 2: Calculate the mean squares
Mean square groups: MSgroups = SSgroups / (k – 1)
Mean square error: MSerror = SSerror / (N – k)
k = number of groups, N = total number of observations
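Steps 1 and 2 can be combined into one small function. This is a minimal sketch assuming each group is a plain list of numbers; the function name `anova_f` and the example data are hypothetical.

```python
def anova_f(groups):
    """Return (MSgroups, MSerror, F) for a list of groups of observations."""
    k = len(groups)                              # number of groups
    n_total = sum(len(g) for g in groups)        # N, total observations
    group_means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Step 1: partition the sum of squares.
    ss_groups = sum(len(g) * (m - grand_mean) ** 2
                    for g, m in zip(groups, group_means))
    ss_error = sum((x - m) ** 2
                   for g, m in zip(groups, group_means) for x in g)

    # Step 2: divide each sum of squares by its degrees of freedom.
    ms_groups = ss_groups / (k - 1)
    ms_error = ss_error / (n_total - k)
    return ms_groups, ms_error, ms_groups / ms_error


# Made-up example data.
ms_groups, ms_error, f_ratio = anova_f([[4, 5, 6], [6, 7, 8, 9], [9, 10, 11]])
print(ms_groups, ms_error, f_ratio)
```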
Step 3: Build ANOVA table
| Source of variance | Sum of squares | df | Mean squares | F | P |
|---|---|---|---|---|---|
| Groups | SSgroups | groups – 1 | MSgroups | MSgroups / MSerror | P-value |
| Error | SSerror | observations – groups | MSerror | | |
| Total | SStotal | dferror + dfgroups | | | |
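The P-value in the table is the upper-tail probability of the F distribution with (groups – 1, observations – groups) degrees of freedom. One way to obtain it, assuming SciPy is available, is sketched below on made-up data; `scipy.stats.f_oneway` runs the whole one-way ANOVA, and `scipy.stats.f.sf` gives the upper tail for a given F.

```python
from scipy import stats

# Hypothetical data: three groups of unequal size (made-up values).
groups = [[4, 5, 6], [6, 7, 8, 9], [9, 10, 11]]

k = len(groups)
n_total = sum(len(g) for g in groups)
df_groups = k - 1           # groups - 1
df_error = n_total - k      # observations - groups

# One-way ANOVA in a single call: returns the F statistic and its P-value.
result = stats.f_oneway(*groups)
f_ratio = result.statistic
p_value = result.pvalue

# The same P-value taken directly from the F distribution's upper tail.
p_from_f = stats.f.sf(f_ratio, df_groups, df_error)

print(f"F({df_groups}, {df_error}) = {f_ratio:.3f}, P = {p_value:.4f}")
```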
Step 4: If the null hypothesis is rejected, perform a post-hoc test to determine which groups differ
This can be done using a Tukey-Kramer test, which compares every pair of group means.
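A possible way to run the pairwise comparisons in Python is `scipy.stats.tukey_hsd`, which is available in recent SciPy releases (1.8 or later, as far as I know) and, per its documentation, handles unequal sample sizes with the Tukey-Kramer method. The data and labels below are made up for illustration.

```python
from scipy.stats import tukey_hsd

# Hypothetical data: three groups of unequal size (made-up values).
samples = {
    "A": [4, 5, 6],
    "B": [6, 7, 8, 9],
    "C": [9, 10, 11],
}
labels = list(samples)

# All pairwise comparisons of group means in one call.
result = tukey_hsd(*samples.values())

# result.pvalue is a square matrix; entry [i, j] compares group i with group j.
for i in range(len(labels)):
    for j in range(i + 1, len(labels)):
        print(f"{labels[i]} vs {labels[j]}: P = {result.pvalue[i, j]:.4f}")
```

Pairs with a P-value below the chosen significance level are the ones whose means are judged to differ.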
Determining the variance explained by differences in groups
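A common measure for this is R² = SSgroups / SStotal, the fraction of the total variation that is accounted for by differences among the group means. The sketch below computes it for made-up data; the function name `r_squared` is a hypothetical helper.

```python
def r_squared(groups):
    """Fraction of total variation explained by group differences."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)

    # SSgroups: size-weighted squared deviations of group means from the grand mean.
    ss_groups = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

    # SStotal: squared deviations of every observation from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    return ss_groups / ss_total


# Made-up example data.
r2 = r_squared([[4, 5, 6], [6, 7, 8, 9], [9, 10, 11]])
print(f"R^2 = {r2:.3f}")  # prints "R^2 = 0.806"
```

R² ranges from 0 (group membership explains none of the variation) to 1 (all variation lies between groups).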
The normality assumption can be checked visually with a Q-Q plot of the residuals. If the points fall close to the straight line, the residuals are consistent with a normal distribution.
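The numbers behind such a Q-Q plot can be obtained with `scipy.stats.probplot`, taking each residual as an observation minus its group mean. This is a sketch on made-up data, assuming NumPy and SciPy are installed; it computes the plot coordinates without drawing anything.

```python
import numpy as np
from scipy import stats

# Hypothetical data: three groups of unequal size (made-up values).
groups = [
    np.array([4.0, 5.0, 6.0]),
    np.array([6.0, 7.0, 8.0, 9.0]),
    np.array([9.0, 10.0, 11.0]),
]

# Residual = observation minus the mean of its own group.
residuals = np.concatenate([g - g.mean() for g in groups])

# Theoretical normal quantiles vs. ordered residuals, plus the fitted line.
(theoretical_q, ordered_resid), (slope, intercept, r) = stats.probplot(
    residuals, dist="norm"
)

# r is the correlation between the points and the fitted line;
# values near 1 mean the points lie close to the line.
print(f"Q-Q correlation r = {r:.3f}")

# To draw the plot, pass a plotting target, e.g. with matplotlib installed:
#   import matplotlib.pyplot as plt
#   stats.probplot(residuals, dist="norm", plot=plt); plt.show()
```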
Homogeneity of variances can be checked using either Levene’s test or Bartlett’s test.
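Both tests are available in SciPy as `scipy.stats.levene` and `scipy.stats.bartlett`. The snippet below is a minimal sketch on made-up data; in both tests the null hypothesis is that all groups have equal variances, so a small P-value signals a violation of the assumption.

```python
from scipy import stats

# Hypothetical data: three groups of unequal size (made-up values).
groups = [[4, 5, 6], [6, 7, 8, 9], [9, 10, 11]]

# Levene's test (SciPy centers on the median by default).
levene_result = stats.levene(*groups)

# Bartlett's test (more sensitive to departures from normality).
bartlett_result = stats.bartlett(*groups)

print(f"Levene:   W = {levene_result.statistic:.3f}, P = {levene_result.pvalue:.3f}")
print(f"Bartlett: T = {bartlett_result.statistic:.3f}, P = {bartlett_result.pvalue:.3f}")
```

Levene's test is often the safer choice when normality is itself in doubt, since Bartlett's test can reject simply because the data are non-normal.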