Hypothesis testing and experimental design

How do we test theories in science?

Hypothesis testing

Hypothesis testing: comparing observations to what we would expect under a null (or uninteresting) hypothesis. If our observations differ enough from what the null predicts, we reject that null.

Null hypothesis (H0): The parameter of interest is zero.

Alternative hypothesis (HA): The parameter of interest is not zero.

Note: We find evidence to reject the null, not accept the alternative (the possible alternatives are potentially infinite).

Example hypothesis

Here the alternative hypothesis allows for values above and below zero, thus it is two-tailed.

Test statistic: the value calculated from our observations to compare with the value expected under the null.

Null distribution: the probability distribution of test statistics expected under the null.


P-value: the probability of obtaining results at least as extreme as ours, given that the null is true.

We can calculate the P-value here by summing the areas under the null distribution where our result and any more extreme results fall (including the opposite tail, because the test is two-tailed).

P = 0.016 + 0.001 = 0.017

A one-tailed test would calculate P using only the left or the right tail of the null distribution.
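
The tail-summing calculation can be sketched in Python. This is a minimal illustration, not the lesson's own data: it assumes a binomial null distribution, and the numbers (18 trials, 14 observed successes, null proportion 0.5) and the function names are hypothetical. "More extreme" is taken to mean at least as far from the expected count as the observed count.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials under the null."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def tail_p_values(observed, n, p_null):
    """Return (one_tailed, two_tailed) P-values for a binomial null.

    Assumption of this sketch: an outcome is 'as extreme' if it lies at
    least as far from the expected count as the observed count does.
    """
    expected = n * p_null
    distance = abs(observed - expected)
    pmf = [binomial_pmf(k, n, p_null) for k in range(n + 1)]
    # Area in the right tail and in the left tail of the null distribution.
    upper = sum(pmf[k] for k in range(n + 1) if k - expected >= distance)
    lower = sum(pmf[k] for k in range(n + 1) if expected - k >= distance)
    # One-tailed: only the tail on the same side as the observation.
    one_tailed = upper if observed >= expected else lower
    # Two-tailed: both tails added together (capped at 1).
    two_tailed = min(1.0, upper + lower)
    return one_tailed, two_tailed


# Hypothetical data: 14 successes in 18 trials, null proportion 0.5.
one_tailed, two_tailed = tail_p_values(observed=14, n=18, p_null=0.5)
print(f"one-tailed P = {one_tailed:.3f}, two-tailed P = {two_tailed:.3f}")
```

With these made-up numbers the one-tailed P is about 0.015 and the two-tailed P is about 0.031. The two tails are equal here only because the assumed null distribution is symmetric.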

Decision rule

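When working with P-values, we typically use a standard significance level, α = 0.05. This means we reject the null when P < 0.05, accepting a 5% chance of rejecting a null hypothesis that is actually true. It does not mean we are 95% certain that our results are not due to random chance.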

For the example above, P = 0.017, which is less than 0.05, so we reject the null hypothesis that mating was random (see lesson .RMD for more information).


Type I error: rejecting a true null hypothesis. The probability of committing a Type I error = α.

Type II error: failing to reject a false null hypothesis. Power is the probability of rejecting a false null hypothesis, so a test has more power when a Type II error is less likely.
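
Both error rates can be estimated by simulation. The sketch below is illustrative only and rests on assumptions not in the lesson: normally distributed data with a known standard deviation of 1, a two-tailed z-test of the null that the mean is zero, a sample size of 30, and a hypothetical true mean of 0.5 for the case where the null is false.

```python
import random
from statistics import NormalDist, mean


def two_tailed_z_p(sample, sigma=1.0):
    """Two-tailed P-value for H0: population mean = 0, with known sigma."""
    z = mean(sample) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def rejection_rate(true_mean, n=30, alpha=0.05, n_experiments=2000, seed=1):
    """Fraction of simulated experiments in which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_experiments):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if two_tailed_z_p(sample) < alpha:
            rejections += 1
    return rejections / n_experiments


# Null is true (mean really is 0): every rejection is a Type I error.
type_i_rate = rejection_rate(true_mean=0.0)
# Null is false (hypothetical true mean of 0.5): rejections are correct.
power = rejection_rate(true_mean=0.5)
type_ii_rate = 1 - power

print(f"Type I error rate  ~ {type_i_rate:.3f}")
print(f"Power              ~ {power:.3f}")
print(f"Type II error rate ~ {type_ii_rate:.3f}")
```

Under these assumptions the estimated Type I error rate should land close to α, and increasing the sample size or the true effect should raise the power.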

Experimental design

Types of studies

  • Clinical study: two or more treatments are assigned to human subjects
  • Laboratory study: two or more treatments are assigned to non-human subjects
  • Field study: two or more treatments are assigned to non-human subjects in nature

Why well-planned experiments are important

  • Bias is reduced
    • Assigning control groups
    • Treatments assigned at random
    • Blinding to reduce conscious and unconscious bias
  • Sampling error is reduced
    • Treatment can be replicated multiple times
    • Experimental units can be balanced (same number of units per treatment)
    • Treatment replicates can be blocked together, to reduce effects of environmental variation
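
Random assignment, balance, and blocking from the list above can be combined in one step. The following is a minimal sketch with hypothetical block names, unit IDs, and treatment labels: within each block, every treatment is given to the same number of units, and which unit gets which treatment is decided at random.

```python
import random


def assign_treatments(blocks, treatments, seed=None):
    """Randomly assign treatments to units within each block.

    blocks: dict mapping a block name to a list of unit IDs.
    Each block's unit count must be a multiple of the number of
    treatments so that the design stays balanced.
    """
    rng = random.Random(seed)
    assignment = {}
    for block, units in blocks.items():
        if len(units) % len(treatments) != 0:
            raise ValueError(f"block {block!r} cannot be balanced")
        reps = len(units) // len(treatments)
        # Equal number of labels per treatment, then shuffle within the block.
        labels = list(treatments) * reps
        rng.shuffle(labels)
        for unit, label in zip(units, labels):
            assignment[unit] = label
    return assignment


# Hypothetical field layout: two plots (blocks) with four units each.
blocks = {
    "plot_A": ["A1", "A2", "A3", "A4"],
    "plot_B": ["B1", "B2", "B3", "B4"],
}
design = assign_treatments(blocks, ["control", "treatment"], seed=42)
for unit, label in design.items():
    print(unit, label)
```

Shuffling inside each block, rather than across all units at once, keeps every treatment represented equally in every block, so environmental differences between blocks do not get confounded with the treatment.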

Sometimes an observational study must be done in place of an experiment. While there is less opportunity for randomization, the other features of a well-designed experiment (controls, blinding, replication, balance, and blocking) can still be used.