Statistical Hypothesis Testing – A Level Mathematics WJEC Revision

Understanding Statistical Hypothesis Testing

Statistical Hypothesis Testing is a method used in statistics to make decisions or draw conclusions about populations from sample data.
It involves stating two competing hypotheses about a population: the null hypothesis (H₀) and the alternative hypothesis (H₁).
The null hypothesis (H₀) is the hypothesis that there is no significant difference between specified populations, any observed difference being due to sampling or experimental error.
The alternative hypothesis (H₁ or H_a) is the hypothesis that contradicts the null hypothesis. It usually states that the observations reflect a real effect in addition to chance variation.

Types of Tests

There are different types of statistical tests depending on the characteristics of the data and the question being asked: t-tests, chi-squared tests, ANOVA, etc.
A one-tailed test is a statistical hypothesis test in which the values that reject the null hypothesis are located entirely in one tail of the probability distribution.
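A two-tailed test is a statistical hypothesis test in which the values that reject the null hypothesis are split between both tails of the probability distribution, typically with half of the significance level in each tail.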
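As a minimal sketch of the difference between the two, the Python below finds the critical regions for a binomial test. The scenario (16 trials, H₀: p = 0.5, a 5% significance level) and the helper names are assumptions chosen for illustration; they do not come from the notes above.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def upper_tail(x, n, p):
    """P(X >= x)."""
    return sum(binom_pmf(k, n, p) for k in range(x, n + 1))

def lower_tail(x, n, p):
    """P(X <= x)."""
    return sum(binom_pmf(k, n, p) for k in range(0, x + 1))

def upper_critical_value(n, p, tail_area):
    """Smallest c with P(X >= c) <= tail_area, or None if there is none."""
    for c in range(n + 1):
        if upper_tail(c, n, p) <= tail_area:
            return c
    return None

def lower_critical_value(n, p, tail_area):
    """Largest c with P(X <= c) <= tail_area, or None if there is none."""
    best = None
    for c in range(n + 1):
        if lower_tail(c, n, p) <= tail_area:
            best = c
    return best

# Assumed scenario: 16 trials, H0: p = 0.5, 5% significance level.
n, p0, alpha = 16, 0.5, 0.05

# One-tailed test (H1: p > 0.5): all 5% sits in the upper tail.
one_tailed_upper = upper_critical_value(n, p0, alpha)

# Two-tailed test (H1: p != 0.5): 2.5% in each tail.
two_tailed_lower = lower_critical_value(n, p0, alpha / 2)
two_tailed_upper = upper_critical_value(n, p0, alpha / 2)

print("One-tailed critical region: X >=", one_tailed_upper)
print("Two-tailed critical region: X <=", two_tailed_lower,
      "or X >=", two_tailed_upper)
```

Under these assumptions, 12 successes would fall inside the one-tailed critical region but outside the two-tailed one, because the two-tailed test allows only 2.5% in each tail.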

Test Statistics

The test statistic is a quantity calculated from the sample data that summarises them in a standardised form, so that a decision can be made about the null hypothesis. Which statistic is used depends on the type of test.
The p-value is the probability, assuming the null hypothesis is true, of obtaining a result at least as extreme as the one actually observed.
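To make these two ideas concrete, here is a hedged sketch for a test on a population mean when the population standard deviation is known. The numbers (H₀: μ = 50, σ = 4, a sample of 25 with mean 51.6) and the function names are illustrative assumptions, not values from these notes.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(sample_mean, mu0, sigma, n):
    """Standardise the sample mean under H0: mu = mu0 (sigma assumed known)."""
    return (sample_mean - mu0) / (sigma / sqrt(n))

def p_value(z, alternative):
    """p-value for a z statistic; alternative is 'greater', 'less' or 'two-sided'."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":
        return 1 - std_normal.cdf(z)
    if alternative == "less":
        return std_normal.cdf(z)
    if alternative == "two-sided":
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")

# Assumed data for illustration only.
z = z_statistic(sample_mean=51.6, mu0=50, sigma=4, n=25)
p_one_tailed = p_value(z, "greater")
p_two_tailed = p_value(z, "two-sided")

print(f"z = {z:.3f}")
print(f"one-tailed p-value = {p_one_tailed:.4f}")
print(f"two-tailed p-value = {p_two_tailed:.4f}")
```

The test statistic here is about 2, giving a one-tailed p-value of roughly 0.023 and a two-tailed p-value of roughly 0.046.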

Decision Making

If the p-value is less than the chosen significance level (α), commonly 5% (0.05), the null hypothesis is rejected; otherwise there is insufficient evidence to reject it. The significance level should be fixed before the data are examined.
A Type I error occurs when the null hypothesis is true but is rejected. It is often called a “false positive”, and its probability is the significance level of the test (α).
A Type II error occurs when the null hypothesis is false but is not rejected. It is often called a “false negative”, and its probability is denoted β.
The power of a test is the probability that it correctly rejects a false null hypothesis, which equals 1 − β. For a fixed sample size, reducing α to guard against Type I errors generally increases β and so lowers the power.
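The relationship between these quantities can be sketched numerically. The example below assumes a one-tailed binomial test with 16 trials, H₀: p = 0.5, critical region X ≥ 12, and a hypothetical true value p = 0.8; all of these figures and names are assumptions made for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_in_region(region, n, p):
    """Probability that X ~ B(n, p) lands in the given set of values."""
    return sum(binom_pmf(k, n, p) for k in region)

n = 16
critical_region = range(12, n + 1)  # assumed region: reject H0 when X >= 12

# P(Type I error): H0 is true (p = 0.5) but X falls in the critical region.
type_1_probability = prob_in_region(critical_region, n, 0.5)

# Power: H0 is false (assumed true p = 0.8) and X falls in the critical region.
assumed_true_p = 0.8
power = prob_in_region(critical_region, n, assumed_true_p)

# P(Type II error): H0 is false but X falls outside the critical region.
type_2_probability = 1 - power

print(f"P(Type I error)  = {type_1_probability:.4f}")
print(f"P(Type II error) = {type_2_probability:.4f}")
print(f"Power            = {power:.4f}")
```

Note that the power depends on the assumed true value of p: the closer it is to 0.5, the harder the false null hypothesis is to detect.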

Confidence Intervals

Confidence intervals are another method used to infer the value of a population parameter. An interval estimate provides more information about a population characteristic than the point estimate (single figure) of that characteristic.
Once you have a sample statistic, you can construct a confidence interval to describe the uncertainty in your estimate. Confidence intervals are often reported alongside point estimates, for example in opinion polls as a margin of error.
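As one possible illustration of the opinion poll case, the sketch below computes an approximate 95% confidence interval for a proportion using the normal approximation. The poll figures (520 of 1000 respondents in favour) and the function name are hypothetical, and the normal approximation is itself an assumption that is only reasonable for fairly large samples.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Approximate confidence interval for a proportion (normal approximation)."""
    p_hat = successes / n                                  # point estimate
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)     # about 1.96 for 95%
    margin = z * sqrt(p_hat * (1 - p_hat) / n)             # margin of error
    return p_hat - margin, p_hat + margin

# Hypothetical poll: 520 of 1000 respondents in favour.
lower, upper = proportion_confidence_interval(520, 1000)
print(f"Point estimate: {520 / 1000:.3f}")
print(f"95% confidence interval: ({lower:.3f}, {upper:.3f})")
```

With these figures the interval runs from roughly 0.489 to 0.551. Because it includes 0.5, the poll alone would not clearly show majority support, even though the point estimate is 52%.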

Interpreting Results

Understanding the subtleties of interpreting results is key. A statistically significant result (p-value < 0.05) does not necessarily imply practical significance: with a large enough sample, even a very small effect can be statistically significant.
Similarly, a result that is not statistically significant (p-value ≥ 0.05) does not mean the null hypothesis is true. It means only that there is insufficient evidence to support the alternative hypothesis.
Care needs to be taken when relating the outcomes of statistical hypothesis testing to real-world situations. Always interpret test results within the context provided.
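The gap between statistical and practical significance noted above can be seen in a short sketch. It assumes a two-tailed test on a mean with known standard deviation; the figures (H₀: μ = 50, σ = 4, an observed sample mean of 50.1) and the function name are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(sample_mean, mu0, sigma, n):
    """Two-tailed p-value for H0: mu = mu0, with sigma assumed known."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

mu0, sigma = 50.0, 4.0
sample_mean = 50.1  # the same small difference of 0.1 in both cases

p_small_sample = two_sided_p(sample_mean, mu0, sigma, n=25)
p_large_sample = two_sided_p(sample_mean, mu0, sigma, n=100_000)

print(f"n = 25:      p-value = {p_small_sample:.4f}")
print(f"n = 100000:  p-value = {p_large_sample:.2e}")
```

The same difference of 0.1 is nowhere near significant with 25 observations but highly significant with 100,000. Whether a difference of that size matters in practice is a question about the context, not about the p-value.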