Informal Hypothesis Testing for Correlation/Association – A Level Mathematics B (MEI) OCR Revision

Understanding Correlation

Correlation measures the strength and direction of a linear relationship between two variables.
It is summarised by a correlation coefficient that lies between -1 and 1: -1 indicates a perfect negative linear correlation, 1 a perfect positive linear correlation, and 0 no linear correlation.

Sample Correlation Coefficient (r)

The Sample Correlation Coefficient (r) assesses the degree of linear association between two variables, X and Y, in a sample.
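The sign of r gives the direction of the association and its magnitude gives the strength: r = -1 is a perfect negative linear correlation, r = 0 is no linear correlation, and r = 1 is a perfect positive linear correlation.
Values of r close to 1 or -1 suggest a strong linear association; values close to 0 suggest a weak linear association or none.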
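
As a minimal sketch of how r is obtained from paired data, the function below computes the product moment correlation coefficient as Sxy / sqrt(Sxx × Syy). The function name pearson_r and the hours/marks data are hypothetical, chosen purely for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Sample correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and products about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: hours of revision and test marks for six students
hours = [1, 2, 3, 4, 5, 6]
marks = [35, 42, 50, 49, 61, 68]
r = pearson_r(hours, marks)
print(f"r = {r:.3f}")
```

A value of r close to 1, as here, would suggest a strong positive linear association in this sample.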

Null and Alternative Hypotheses for Correlation

The Null Hypothesis (H₀) in testing correlation assumes that there is no association between the two variables, i.e., the population correlation coefficient (ρ) is 0.
The Alternative Hypothesis (H₁) suggests that there is an association; i.e., the population correlation coefficient is not 0. This is a non-directional (two-sided) test.
In a directional (one-sided) test, the alternative hypothesis instead states that the population correlation coefficient is less than 0 (negative correlation) or greater than 0 (positive correlation).

Significance Level and Decision Rule

The Significance Level (commonly 0.05) is the probability of rejecting the null hypothesis when it is in fact true (a Type I error).
The decision to reject or not reject the null hypothesis is based on the observed value of r and its corresponding p-value compared to the significance level.
If the p-value is less than or equal to the significance level, the null hypothesis is rejected in favour of the alternative hypothesis, suggesting a statistically significant correlation.
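
The decision rule can be sketched in code. The example below does not use the critical-value tables usually met in the course; as a stand-in it computes a two-sided p-value with an exact permutation test, which is only practical for small samples. The helper names (pearson_r, permutation_p_value, decide) and the data are hypothetical.

```python
from itertools import permutations
from math import sqrt


def pearson_r(xs, ys):
    """Sample correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def permutation_p_value(xs, ys):
    """Two-sided exact permutation p-value for H0: no association.

    Under H0 every pairing of the y-values with the x-values is equally
    likely, so the p-value is the proportion of pairings whose |r| is at
    least as large as the observed |r|.
    """
    r_observed = abs(pearson_r(xs, ys))
    extreme = 0
    total = 0
    for shuffled in permutations(ys):
        total += 1
        # Small tolerance guards against floating-point ties
        if abs(pearson_r(xs, shuffled)) >= r_observed - 1e-12:
            extreme += 1
    return extreme / total


def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when p <= significance level."""
    return "reject H0" if p_value <= alpha else "do not reject H0"


# Hypothetical data: hours of revision and test marks for six students
hours = [1, 2, 3, 4, 5, 6]
marks = [35, 42, 50, 49, 61, 68]
p = permutation_p_value(hours, marks)
print(f"p-value = {p:.4f}: {decide(p)}")
```

With six data points there are only 720 pairings to check; for larger samples a table of critical values or statistical software would be used instead.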

Association in Scatter Plots

Evidence of a potential association between two variables can initially be checked using a scatter plot.
Points scattered randomly suggest no correlation. A clear upward trend is indicative of positive correlation, while a clear downward trend suggests negative correlation.
However, note that correlation does not imply causation and that scatter plots do not prove a cause-and-effect relationship.

Testing for Correlation with Large Sample Sizes

With large sample sizes, a weak or modest correlation coefficient can still achieve statistical significance. For example, with a sample of 1000 pairs, a sample correlation of only r = 0.1 is significant at the 5% level.
For this reason, it helps to also report a confidence interval for the population correlation coefficient, which gives a range of plausible values and shows how weak the correlation might be.
A 95% confidence interval that includes 0 corresponds to not rejecting the null hypothesis of no correlation in a two-sided test at the 5% significance level.
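
One common way to obtain such an interval is the Fisher z-transformation, sketched below. It is an approximation that assumes the data come from a bivariate Normal population, and it is not claimed to be the method required by the specification. The function name fisher_ci and the values r = 0.1, n = 20 and n = 1000 are illustrative assumptions.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def fisher_ci(r, n, confidence=0.95):
    """Approximate confidence interval for rho via the Fisher z-transformation."""
    if n <= 3:
        raise ValueError("need n > 3 for the Fisher approximation")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    z = atanh(r)                      # transform r to the z scale
    standard_error = 1 / sqrt(n - 3)  # approximate standard error of z
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Build the interval on the z scale, then transform back to the r scale
    return tanh(z - z_crit * standard_error), tanh(z + z_crit * standard_error)


# The same weak sample correlation at two different sample sizes
for n in (20, 1000):
    low, high = fisher_ci(0.1, n)
    verdict = "includes 0" if low <= 0 <= high else "excludes 0"
    print(f"n = {n}: ({low:.3f}, {high:.3f}) {verdict}")
```

For the larger sample the interval is narrow and excludes 0, yet every value in it is small, which is the sense in which a result can be statistically significant without being a strong correlation.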

Power and Sample Size in Association Tests

A larger sample size increases the power of the correlation test, i.e., the probability of correctly rejecting a false null hypothesis. This is because a larger sample reduces the standard error of r, so a given true correlation is more likely to produce a test statistic beyond the critical value.
Conversely, a small sample size may fail to detect a true correlation due to low power (higher chance of a Type II error).
As with any statistical test, both the statistical significance and the practical importance of the result should be considered. A statistically significant correlation may not necessarily be practically important, particularly if the correlation is weak.
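
The link between sample size and power can be illustrated with a rough calculation. The sketch below uses the Fisher z Normal approximation for a two-sided test; it is an approximation, not an exact power calculation, and the function name approx_power and the assumed population correlation of 0.3 are hypothetical choices.

```python
from math import atanh, sqrt
from statistics import NormalDist


def approx_power(rho, n, alpha=0.05):
    """Approximate power of a two-sided test of H0: rho = 0.

    Uses the Fisher z approximation: atanh(r) is roughly Normal with
    mean atanh(rho) and standard deviation 1 / sqrt(n - 3).
    """
    if n <= 3:
        raise ValueError("need n > 3 for the Fisher approximation")
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected size of the test statistic under the true correlation
    shift = abs(atanh(rho)) * sqrt(n - 3)
    # Probability that the statistic lands in either rejection region
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Power to detect an assumed true correlation of 0.3 at several sample sizes
for n in (10, 20, 50, 84, 200):
    print(f"n = {n}: power is roughly {approx_power(0.3, n):.2f}")
```

When rho is 0 the function returns the significance level itself, as expected, and for any non-zero rho the power rises towards 1 as n grows.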