# Informal Hypothesis Testing for Correlation/Association

**Understanding Correlation**

- **Correlation** measures the strength and direction of a linear relationship between two variables.
- It is measured on a scale from -1 to 1, with -1 indicating a perfect negative correlation, 1 indicating a perfect positive correlation, and 0 indicating no correlation.

**Sample Correlation Coefficient (r)**

- The **Sample Correlation Coefficient (r)** assesses the degree of linear association between two variables, X and Y, in a sample.
- The value of r ranges from -1 to 1, with -1 showing perfect negative correlation, 0 showing no correlation, and 1 indicating perfect positive correlation.
- High absolute values of r (close to 1 or -1) suggest a strong association.
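
To make the definition concrete, here is a minimal sketch that computes r directly from paired data using only the Python standard library. The function name `pearson_r` and the small `hours`/`score` data set are illustrative assumptions, not part of the notes.

```python
import math

def pearson_r(x, y):
    """Sample correlation coefficient r for paired observations x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data: hours revised vs. test score
hours = [1, 2, 3, 4, 5, 6]
score = [52, 55, 61, 60, 68, 71]
print(round(pearson_r(hours, score), 3))
```

A value near 1 or -1 from this function corresponds to the "strong association" case described above; a value near 0 suggests little linear association.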

**Null and Alternative Hypotheses for Correlation**

- The **Null Hypothesis (H₀)** in testing correlation assumes that there is no association between the two variables, i.e., the population correlation coefficient (ρ) is 0.
- The **Alternative Hypothesis (H₁)** suggests that there is an association, i.e., the population correlation coefficient is not 0. This is a non-directional (two-sided) test.
- When using a directional (one-sided) test, the alternative hypothesis could state that the correlation is less than 0 (negative correlation) or more than 0 (positive correlation).
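
Written in symbols, the hypotheses described above take the following form (the layout is only a summary of the bullets, with ρ the population correlation coefficient):

```latex
\begin{aligned}
\text{Two-sided:} \quad & H_0 : \rho = 0 \quad \text{vs.} \quad H_1 : \rho \neq 0 \\
\text{One-sided (negative):} \quad & H_0 : \rho = 0 \quad \text{vs.} \quad H_1 : \rho < 0 \\
\text{One-sided (positive):} \quad & H_0 : \rho = 0 \quad \text{vs.} \quad H_1 : \rho > 0
\end{aligned}
```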

**Significance Level and Decision Rule**

- The **Significance Level** (commonly 0.05) is the probability of incorrectly rejecting the null hypothesis if it is true.
- The decision to reject or not reject the null hypothesis is based on the observed value of r and its corresponding **p-value** compared to the significance level.
- If the p-value is less than or equal to the significance level, the null hypothesis is rejected in favour of the alternative hypothesis, suggesting a statistically significant correlation.
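
The notes do not say how the p-value is obtained. One informal option, shown here purely as an assumption, is a permutation test: shuffle one variable many times to break any association, and count how often the shuffled |r| is at least as large as the observed |r|. The function names, the number of shuffles, the seed, and the data are all hypothetical choices for this sketch.

```python
import math
import random

def pearson_r(x, y):
    """Sample correlation coefficient r for paired observations x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(x, y, n_shuffles=5000, seed=0):
    """Two-sided permutation p-value for H0: no association between x and y."""
    rng = random.Random(seed)          # fixed seed so the result is repeatable
    r_obs = pearson_r(x, y)
    y_shuffled = list(y)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(y_shuffled)        # break any pairing between x and y
        # Small tolerance so floating-point ties count as "at least as extreme"
        if abs(pearson_r(x, y_shuffled)) >= abs(r_obs) - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above 0
    return r_obs, (extreme + 1) / (n_shuffles + 1)

# Made-up illustrative data
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]

alpha = 0.05                           # significance level
r_obs, p_value = permutation_p_value(x, y)
reject_h0 = p_value <= alpha           # decision rule from the notes
print(r_obs, p_value, reject_h0)
```

The last three lines apply the decision rule exactly as stated: reject the null hypothesis when the p-value is less than or equal to the significance level.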

**Association in Scatter Plots**

- Evidence of a potential association between two variables can initially be checked using a **scatter plot**.
- Points scattered randomly suggest no correlation. A clear upward trend is indicative of positive correlation, while a clear downward trend suggests negative correlation.
- However, note that correlation does not imply causation and that scatter plots do not prove a cause-and-effect relationship.

**Testing for Correlation with Large Sample Sizes**

- With large sample sizes, a weak or modest correlation coefficient can still achieve statistical significance.
- For this reason, it is crucial to also assess the **confidence interval** for the population correlation coefficient, which provides a range of plausible values.
- A confidence interval that includes 0 indicates that the null hypothesis of no correlation cannot be rejected at the matching significance level (for example, 0.05 for a 95% interval).
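
The notes do not name a method for the interval. A common approximation, assumed here, is the Fisher z-transformation, which treats atanh(r) as roughly normal with standard error 1/√(n − 3). The function name `fisher_ci` and the example values of r and n are illustrative.

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, confidence=0.95):
    """Approximate confidence interval for rho via the Fisher z-transformation."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r)                       # transform r to the z scale
    se = 1 / math.sqrt(n - 3)               # approximate standard error of z
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Build the interval on the z scale, then transform back to the r scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Same weak correlation, two different sample sizes (made-up values)
large_lo, large_hi = fisher_ci(0.1, 2000)
small_lo, small_hi = fisher_ci(0.1, 50)
print("n=2000:", large_lo, large_hi)
print("n=50:  ", small_lo, small_hi)
```

With the large sample the interval excludes 0 even though r = 0.1 is weak, while with the small sample it includes 0. That is the point made above: significance in a large sample says little about the size of the correlation, which the interval makes visible.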

**Power and Sample Size in Association Tests**

- A larger sample size increases the **power** of the correlation test, i.e., the probability of correctly rejecting a false null hypothesis. This is because a larger sample reduces the standard error, so a given non-zero correlation produces a larger test statistic.
- Conversely, a small sample size may fail to detect a true correlation due to low power (higher chance of a Type II error).
- As with any statistical test, both the statistical significance and the practical importance of the result should be considered. A statistically significant correlation may not necessarily be practically important, particularly if the correlation is weak.
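
The link between sample size and power can be sketched numerically. The approximation below is an assumption of this example rather than something stated in the notes: it uses the Fisher z-transformation, under which the test statistic is roughly normal with mean atanh(ρ)·√(n − 3) and unit variance. The function name `approx_power` and the chosen ρ and n values are illustrative.

```python
import math
from statistics import NormalDist

def approx_power(rho, n, alpha=0.05):
    """Approximate power of a two-sided test of H0: rho = 0 (Fisher z approximation)."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)             # two-sided critical value
    shift = abs(math.atanh(rho)) * math.sqrt(n - 3)  # mean of the test statistic
    # Probability that the statistic lands in either rejection region
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)

# Power to detect a true correlation of 0.3 at several sample sizes
for n in (20, 50, 100, 200):
    print(n, round(approx_power(0.3, n), 3))
```

When ρ = 0 the function returns the significance level itself, which is a useful sanity check: with no true correlation, the chance of rejecting is just the Type I error rate.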