Hypothesis tests
Hypothesis Testing: An Overview
-
Hypothesis testing is a statistical method used to make inferences or draw conclusions about a population based on a sample of data.
-
It involves formulating a null hypothesis and an alternative hypothesis. The null hypothesis assumes no effect, while the alternative hypothesis assumes that an effect exists.
-
A test statistic is calculated from the sample data. The p-value is the probability, assuming the null hypothesis is true, of observing a test statistic at least as extreme as the one obtained.
-
If the p-value is less than or equal to the significance level, reject the null hypothesis. If the p-value is greater, fail to reject the null hypothesis.
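The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a general-purpose routine: it uses a two-sided z-test, which assumes the population standard deviation is known, and the sample values, `mu0`, `sigma` and the function names are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test_two_sided(sample, mu0, sigma):
    """Return (z, p_value) for H0: population mean == mu0, with sigma assumed known."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Two-sided p-value: probability, under H0, of a statistic at least as extreme as z.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when p <= alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


# Hypothetical measurements; H0 says the population mean is 5.0.
sample = [5.1, 4.9, 5.6, 5.8, 5.3, 5.7, 5.4, 5.5]
z, p = z_test_two_sided(sample, mu0=5.0, sigma=0.4)
print(f"z = {z:.3f}, p = {p:.4f}, decision: {decide(p)}")
```

Note that "fail to reject" is deliberately not phrased as "accept": a large p-value means the data are compatible with the null hypothesis, not that the null hypothesis has been shown to be true.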
T-Distribution
-
Created by William Sealy Gosset (under the pseudonym ‘Student’), the t-distribution is a probability distribution frequently used for hypothesis testing when the sample size is small and/or the population variance is unknown.
-
It is similar to the normal distribution but has heavier tails, which reflect the extra uncertainty of estimating the variance from a small sample.
-
The shape of a t-distribution is dependent on its degrees of freedom. As degrees of freedom increase, the t-distribution approaches a normal distribution.
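To make the heavier tails and the convergence to the normal distribution concrete, the sketch below evaluates the t-distribution's density at a point in the tail for several degrees of freedom and compares it with the standard normal density. The function name `t_pdf` and the evaluation point 3.0 are illustrative choices; the density formula is the standard one for Student's t.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom at x."""
    # Work in log space so the gamma functions do not overflow for large df.
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))


x = 3.0  # a point well out in the tail
normal_density = NormalDist().pdf(x)
tail_densities = {df: t_pdf(x, df) for df in (2, 10, 100, 1000)}

for df, density in tail_densities.items():
    print(f"df = {df:>4}: t density at {x} = {density:.5f}")
print(f"standard normal density at {x} = {normal_density:.5f}")
```

The t densities at this point are all larger than the normal density (heavier tails) and shrink towards it as the degrees of freedom grow.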
T-Tests
-
T-tests are statistical hypothesis tests that use the t-distribution. The t statistic is calculated from the sample data and compared to critical values.
-
Closely related to the z-test, but used when the population variance is unknown and must be estimated from the sample, which matters most when the sample size is small.
-
Types of T-tests: One-sample t-test (comparing sample mean to a known value), Two-sample t-test (comparing two independent samples), Paired t-test (comparing paired measurements).
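As a rough sketch of how the three t statistics are computed, the functions below use only Python's standard library. The function names and data are hypothetical, the two-sample version is the pooled-variance form (it assumes equal population variances), and the critical value is the tabulated two-sided 5% value for 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev, variance


def one_sample_t(sample, mu0):
    """t statistic and degrees of freedom for H0: population mean == mu0."""
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
    return t, n - 1


def two_sample_t(a, b):
    """Pooled-variance t statistic for two independent samples."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


def paired_t(before, after):
    """Paired t-test: a one-sample t-test on the pairwise differences against 0."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [y - x for x, y in zip(before, after)]
    return one_sample_t(diffs, 0.0)


# Hypothetical paired measurements on the same five subjects.
before = [72, 75, 68, 80, 77]
after = [74, 78, 69, 83, 80]
t_paired, df_paired = paired_t(before, after)

CRITICAL_T_DF4 = 2.776  # two-sided, alpha = 0.05, df = 4, from a standard t table
print(f"t = {t_paired:.3f}, df = {df_paired}, reject H0: {abs(t_paired) > CRITICAL_T_DF4}")
```

The paired test reuses `one_sample_t` because pairing reduces the problem to a single sample of differences, which is why it is usually more sensitive than treating the two columns as independent samples.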
Assumptions of T-tests
-
Normality: Data should approximately follow a normal distribution. This assumption can be relaxed for larger sample sizes because of the Central Limit Theorem.
-
Independence: Observations in the sample must be independent of each other.
-
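Homoscedasticity: For the standard (pooled-variance) two-sample t-test, the variances of the two populations being compared should be equal. When they may differ, Welch's t-test, which does not assume equal variances, can be used instead. This assumption does not apply to one-sample t-tests.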
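When the equal-variance assumption is doubtful, one common alternative is Welch's t-test, which does not pool the variances. The sketch below is an illustration under that swapped-in technique, with a hypothetical function name and made-up data; it computes Welch's t statistic and the Welch-Satterthwaite approximation to the degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom (unequal variances allowed)."""
    na, nb = len(a), len(b)
    # Squared standard error contributed by each sample.
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation; generally not an integer.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Hypothetical samples with visibly different spreads.
tight = [10, 12, 11, 13, 9]
wide = [5, 20, 1, 25, 14]
t_welch, df_welch = welch_t(tight, wide)
print(f"Welch t = {t_welch:.3f}, approximate df = {df_welch:.2f}")
```

The approximate degrees of freedom never exceed the pooled value of `len(a) + len(b) - 2`, so the test becomes more conservative as the two sample variances diverge.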
Caution with T-tests
-
If assumptions are violated, t-tests can lead to inaccurate conclusions. Non-parametric tests can be an alternative when assumptions are not met.
-
A t-test does not indicate the magnitude of a difference, only whether a statistically significant difference exists. An effect size should be reported alongside the test result.
-
If multiple t-tests are conducted, the probability of at least one false positive (the family-wise error rate) increases. Techniques like the Bonferroni correction may be used to control for this.
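The last two cautions can be illustrated with a short sketch. It assumes Cohen's d as the effect size (one common choice among several), assumes the tests are independent when computing the chance of at least one false positive, and uses hypothetical function names and p-values.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_sd = sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled_sd


def family_wise_error_rate(alpha, m):
    """P(at least one false positive) across m independent tests, each at level alpha."""
    return 1 - (1 - alpha) ** m


def bonferroni_reject(p_values, alpha=0.05):
    """Bonferroni correction: compare each p-value with alpha divided by the number of tests."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


print(f"Cohen's d: {cohens_d([2, 3, 4], [1, 2, 3]):.2f}")
print(f"Chance of a false positive over 10 tests at 0.05: {family_wise_error_rate(0.05, 10):.3f}")

p_values = [0.001, 0.02, 0.04]  # hypothetical results from three t-tests
print(f"Rejected after Bonferroni: {bonferroni_reject(p_values)}")
```

With three tests, each p-value is compared with 0.05 / 3, so results that would pass an uncorrected 0.05 threshold may no longer count as significant.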