Statistical Hypothesis Testing
Understanding Statistical Hypothesis Testing
-
Statistical hypothesis testing is a method for making decisions or drawing conclusions about populations from sample data.
-
It involves stating two competing hypotheses about a population: the null hypothesis (H0) and the alternative hypothesis (H1).
-
The null hypothesis (H0) is the hypothesis that there is no significant difference between specified populations, any observed difference being due to sampling or experimental error.
-
The alternative hypothesis (H1 or Ha) is the hypothesis that contradicts the null hypothesis. It usually states that the observations reflect a real effect in addition to chance variation.
Types of Tests
-
There are different types of statistical tests depending on the characteristics of the data and the question being asked: t-tests, chi-squared tests, ANOVA, etc.
-
A one-tailed test is a statistical hypothesis test in which the values that reject the null hypothesis are located entirely in one tail of the probability distribution.
-
A two-tailed test is a statistical hypothesis test in which the rejection region is split between both tails of the probability distribution, so that extreme values in either direction reject the null hypothesis.
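As a minimal sketch of the difference between the two, the snippet below converts a z statistic into one-tailed and two-tailed p-values using Python's standard library. The function name `p_values_from_z` and the example value z = 1.8 are illustrative assumptions, and the test statistic is assumed to follow a standard normal distribution under H0.

```python
from statistics import NormalDist

def p_values_from_z(z):
    """Return (upper-tail, lower-tail, two-tailed) p-values for a z statistic.

    Assumes the statistic follows a standard normal distribution under H0.
    """
    std_normal = NormalDist()            # mean 0, standard deviation 1
    upper = 1.0 - std_normal.cdf(z)      # one-tailed, H1: parameter is greater
    lower = std_normal.cdf(z)            # one-tailed, H1: parameter is smaller
    two_sided = 2.0 * min(upper, lower)  # two-tailed, H1: parameter differs
    return upper, lower, two_sided

upper, lower, two_sided = p_values_from_z(1.8)
print(f"upper-tail p = {upper:.4f}")
print(f"two-tailed p = {two_sided:.4f}")
```

With z = 1.8 the upper-tail p-value falls below 0.05 while the two-tailed p-value does not, so the choice of test can change the decision. That choice should be made before looking at the data.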
Test Statistics
-
The test statistic is a single value computed from the sample data that summarises them in a standardised form, so that a decision can be made about the null hypothesis. The formula used depends on the type of test.
-
The p-value is the probability, assuming the null hypothesis is true, of obtaining a result at least as extreme as the one actually observed.
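To make the two ideas concrete, here is a minimal sketch of a two-tailed one-sample z-test. The function name `one_sample_z_test`, the sample values, and the assumption that the population standard deviation `sigma` is known are all illustrative choices, not part of the notes above.

```python
from math import sqrt
from statistics import NormalDist, mean

def one_sample_z_test(sample, mu0, sigma):
    """Two-tailed one-sample z-test.

    mu0   : population mean stated by H0
    sigma : population standard deviation, assumed known (a simplification)
    """
    n = len(sample)
    # Test statistic: distance of the sample mean from mu0 in standard errors
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # p-value: probability under H0 of a statistic at least this extreme
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value

# Made-up measurements, for illustration only
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 5.3, 5.7, 5.4]
z, p_value = one_sample_z_test(sample, mu0=5.0, sigma=0.5)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

When the population standard deviation is unknown and estimated from the sample, a t-test would normally be used instead.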
Decision Making
-
If the p-value is below a threshold chosen in advance (commonly 0.05, i.e. 5%), one rejects the null hypothesis. This threshold is known as the significance level (α).
-
A Type I error occurs when the null hypothesis is true but is rejected. It is often called a “false positive”, and its probability is the significance level α.
-
A Type II error, on the other hand, occurs when the null hypothesis is false but is not rejected. It is often called a “false negative”, and its probability is denoted β.
-
The power of a test describes the balance between Type I and Type II errors. It is the probability that the test rejects a false null hypothesis, which equals 1 − β.
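The relationship between α, β and power can be sketched for a simple case. The example below assumes an upper-tailed one-sample z-test with a known standard deviation. The function name `z_test_power` and the effect size of 0.5 are hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of an upper-tailed one-sample z-test.

    effect : true mean minus the mean under H0 (an assumed value)
    sigma  : population standard deviation, assumed known
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha)  # reject H0 when z > z_crit
    shift = effect / (sigma / sqrt(n))        # mean of z when H1 is true
    beta = std_normal.cdf(z_crit - shift)     # P(not rejecting | H0 false)
    return 1.0 - beta                         # power = 1 - beta

# Power grows with sample size for a fixed effect and fixed alpha
for n in (10, 30, 100):
    print(n, round(z_test_power(effect=0.5, sigma=1.0, n=n), 3))
```

Setting `effect=0` returns α itself, because rejecting a true null hypothesis is exactly a Type I error. Lowering α while keeping the sample size fixed reduces power, which is the trade-off between the two error types.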
Confidence Intervals
-
Confidence intervals are another method used to infer the value of a population parameter. An interval estimate provides more information about a population characteristic than the point estimate (single figure) of that characteristic.
-
Once you have a sample statistic, you can construct a confidence interval to describe the uncertainty in your estimate. Confidence intervals are often reported alongside point estimates, for example in opinion polls.
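As a rough sketch of the opinion-poll case, the snippet below computes a normal-approximation (Wald) confidence interval for a proportion. The function name `proportion_confidence_interval` and the poll figures are hypothetical, and the approximation assumes a reasonably large sample with a proportion not too close to 0 or 1.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n                                   # point estimate
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2)  # about 1.96 for 95%
    margin = z * sqrt(p_hat * (1.0 - p_hat) / n)            # margin of error
    return p_hat - margin, p_hat + margin

# Hypothetical poll: 520 of 1000 respondents favour a proposal
low, high = proportion_confidence_interval(520, 1000)
print(f"point estimate 0.520, 95% CI ({low:.3f}, {high:.3f})")
```

Here the interval contains 0.5, so the point estimate of 52% on its own would overstate how clearly the poll shows a majority.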
Interpreting Results
-
Understanding the subtleties of interpreting results is key. A statistically significant result (p-value < 0.05) does not necessarily imply practical significance: with a large enough sample, even a very small effect can be statistically significant.
-
Similarly, a result that is not statistically significant (p-value ≥ 0.05) does not mean the null hypothesis is true. It only means there is insufficient evidence to support the alternative hypothesis.
-
Care needs to be taken when relating the outcomes of statistical hypothesis testing to real-world situations. Always interpret test results within the context provided.
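The gap between statistical and practical significance can be illustrated with a small sketch. The helper `z_and_p` and all the numbers below are made up for illustration, and a two-tailed z-test with known standard deviation is assumed.

```python
from math import sqrt
from statistics import NormalDist

def z_and_p(mean_diff, sigma, n):
    """Two-tailed z statistic and p-value for an observed mean difference from H0."""
    z = mean_diff / (sigma / sqrt(n))
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value

sigma = 1.0
# Tiny effect, very large sample: significant, but the effect is 0.01 sd
z_big, p_big = z_and_p(mean_diff=0.01, sigma=sigma, n=100_000)
# Moderate effect, tiny sample: not significant, yet H0 is not shown to be true
z_small, p_small = z_and_p(mean_diff=0.5, sigma=sigma, n=9)

print(f"n=100000: effect = {0.01 / sigma:.2f} sd, p = {p_big:.4f}")
print(f"n=9:      effect = {0.5 / sigma:.2f} sd, p = {p_small:.4f}")
```

The first case rejects the null hypothesis at the 5% level despite a negligible effect, while the second fails to reject it despite a much larger one. Reporting an effect size or confidence interval alongside the p-value helps avoid both misreadings.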