# Chi-Squared Tests: Contingency Tables

## Chi-Squared Tests

• The Chi-squared test is a statistical hypothesis test used to determine if there is a significant association between two categorical variables.
• The Chi-squared distribution has one parameter: the degrees of freedom, denoted by ‘df’ or ‘ν’ (nu). df is the number of independent pieces of information that go into the computation of the statistic. For a contingency table with r rows and c columns, df = (r − 1) × (c − 1).

## Contingency Tables

• A Contingency Table, also known as a cross tabulation or crosstab, is a type of table in a matrix format that displays the frequency distribution of the variables.
• In the context of chi-squared tests, these tables usually display the distribution of two categorical variables.
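
As a minimal sketch of how such a table can be built from raw data, the example below tallies paired observations into a two-way table of counts. The data, the labels and the helper name `contingency_table` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical paired observations: (group, answer) for each respondent.
observations = [
    ("A", "yes"), ("A", "yes"), ("A", "no"),
    ("B", "yes"), ("B", "no"), ("B", "no"), ("B", "no"),
]


def contingency_table(pairs):
    """Return (row_labels, col_labels, table), where table[i][j] is the
    number of observations with row_labels[i] and col_labels[j]."""
    counts = Counter(pairs)
    row_labels = sorted({row for row, _ in pairs})
    col_labels = sorted({col for _, col in pairs})
    # Counter returns 0 for combinations that never occur.
    table = [[counts[(r, c)] for c in col_labels] for r in row_labels]
    return row_labels, col_labels, table


rows, cols, table = contingency_table(observations)
print(rows, cols, table)
```

Each inner list is one row of the table, so `table[0]` holds the counts for group "A" across the answer categories.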

## Null and Alternative Hypotheses

• The null hypothesis, denoted by H0, assumes that there is no association between the two categorical variables (they are independent).
• The alternative hypothesis, denoted by H1 or Ha, contends that there is an association between the two categorical variables (they are not independent).

## Chi-squared Statistic

• The Chi-squared statistic is calculated from the observed and expected frequencies in the contingency table.
• The expected frequency of each cell in the contingency table is (row total × column total) / (grand total), calculated under the assumption that the null hypothesis is true.
• The Chi-squared statistic, denoted by χ², is the sum of (Observed frequency − Expected frequency)² / Expected frequency over all cells in the table.
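
The two formulas above can be sketched in a few lines of plain Python. The observed counts and the function names here are assumptions made up for the example, not part of any particular library.

```python
def expected_frequencies(observed):
    """Expected count per cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared_statistic(observed):
    """Sum of (observed - expected)^2 / expected over every cell."""
    expected = expected_frequencies(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical 2 x 2 table of observed frequencies.
observed = [[30, 20], [20, 30]]

print(expected_frequencies(observed))
print(chi_squared_statistic(observed))
```

In this made-up table every row and column total is 50 out of 100, so each expected frequency is 25 and each cell contributes (5)² / 25 = 1 to the statistic.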

## Applying the Chi-squared Test

• Before testing, check that the expected frequency in every cell of the contingency table is greater than or equal to 5 (this is the condition for performing a Chi-squared test).
• Compare the computed Chi-squared statistic with the critical value from the Chi-squared distribution table, read off at the chosen significance level and the appropriate degrees of freedom.
• If the computed Chi-squared statistic is greater than the critical value, we reject the null hypothesis. There is evidence of a significant association between the variables.
• If the computed Chi-squared statistic is less than or equal to the critical value, we do not reject the null hypothesis. There is insufficient evidence of an association; this does not prove that the variables are independent.
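
A rough sketch of the whole decision procedure is given below. It assumes a 5% significance level, takes the degrees of freedom as (rows − 1) × (columns − 1), and hard-codes a few critical values as they appear in standard Chi-squared tables. The function name `chi_squared_test` and the observed counts are hypothetical.

```python
# Critical values of the Chi-squared distribution at the 5% significance
# level, copied from a standard table (only df = 1 to 4 are included).
CRITICAL_VALUES_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def chi_squared_test(observed):
    """Return (statistic, df, critical_value, reject_h0) at the 5% level."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Condition for the test: every expected frequency is at least 5.
    if any(e < 5 for row in expected for e in row):
        raise ValueError("every expected frequency must be at least 5")

    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    critical_value = CRITICAL_VALUES_5_PERCENT[df]
    # Reject H0 only when the statistic exceeds the critical value.
    return statistic, df, critical_value, statistic > critical_value


# Hypothetical 2 x 2 table of observed frequencies.
statistic, df, critical_value, reject_h0 = chi_squared_test([[30, 20], [20, 30]])
print(statistic, df, critical_value, reject_h0)
```

For this made-up table the statistic is 4.0 with 1 degree of freedom, which exceeds the tabulated 3.841, so the null hypothesis would be rejected at the 5% level.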

## Apply the Test Cautiously

• Remember that a significant result in the Chi-squared test does not reveal the nature or strength of the association between the variables.
• It’s a good practice to further explore the relationship between variables using other statistical techniques after observing a significant Chi-squared statistic.