Chi Squared Tests: Contingency tables

Chi-Squared Tests

  • The Chi-squared test is a statistical hypothesis test used to determine if there is a significant association between two categorical variables.
  • The Chi-squared distribution has one parameter: the degrees of freedom, denoted by ‘df’ or ‘ν’. df is the number of independent pieces of information that go into the computation of the statistic. For a contingency table with r rows and c columns, df = (r − 1) × (c − 1).

Contingency Tables

  • A Contingency Table, also known as a cross tabulation or crosstab, is a type of table in a matrix format that displays the frequency distribution of the variables.
  • In the context of chi-squared tests, these tables usually display the distribution of two categorical variables.
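
As a rough illustration of how such a table is built, the sketch below tallies paired observations into a two-way table of counts. The data and the helper name `build_contingency_table` are invented for this example and are not part of the notes.

```python
from collections import Counter

# Hypothetical survey responses as (group, answer) pairs, made up for illustration.
responses = [
    ("A", "yes"), ("A", "no"), ("A", "yes"),
    ("B", "no"), ("B", "no"), ("B", "yes"),
    ("A", "yes"), ("B", "no"),
]

def build_contingency_table(pairs):
    """Return (row_labels, col_labels, table), where table[i][j] is the
    number of observations with row category i and column category j."""
    counts = Counter(pairs)
    row_labels = sorted({row for row, _ in pairs})
    col_labels = sorted({col for _, col in pairs})
    table = [[counts[(row, col)] for col in col_labels] for row in row_labels]
    return row_labels, col_labels, table

row_labels, col_labels, table = build_contingency_table(responses)
print("columns:", col_labels)
for label, row in zip(row_labels, table):
    print(label, row)
```

Each row of `table` corresponds to one category of the first variable and each column to one category of the second.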

Null and Alternative Hypotheses

  • The null hypothesis, denoted by H0, assumes that there is no association between the two categorical variables (they are independent).
  • The alternative hypothesis, denoted by H1 or Ha, contends that there is an association between the two categorical variables (they are not independent).

Chi-squared Statistic

  • The Chi-squared statistic is calculated from the observed and expected frequencies in the contingency table.
  • Under the null hypothesis, the expected frequency of each cell in the contingency table is (row total × column total) / (grand total).
  • The Chi-squared statistic, denoted by χ², is the sum of (Observed frequency − Expected frequency)² / Expected frequency over all cells in the table.
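
The two formulas above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation; the function names and the 2 × 2 table of counts are assumptions chosen so that the arithmetic is easy to check by hand.

```python
def expected_frequencies(observed):
    """Expected count per cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_squared_statistic(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_frequencies(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Hypothetical observed counts (rows and columns each total 50).
observed = [[20, 30], [30, 20]]
expected = expected_frequencies(observed)
statistic = chi_squared_statistic(observed)
print(expected)   # every cell is 50 * 50 / 100 = 25.0
print(statistic)  # four cells, each contributing 5**2 / 25 = 1.0
```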

Applying the Chi-squared Test

  • First check that the expected frequency in every cell of the contingency table is greater than or equal to 5 (this is the condition for performing a Chi-squared test). Then compare the computed Chi-squared statistic with the critical value from the Chi-squared distribution table for the chosen significance level and the degrees of freedom.
  • If the computed Chi-squared statistic is greater than the critical value, we reject the null hypothesis. There is evidence of a significant association between the variables.
  • If the computed Chi-squared statistic is less than or equal to the critical value, we do not reject the null hypothesis. There is no significant evidence of an association; this does not prove that the variables are independent.
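
The decision procedure above might be sketched as follows. The 5% significance level, the function name `chi_squared_test`, and the example counts are assumptions made for illustration; the critical values are the standard table entries for 1 to 5 degrees of freedom, so larger tables would need more entries.

```python
# Critical values of the chi-squared distribution at the 5% significance
# level, as read from a standard table (df = 1..5 only).
CRITICAL_VALUES_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

def chi_squared_test(observed):
    """Return (statistic, df, critical_value, reject_null) at the 5% level."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Condition for the test: every expected frequency must be at least 5.
    if any(e < 5 for row in expected for e in row):
        raise ValueError("an expected frequency is below 5")

    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    critical_value = CRITICAL_VALUES_5_PERCENT[df]  # KeyError if df > 5
    return statistic, df, critical_value, statistic > critical_value

# Hypothetical 2 x 2 table of observed counts.
statistic, df, critical_value, reject_null = chi_squared_test([[20, 30], [30, 20]])
print(statistic, df, critical_value, reject_null)
```

For these counts the statistic is 4.0 with df = 1, which exceeds the critical value 3.841, so the null hypothesis is rejected at the 5% level.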

Apply the Test Cautiously

  • Remember that a significant result in the Chi-squared test does not reveal the nature or strength of the association between variables.
  • It is good practice to explore the relationship between the variables further, using other statistical techniques, after observing a significant Chi-squared statistic.
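
One commonly used follow-up, which these notes do not name, is Cramér's V: the square root of χ² / (n × (min(r, c) − 1)), where n is the grand total. The sketch below assumes that measure and uses invented counts to show that the Chi-squared statistic grows with sample size while the strength of association can stay the same.

```python
import math

def chi_squared_statistic(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return sum(
        (o - r * c / grand_total) ** 2 / (r * c / grand_total)
        for row, r in zip(observed, row_totals)
        for o, c in zip(row, col_totals)
    )

def cramers_v(observed):
    """Cramér's V: 0 means no association, 1 means a perfect association."""
    n = sum(sum(row) for row in observed)
    k = min(len(observed), len(observed[0])) - 1
    return math.sqrt(chi_squared_statistic(observed) / (n * k))

# Hypothetical tables: the second has the same proportions, ten times the data.
small = [[20, 30], [30, 20]]
large = [[200, 300], [300, 200]]
print(chi_squared_statistic(small), cramers_v(small))
print(chi_squared_statistic(large), cramers_v(large))
```

The larger table gives a statistic ten times as big, yet both tables have the same Cramér's V, which is why the statistic alone says little about how strong the association is.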