Calculating Correlation

Understanding Correlation

  • Correlation is a statistical measure that indicates the extent to which two or more variables fluctuate together.
  • A positive correlation means that as one variable increases, the other tends to increase, and as one decreases, the other tends to decrease.
  • A negative correlation is the opposite: as one variable increases, the other tends to decrease.
  • Correlation coefficients range from -1 to +1. A value of +1 or -1 indicates a perfect linear association between two variables, and a value of 0 indicates no linear association.

Calculating the Correlation Coefficient

  • The correlation coefficient, often denoted by r, measures the strength and direction of the linear association between two variables.
  • It is calculated using the Pearson Product-Moment Correlation formula.
  • A correlation coefficient near +1 or -1 shows a strong correlation, while a correlation coefficient near 0 shows a weak correlation.
  • To calculate the correlation coefficient, first calculate the covariance of the two variables, then divide by the product of their standard deviations.
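
The calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name pearson_r and the sample data (hours, scores) are made up for this example. It follows r = cov(X, Y) / (s_X * s_Y), using the sample (n - 1) versions of the covariance and standard deviations.

```python
import math

def pearson_r(xs, ys):
    """Pearson r: covariance divided by the product of the standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance and sample standard deviations. Both use n - 1,
    # so the choice of divisor cancels out in the final ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / (sd_x * sd_y)

# Illustrative data only (hypothetical values).
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = pearson_r(hours, scores)
print(round(r, 3))  # 0.775
```

For these values the sum of products of deviations is 6 and the sums of squared deviations are 10 and 6, so r = 6 / sqrt(60), which is about 0.775.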

Interpreting Correlation Coefficients

  • A correlation coefficient between 0 and 0.3 (or between 0 and -0.3) indicates a weak correlation.
  • A correlation coefficient between 0.3 and 0.7 (or between -0.3 and -0.7) indicates a moderate correlation.
  • A correlation coefficient between 0.7 and 1.0 (or between -0.7 and -1.0) indicates a strong correlation.
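
The bands above can be turned into a small helper, sketched here in Python. The function name describe_correlation is hypothetical. Because the stated ranges share the endpoints 0.3 and 0.7, this sketch assumes a boundary value belongs to the stronger band; that choice is an assumption, not something the bands themselves settle.

```python
def describe_correlation(r):
    """Label r using the weak / moderate / strong bands listed above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    if r == 0:
        return "no linear correlation"
    size = abs(r)  # strength depends only on the magnitude
    if size < 0.3:
        strength = "weak"
    elif size < 0.7:  # assumption: exactly 0.3 counts as moderate
        strength = "moderate"
    else:  # assumption: exactly 0.7 counts as strong
        strength = "strong"
    direction = "positive" if r > 0 else "negative"  # the sign gives the direction
    return f"{strength} {direction}"

print(describe_correlation(0.2))   # weak positive
print(describe_correlation(-0.5))  # moderate negative
print(describe_correlation(0.9))   # strong positive
```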

Properties of the Correlation Coefficient

  • The correlation coefficient is symmetric: the correlation between X and Y is the same as the correlation between Y and X.
  • The correlation coefficient is unitless, which allows you to compare correlation coefficients across different pairs of variables.
  • The sign (+/-) of the correlation coefficient represents the direction of the relationship between two variables.
  • Correlation does not imply causation. A high correlation between two variables X and Y does not mean that changes in X cause changes in Y (and vice versa).
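
The first three properties can be checked numerically. The sketch below is illustrative only: the helper pearson_r and the height and weight values are invented for this example. It compares r with the variables swapped, with the units changed (centimetres to metres, kilograms to grams), and with one variable negated.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from sums of deviations (equivalent to cov / (sd_x * sd_y))."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical measurements, for illustration only.
height_cm = [150, 160, 170, 180, 190]
weight_kg = [52, 58, 69, 77, 90]

r_xy = pearson_r(height_cm, weight_kg)
r_yx = pearson_r(weight_kg, height_cm)  # symmetry: swap the variables

# Unitless: changing the measurement units leaves r unchanged.
height_m = [h / 100 for h in height_cm]
weight_g = [w * 1000 for w in weight_kg]
r_rescaled = pearson_r(height_m, weight_g)

# Sign: negating one variable reverses the direction, not the strength.
r_flipped = pearson_r(height_cm, [-w for w in weight_kg])

print(math.isclose(r_xy, r_yx))        # True
print(math.isclose(r_xy, r_rescaled))  # True
print(math.isclose(r_xy, -r_flipped))  # True
```

The last property, that correlation does not imply causation, cannot be demonstrated by a calculation: r is computed identically whether or not any causal link exists.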

Spearman’s Rank Correlation Coefficient

  • When the Pearson Product-Moment Correlation might not reflect the true relationship between two variables, we can use Spearman’s rank correlation coefficient instead.
  • It is used when the relationship between the variables is monotonic but not linear, or when both variables are ranked (ordinal) data.
  • This rank-based coefficient is calculated in the same way as the Pearson coefficient, but on the ranks of the values rather than on the values themselves.
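
As a rough sketch of the rank-based calculation in Python: convert each variable to ranks, then apply the Pearson calculation to the ranks. The names average_ranks and spearman_rho are hypothetical, and giving tied values the average of their ranks is a common convention assumed here, not something stated above.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from sums of deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions (assumed convention)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's coefficient: Pearson r applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# A monotonic but nonlinear relationship: y = x cubed.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(round(pearson_r(x, y), 3))     # 0.943
print(round(spearman_rho(x, y), 3))  # 1.0
```

Here y always increases with x, so the ranks agree perfectly and Spearman's coefficient is 1, while the Pearson coefficient falls below 1 because the relationship is not a straight line.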