Calculating Correlation
Understanding Correlation
- Correlation is a statistical measure that indicates the extent to which two or more variables fluctuate together.
- A positive correlation means that as one variable increases, the other also increases; and as one decreases the other also decreases.
- A negative correlation represents the opposite; when one variable increases, the other decreases.
- Correlation coefficients range from -1 to 1. A value of +1 or -1 indicates a perfect linear association between two variables, and a value of 0 indicates no linear association.
Calculating the Correlation Coefficient
- The correlation coefficient, often denoted by r, is a measure that determines the degree to which two variables’ movements are associated.
- It is calculated using the Pearson Product-Moment Correlation formula.
- A correlation coefficient close to +1 or -1 shows a strong linear correlation, while a correlation coefficient close to 0 shows a weak one.
- To calculate the correlation coefficient, first calculate the covariance of the two variables, then divide by the product of their standard deviations.
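
The calculation described above (covariance divided by the product of the standard deviations) can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `pearson_r` and the sample data `hours` and `scores` are made up for the example, and the sample (n - 1) versions of covariance and standard deviation are assumed.

```python
import math

def pearson_r(xs, ys):
    """Pearson r: covariance divided by the product of the standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance and sample standard deviations (n - 1 denominators).
    # Using n instead gives the same r, because the factor cancels in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / (sd_x * sd_y)

# Illustrative data only (not taken from any real study).
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = pearson_r(hours, scores)
print(round(r, 4))  # about 0.7746
```

For real work, a library routine such as `numpy.corrcoef` is usually preferable to a hand-written loop.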
Interpreting Correlation Coefficients
- As a rough guide (the exact cut-offs vary between subjects), a correlation coefficient between 0 and 0.3 (or 0 and -0.3) shows a weak correlation.
- A correlation coefficient between 0.3 and 0.7 (or -0.3 and -0.7) shows a moderate correlation.
- A correlation coefficient between 0.7 and 1.0 (or -0.7 and -1.0) shows a strong correlation.
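
The bands above can be written as a small helper. This is a sketch only: the name `describe_strength` is hypothetical, and because the bands share their end points, the choice to place exactly 0.3 in "moderate" and exactly 0.7 in "strong" is an assumption made for the example.

```python
def describe_strength(r):
    """Label a correlation coefficient using the weak/moderate/strong bands."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    size = abs(r)  # strength depends on the magnitude, not the sign
    if size < 0.3:
        return "weak"
    if size < 0.7:  # assumption: 0.3 itself counts as moderate
        return "moderate"
    return "strong"  # assumption: 0.7 itself counts as strong

print(describe_strength(0.12))   # weak
print(describe_strength(-0.55))  # moderate
print(describe_strength(0.91))   # strong
```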
Properties of the Correlation Coefficient
- The correlation coefficient is symmetric: the correlation of X with Y is the same as the correlation of Y with X.
- The correlation coefficient is unitless, so it does not change when a variable is converted to different units, and coefficients can be compared across different pairs of variables.
- The sign (+/-) of the correlation coefficient represents the direction of the relationship between two variables.
- Correlation does not imply causation. A high correlation between two variables X and Y does not mean that changes in X cause changes in Y (and vice versa).
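
The first three properties can be checked numerically. The sketch below assumes a hand-written helper `pearson_r` and made-up height and weight figures; it compares the coefficient after swapping the variables, after changing units, and after reversing the direction of one variable.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from sums of squared and cross deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Illustrative values only.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 77, 91]

r_xy = pearson_r(heights_cm, weights_kg)
r_yx = pearson_r(weights_kg, heights_cm)                    # symmetric: same value
heights_m = [h / 100 for h in heights_cm]
r_rescaled = pearson_r(heights_m, weights_kg)               # unitless: cm vs m makes no difference
r_flipped = pearson_r(heights_cm, [-w for w in weights_kg]) # sign shows direction

print(math.isclose(r_xy, r_yx))        # True
print(math.isclose(r_xy, r_rescaled))  # True
print(math.isclose(r_flipped, -r_xy))  # True
```

None of these checks says anything about causation; that property concerns interpretation and cannot be tested from the coefficient alone.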
Spearman’s Rank Correlation Coefficient
- When the standard Pearson Product-Moment Correlation might not reflect the true relationship between two variables, we can use Spearman’s rank correlation coefficient.
- It is used when the relationship between the variables is monotonic (consistently increasing or decreasing) but not linear, or when the data are ranked (ordinal).
- It is calculated by replacing each value with its rank and then applying the standard correlation coefficient formula to the ranks.
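
A sketch of this rank-then-correlate procedure is shown below. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) are hypothetical, and giving tied values the average of their ranks is an assumption (it is a common convention, but the notes above do not specify one).

```python
import math

def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Standard Pearson coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(xs, ys):
    """Spearman's coefficient: Pearson applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# A monotonic but nonlinear relationship: y = x cubed.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(round(pearson_r(x, y), 3))     # about 0.943, below 1 because the curve is not a line
print(round(spearman_rho(x, y), 3))  # 1.0, because the ranks agree perfectly
```

In practice, `scipy.stats.spearmanr` performs the same calculation and also reports a p-value.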