# Bivariate Data, Association and Correlation

Defining Bivariate Data, Association and Correlation

• Bivariate data consists of paired values of two variables recorded for the same individual or item, and can be displayed on a scatterplot.
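• An association exists in bivariate data when the values of one variable tend to change systematically as the values of the other variable change. An association on its own does not show that one variable affects the other.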
• Correlation is a statistical measure of the strength and direction of the relationship between two variables.

Types of Association

• Positive Association: As the value of one variable increases, the value of the other variable also tends to increase.
• Negative Association: As the value of one variable increases, the value of the other variable tends to decrease.
• No Association: There is no apparent relationship between the two variables.

Correlation Coefficient

• The correlation coefficient, denoted by ‘r’, measures the strength and direction of a linear relationship between two variables on a scatterplot.
• It ranges from -1 to +1, where -1 signifies a perfect negative linear correlation, +1 a perfect positive linear correlation, and 0 no linear correlation.
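As a minimal sketch of how r can be computed, the block below uses the standard formula r = Sxy / sqrt(Sxx · Syy), where Sxy, Sxx and Syy are sums of products of deviations from the means. The function name `pearson_r` and the small data sets are illustrative assumptions, not part of the notes above.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient r for two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviations of each value from its own mean
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Made-up example data
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]      # rises in a straight line with x
y_down = [10, 8, 6, 4, 2]    # falls in a straight line with x

x_sym = [-2, -1, 0, 1, 2]
y_curve = [4, 1, 0, 1, 4]    # y = x**2: clearly related, but not linearly

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
r_curve = pearson_r(x_sym, y_curve)
print(r_up, r_down, r_curve)
```

The first two data sets lie exactly on straight lines, so they give the extreme values of r. The third gives r = 0 even though y depends entirely on x, which illustrates that r only measures linear relationships.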

Interpreting the Correlation Coefficient

• A correlation coefficient close to 1 indicates a strong positive linear relationship, while a value close to -1 indicates a strong negative linear relationship.
• A correlation coefficient close to 0 suggests a weak or non-existent linear relationship.
• Note that correlation does not imply causation; even though two variables may be strongly correlated, it does not mean that changes in one variable cause changes in the other.

Spearman’s Rank Correlation Coefficient

• Spearman’s Rank correlation coefficient is calculated from the ranks of the data rather than the raw values. It is a commonly used alternative to the (Pearson) correlation coefficient r when the data doesn’t follow a normal distribution.
• It can detect any monotonic relationship (consistently increasing or consistently decreasing), rather than only linear ones.
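One common way to obtain Spearman’s Rank correlation coefficient is to replace each value by its rank (giving tied values the average of their ranks) and then compute the ordinary correlation coefficient on the ranks. The sketch below follows that approach; the helper names `average_ranks`, `pearson_r` and `spearman_rho` and the data are assumptions made for illustration.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Ordinary correlation coefficient r (assumes neither list is constant)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(xs, ys):
    """Spearman's rank correlation: r computed on the ranks of the data."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Made-up data: y = x**3 always increases with x, but not in a straight line
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

tied_ranks = average_ranks([10, 20, 20, 30])  # the two 20s share rank 2.5
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(tied_ranks, rho, r)
```

Because y always increases as x increases, the ranks match exactly and the rank correlation is 1, while r on the raw values is below 1 because the points do not lie on a straight line.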

Calculating and Interpreting Bivariate Data

• Bivariate data is typically presented in a table or scatterplot, and the correlations between several variables can be summarised in a correlation matrix.
• When calculating correlation, check that the measure suits the data: use the correlation coefficient r for linear relationships and Spearman’s Rank for monotonic relationships.
• Always check for outliers, since even a single extreme point can substantially change the correlation.
• A line of best fit, such as the least squares line or the median-median line, can be drawn on the scatterplot to show the relationship visually.
• Remember to interpret the relationship within the context of the data, considering external influences or explanations.
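To make the points about outliers and the least squares line concrete, the sketch below fits a least squares line and computes r for a small made-up data set, then repeats the calculation after adding one extreme point. The function names and the data are illustrative assumptions; the formulas used are the standard slope = Sxy / Sxx and intercept = mean(y) - slope × mean(x).

```python
import math

def sums_of_squares(xs, ys):
    """Return (mean_x, mean_y, sxx, syy, sxy) for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return mean_x, mean_y, sxx, syy, sxy

def least_squares_line(xs, ys):
    """Slope and intercept of the least squares line y = slope * x + intercept."""
    mean_x, mean_y, sxx, _, sxy = sums_of_squares(xs, ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def correlation(xs, ys):
    """Correlation coefficient r (assumes neither variable is constant)."""
    _, _, sxx, syy, sxy = sums_of_squares(xs, ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data lying exactly on the line y = 2x
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# The same data with one extreme point (6, 0) added
x_out = x + [6]
y_out = y + [0]

slope, intercept = least_squares_line(x, y)
slope_out, intercept_out = least_squares_line(x_out, y_out)
r_clean = correlation(x, y)
r_out = correlation(x_out, y_out)

print(f"without outlier: slope={slope:.2f}, intercept={intercept:.2f}, r={r_clean:.2f}")
print(f"with outlier:    slope={slope_out:.2f}, intercept={intercept_out:.2f}, r={r_out:.2f}")
```

In this constructed example a single added point is enough to turn a perfect linear correlation into a weak one and to flatten the fitted line, which is why a scatterplot should be inspected before relying on r or on a line of best fit.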