# Overview of Bivariate Statistics

• Bivariate statistics analyses the relationship between two variables measured on the same set of observations.
• It describes the strength of the correlation between the two variables, the direction of that correlation, and how the relationship looks when plotted.

# Key Concepts

• Scatterplot: A graph of paired values, one point per observation, which shows the strength and direction of the relationship between two variables.
• Line of Best Fit/Regression Line: A straight line which summarises the trend observed in the scatterplot.
• Correlation coefficient (r, or ρ for a population): A numerical value from -1 to 1 representing the strength and direction of the linear relationship between two variables.

# Drawing Scatter Plots and Lines of Best Fit

• A scatterplot is prepared with the independent variable on the horizontal axis and the dependent variable on the vertical axis.
• Each data point is represented by a dot on the scatterplot.
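• The line of best fit is drawn so that, overall, the plotted points lie as close to it as possible. The least-squares regression line does this by minimizing the sum of the squared vertical distances between the points and the line.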
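
One common way to make "best fit" precise is least squares. The sketch below assumes that definition and uses made-up data (`hours` and `scores` are hypothetical names and values, not taken from any real study) to compute the slope and intercept by hand.

```python
def line_of_best_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: independent variable first, dependent variable second.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

slope, intercept = line_of_best_fit(hours, scores)
print(f"score = {slope:.1f} * hours + {intercept:.1f}")  # slope about 4.1, intercept about 47.7
```

The slope gives the average change in the dependent variable for each unit increase in the independent variable.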

# Calculating the Correlation Coefficient

• Positive correlation: As one variable increases, the other tends to increase as well, giving a correlation coefficient between 0 and 1.
• Negative correlation: As one variable increases, the other tends to decrease, giving a correlation coefficient between 0 and -1.
• A correlation coefficient near 0 suggests a weak or non-existent linear relationship between the variables.
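
As a minimal sketch of the calculation itself, the function below computes Pearson's r from its standard definition: the sum of products of deviations divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and both data sets are illustrative assumptions.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
rising = [52, 55, 61, 64, 68]   # tends to increase with x
falling = [10, 8, 7, 4, 1]      # tends to decrease as x increases

r_pos = pearson_r(x, rising)    # roughly 0.994
r_neg = pearson_r(x, falling)   # roughly -0.984
print(round(r_pos, 3), round(r_neg, 3))
```

The sign of the result matches the direction of the trend, and both values are close to ±1 because each data set lies near a straight line.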

# Interpreting the Correlation Coefficient

• A positive correlation coefficient means the variables tend to increase together, while a negative one means one variable tends to decrease as the other increases. Neither implies strict proportionality.
• The closer the absolute value of the correlation coefficient is to 1, the stronger the linear relationship between the variables.
• A correlation coefficient of 0 signifies no linear relationship between the variables; a non-linear relationship may still be present.
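
The word "linear" matters here. As a hedged illustration with made-up data, the sketch below shows a case where y is completely determined by x (y = x²) and yet r is 0, because the relationship is symmetric rather than straight-line. The helper `pearson_r` is an illustrative name, defined from the standard formula.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


x = [-2, -1, 0, 1, 2]
y = [value ** 2 for value in x]  # perfect quadratic relationship

r_curve = pearson_r(x, y)
print(r_curve)  # 0.0: no linear relationship, despite a clear pattern
```

This is why a scatterplot should be inspected alongside the coefficient before concluding that two variables are unrelated.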

# Types of Correlation

• Perfect positive correlation: A correlation coefficient of 1 where all points lie on a line sloping upwards.
• Perfect negative correlation: A correlation coefficient of -1 where all points lie on a line sloping downwards.
• No correlation: A correlation coefficient of 0 where points are widely scattered and depict no linear pattern.

# Data Transformation

• Data transformation, such as a square root or log transformation, can be applied to one or both variables to make a curved relationship approximately linear.
• Only apply a transformation when the relationship is not linear in its original form; check the scatterplot first.
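
To illustrate, the sketch below uses made-up data that grows exponentially (y = 2 × 3^x, an assumption chosen for the example). Taking the log of y turns the curve into a straight line, and r rises accordingly. The helper `pearson_r` is an illustrative name, defined from the standard formula.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


x = [0, 1, 2, 3, 4]
y = [2 * 3 ** value for value in x]    # 2, 6, 18, 54, 162: curved when plotted
log_y = [math.log(value) for value in y]  # log transformation of the dependent variable

r_raw = pearson_r(x, y)      # roughly 0.87: the curve weakens the linear fit
r_log = pearson_r(x, log_y)  # essentially 1: the transformed data is linear
print(round(r_raw, 2), round(r_log, 2))
```

Any line of best fit found after transforming describes the transformed variable, so predictions need to be converted back (here, by exponentiating) before being interpreted in the original units.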

# Causality

• While bivariate statistics can identify a relationship between two variables, it does not by itself demonstrate cause and effect.
• Be wary of confounding variables, which can make two variables appear directly related when both are in fact influenced by a third.