Bivariate Statistics
Overview of Bivariate Statistics
- Bivariate statistics analyses the relationship between two variables. It is an essential part of statistical methodology.
- It covers the strength of the correlation between the two variables, the direction of that correlation, and how to represent it graphically.
Key Concepts
- Scatterplot: A graph in which each pair of values is plotted as a point, showing the strength and direction of the correlation between two variables.
- Line of Best Fit/Regression Line: A straight line which depicts the trend observed in the scatterplot.
- Correlation coefficient (r) or (rho): A numerical value ranging from -1 to 1 that represents the strength and direction of the linear correlation between two variables.
Drawing Scatterplots and Lines of Best Fit
- A scatterplot is prepared with the independent variable on the horizontal axis and the dependent variable on the vertical axis.
- Each data point is represented by a dot on the scatterplot.
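- The line of best fit is drawn so that it lies as close as possible to all the plotted points. The least-squares regression line does this by minimizing the sum of the squared vertical distances between the points and the line.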
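The fitting step above can be sketched in Python. This is a minimal illustration that assumes the line of best fit is the ordinary least-squares line. The function name `line_of_best_fit` and the `hours` and `scores` data are hypothetical, invented for this example only.

```python
def line_of_best_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared deviations in x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: independent variable first, dependent variable second.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

slope, intercept = line_of_best_fit(hours, scores)
print(f"score = {slope:.2f} * hours + {intercept:.2f}")
# prints: score = 4.10 * hours + 47.70
```

The independent variable is passed first, matching its place on the horizontal axis of the scatterplot.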
Calculating the Correlation Coefficient
- Positive correlation: As one variable increases, the other tends to increase as well, giving a correlation coefficient between 0 and 1.
- Negative correlation: As one variable increases, the other tends to decrease, giving a correlation coefficient between 0 and -1.
- A correlation coefficient near 0 suggests a weak or non-existent linear relationship between the variables.
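As a sketch of the calculation itself, the code below assumes that r is Pearson's product-moment correlation coefficient, computed as r = Sxy / sqrt(Sxx * Syy), where Sxy is the sum of products of deviations from the means and Sxx, Syy are the sums of squared deviations. The function name `pearson_r` and all three data lists are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data for illustration only.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]   # rises as hours rise
errors = [9, 7, 6, 4, 2]        # falls as hours rise

r_positive = pearson_r(hours, scores)
r_negative = pearson_r(hours, errors)
print(f"hours vs scores: r = {r_positive:.3f}")  # positive, close to 1
print(f"hours vs errors: r = {r_negative:.3f}")  # negative, close to -1
```

The sign of the result gives the direction of the correlation, and its distance from 0 gives the strength.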
Interpreting the Correlation Coefficient
- A positive correlation coefficient signifies that the variables tend to increase together, while a negative one shows that one tends to decrease as the other increases. Neither implies that the variables are directly or inversely proportional.
- The closer the absolute value of the correlation coefficient is to 1, the stronger the linear relationship between the variables.
- A correlation coefficient of 0 signifies no linear relationship between the variables.
Types of Correlation
- Perfect positive correlation: A correlation coefficient of 1, where all points lie exactly on a straight line sloping upwards.
- Perfect negative correlation: A correlation coefficient of -1, where all points lie exactly on a straight line sloping downwards.
- No correlation: A correlation coefficient of 0, where the points show no linear pattern, typically because they are widely scattered.
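These cases can be checked numerically. The sketch below assumes Pearson's r and uses three hypothetical data sets: points on an upward line, points on a downward line, and points on a symmetric curve. The curved set is included as an assumption-driven illustration that a coefficient of 0 rules out only a linear trend.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [-2, -1, 0, 1, 2]
upward = [2 * x + 1 for x in xs]     # all points on a line sloping upwards
downward = [-3 * x + 4 for x in xs]  # all points on a line sloping downwards
curved = [x ** 2 for x in xs]        # a clear pattern, but not a linear one

r_up = pearson_r(xs, upward)
r_down = pearson_r(xs, downward)
r_curved = pearson_r(xs, curved)

print(f"upward line:   r = {r_up:.2f}")
print(f"downward line: r = {r_down:.2f}")
print(f"curve y = x^2: r = {r_curved:.2f}")
```

The steepness of the line does not matter: any set of points lying exactly on a sloping straight line gives a coefficient of 1 or -1.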
Data Transformation
- Data transformations, such as a square root or log transformation, can be applied to one or both variables to make a non-linear relationship linear.
- Only apply a transformation when the relationship is not linear in a plot of the original data.
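A log transformation can be sketched as follows. The data are hypothetical and chosen to grow exponentially (y = 3 * 2^x), so the example assumes the ideal case in which taking the logarithm of y makes the relationship exactly linear. Pearson's r is used as the measure of linearity.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical exponential growth: the scatterplot of (xs, ys) is curved.
xs = [0, 1, 2, 3, 4, 5]
ys = [3 * 2 ** x for x in xs]

# Log-transform the dependent variable only.
log_ys = [math.log(y) for y in ys]

r_raw = pearson_r(xs, ys)
r_log = pearson_r(xs, log_ys)
print(f"original data:    r = {r_raw:.3f}")
print(f"after log(y):     r = {r_log:.3f}")
```

With real data the transformed points will rarely fall exactly on a line, so the transformed scatterplot should still be inspected.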
Causality
- While bivariate statistics can identify relationships between variables, a correlation alone does not demonstrate cause and effect.
- Be wary of confounding variables, which can make two variables appear directly related when both are in fact influenced by a third.
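The effect of a confounding variable can be illustrated with made-up numbers. In the sketch below, every value is hypothetical: `temperature` plays the role of the confounder, and `ice_cream_sales` and `sunburn_cases` are each built from temperature plus a small fixed disturbance. Neither is computed from the other, yet the two are strongly correlated.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical confounder and fixed disturbances (not real measurements).
temperature = [14, 17, 20, 23, 26, 29, 32]
sales_noise = [3, -2, 1, -4, 2, -1, 1]
burn_noise = [-1, 2, -2, 1, 0, 1, -1]

# Both variables depend on temperature only, never on each other.
ice_cream_sales = [5 * t + e for t, e in zip(temperature, sales_noise)]
sunburn_cases = [0.5 * t + e for t, e in zip(temperature, burn_noise)]

r_confounded = pearson_r(ice_cream_sales, sunburn_cases)
print(f"ice cream sales vs sunburn cases: r = {r_confounded:.2f}")
```

A high coefficient here reflects the shared dependence on temperature, not an effect of ice cream sales on sunburn.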