Bivariate Statistics

Overview of Bivariate Statistics

  • Bivariate statistics analyses the relationship between two variables. It is an essential part of statistical methodology.
  • It describes the degree (strength) of correlation between the two variables and the direction of that correlation, and represents the relationship graphically.

Key Concepts

  • Scatterplot: A graph that plots each pair of values as a point, showing the degree and direction of correlation between two variables.
  • Line of Best Fit/Regression Line: A straight line which depicts the trend observed in the scatterplot.
  • Correlation coefficient (r) or (rho): A numerical value ranging from -1 to 1 representing the degree of linear correlation between two variables.

Drawing Scatterplots and Lines of Best Fit

  • A scatterplot is prepared with the independent variable on the horizontal axis and the dependent variable on the vertical axis.
  • Each data point is represented by a dot on the scatterplot.
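  • The line of best fit is drawn so that the overall distance between the line and the plotted points is as small as possible. The least-squares regression line does this by minimizing the sum of the squared vertical distances from the points to the line.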
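
The fitting step described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the least-squares definition of "best fit". The function name least_squares_line and the data (hours studied against test score) are made up for the example.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations (sxy) and sum of squared x-deviations (sxx).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    # The least-squares line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: independent variable first, dependent variable second.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

slope, intercept = least_squares_line(hours, scores)
print(f"score = {slope:.2f} * hours + {intercept:.2f}")
# prints: score = 4.10 * hours + 47.70
```

Plotting the points (hours, scores) and then drawing this line over them gives the scatterplot with its line of best fit.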

Calculating the Correlation Coefficient

  • Positive correlation: As one variable increases, the other tends to increase as well, giving a correlation coefficient between 0 and 1.
  • Negative correlation: As one variable increases, the other tends to decrease, giving a correlation coefficient between -1 and 0.
  • A correlation coefficient near 0 suggests a weak or non-existent linear relationship between the variables.
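
The notes do not give a formula, so the sketch below assumes that the coefficient meant is Pearson's r: the sum of cross-deviations divided by the square root of the product of the two sums of squared deviations. The function name pearson_r and the data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: scores rise with hours, so r should be positive.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]
r = pearson_r(hours, scores)

# Reversing the scores makes them fall as hours rise, so r changes sign.
r_reversed = pearson_r(hours, scores[::-1])

print(round(r, 3), round(r_reversed, 3))
# prints: 0.994 -0.994
```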

Interpreting the Correlation Coefficient

  • A positive correlation coefficient signifies that the variables tend to increase together, while a negative one shows that one tends to decrease as the other increases. Neither implies strict proportionality.
  • The closer the absolute value of the correlation coefficient is to 1, the stronger the linear relationship between the variables.
  • A correlation coefficient of 0 signifies no linear relationship between the variables.

Types of Correlation

  • Perfect positive correlation: A correlation coefficient of 1, where all points lie on a straight line sloping upwards.
  • Perfect negative correlation: A correlation coefficient of -1, where all points lie on a straight line sloping downwards.
  • No correlation: A correlation coefficient of 0, where the points are widely scattered and show no linear pattern.

Data Transformation

  • Data transformation, such as a square root or log transformation, can be applied to one or both variables to make a non-linear relationship approximately linear.
  • Only apply a transformation when the relationship is not linear when the data are plotted in their original form.

Causality

  • While bivariate statistics can identify relationships between variables, they do not necessarily demonstrate cause and effect.
  • Be wary of confounding variables: a third variable that influences both of the variables being studied can make them appear directly related when they are not.