# Scattergraphs

## Introduction to Scattergraphs

**Scattergraphs**, also known as scatter plots or scatter diagrams, are used to investigate the relationship between two numerical variables.

- Each point on the scattergraph represents one observation from the data set, and its position along the X and Y axes represents its values for the two variables.
- Scattergraphs can be used to visually check for trends, correlations, and outliers in your data.
- The points are not joined to one another with a line or curve, because each point is a separate observation and the variables need not follow any exact line or curve. The overall pattern of the points is what we are interested in.

## Creating a Scattergraph

- When creating a scattergraph, typically the **independent variable** is on the X-axis and the **dependent variable** is on the Y-axis.
- Make sure to clearly label your axes with the variables they represent, and include a scale.
- Start by plotting each data point on the graph, marking the intersection of the values for the two variables.
- You can then visually inspect the graph to look for patterns among the points.
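
The plotting steps above can be sketched in a few lines of Python. This is only an illustration, not a replacement for graph paper or a plotting library: the function name `text_scatter`, the grid size, and the revision data (hours studied against test score) are all made up for the example. It assumes the values of each variable are not all identical, otherwise the scaling would divide by zero.

```python
def text_scatter(xs, ys, width=20, height=10):
    """Render (x, y) pairs as a character grid, with y increasing upwards."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column (X-axis) and a row (Y-axis) on the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        # Row 0 of the grid is the top line, so flip the row index.
        grid[height - 1 - row][col] = "*"
    body = "\n".join("|" + "".join(line) for line in grid)
    return body + "\n+" + "-" * width


# Hypothetical data: independent variable on X, dependent variable on Y.
hours = [1, 2, 3, 4, 5]          # hours of revision
scores = [52, 58, 61, 70, 75]    # test score

plot = text_scatter(hours, scores)
print(plot)
```

Each `*` is one observation, placed where its X value and Y value meet. No points are joined up; the overall drift from bottom left to top right is the pattern to look for.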

## Understanding Scattergraphs

- Scattergraphs give a visual representation of the correlation between two variables.
- A **positive correlation** exists when, as one variable increases, so does the other. The points will tend to go from the bottom left of the graph to the top right.
- A **negative correlation** exists when, as one variable increases, the other decreases. The points will tend to go from the top left of the graph to the bottom right.
- There is **no correlation** when the points show no relationship between the variables. The points will be scattered randomly on the graph.
- Scattergraphs cannot prove causation, only correlation. Just because two variables move together, it doesn't mean that one causes the other to change.
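
The three cases above can also be checked with a number. One common measure, which these notes do not otherwise cover, is Pearson's correlation coefficient `r`: it lies between -1 and 1, is positive for a positive correlation, negative for a negative correlation, and close to 0 when there is no linear correlation. The sketch below is a minimal version under that assumption; the function name `pearson_r` and all the data are invented for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables vary together, and how each varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
r_positive = pearson_r(xs, [52, 58, 61, 70, 75])   # rises with x
r_negative = pearson_r(xs, [75, 70, 61, 58, 52])   # falls as x rises
r_none = pearson_r(xs, [3, 1, 4, 1, 3])            # no overall direction

print(r_positive, r_negative, r_none)
```

A value of `r` near 1 or -1 still says nothing about cause: it only describes how closely the points follow a straight-line trend.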

## Lines of Best Fit

- A **line of best fit** can be drawn on a scattergraph to show the direction of the correlation, if one exists. This is typically a straight line, but could be a curve if the data suggests a nonlinear relationship.
- When drawing a line of best fit, try to have as many points above the line as below it, and try to minimise the total distance of all points from the line.
- The closer the data points fall to the line of best fit, the stronger the correlation.
- You can use your line of best fit to make predictions outside the range of data you have. This is called **extrapolation**, and it is less reliable than predicting within the range of the data.
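
A line of best fit drawn by eye can be compared with one calculated from the data. The sketch below uses the least-squares method, which is one standard way of making "minimise the total distance" precise (it minimises the squared vertical distances from the points to a straight line); it is an assumption of this example rather than a method these notes prescribe. The function names and the data (hours of revision against a test score in percent) are hypothetical.

```python
def line_of_best_fit(xs, ys):
    """Least-squares straight line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Read a y value off the line for a given x."""
    return slope * x + intercept


hours = [1, 2, 3, 4, 5]          # hypothetical hours of revision
scores = [52, 58, 61, 70, 75]    # hypothetical test scores, in percent

slope, intercept = line_of_best_fit(hours, scores)
inside = predict(3.5, slope, intercept)   # within the data range
outside = predict(10, slope, intercept)   # extrapolation beyond the data

print(slope, intercept, inside, outside)
```

With this made-up data, the prediction for 10 hours comes out above 100 percent, which is impossible. That is the risk of extrapolation: the line only describes the trend inside the range that was actually observed.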

## Cautions with Scattergraphs

- Beware of **outliers**, which are points that do not fit the overall pattern of the data. They can greatly affect the correlation and your line of best fit.
- Scattergraphs can sometimes be misleading. For instance, seeing a random collection of points can suggest a lack of relationship between the variables when in reality, non-linear or more complex relationships might exist.
- Finally, correlation does not always mean causation. Always consider other factors that may affect the variables in the scattergraph.
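
Two of these cautions can be seen numerically. The sketch below assumes Pearson's correlation coefficient `r` as the measure of linear correlation (between -1 and 1, with values near 0 meaning no linear trend); that measure, the function name `pearson_r`, and the data are all assumptions made for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Caution 1: a single outlier can change the correlation a great deal.
r_clean = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
r_with_outlier = pearson_r([1, 2, 3, 4, 5, 6], [2, 4, 6, 8, 10, 0])

# Caution 2: a perfect but non-linear relationship (y = x squared)
# can show no linear correlation at all.
xs = [-3, -2, -1, 0, 1, 2, 3]
r_curve = pearson_r(xs, [x * x for x in xs])

print(r_clean, r_with_outlier, r_curve)
```

In the first case, one extra point that breaks the pattern pulls a perfect positive correlation down to a very weak one. In the second, the points lie exactly on a curve, yet the linear measure reports no correlation, so it is worth looking at the shape of the scattergraph as well as any single number.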