Scattergraphs
Introduction to Scattergraphs
- Scattergraphs, also known as scatter plots or scatter diagrams, are used to investigate the relationship between two numerical variables.
- Each point on the scattergraph represents one observation from the data set, and its position along the X and Y axes represents its values for the two variables.
- Scattergraphs can be used to visually check for trends, correlations, and outliers in your data.
- The points are not joined to each other with a line or curve, because the variables do not necessarily follow a linear or curved relationship. It is the overall pattern of the points that matters.
Creating a Scattergraph
- When creating a scattergraph, the independent variable typically goes on the X-axis and the dependent variable on the Y-axis.
- Make sure to clearly label your axes with the variables they represent, and include a scale.
- Start by plotting each data point on the graph, marking the intersection of the values for the two variables.
- You can then visually inspect the graph to look for patterns among the points.
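The plotting steps above can be sketched in a few lines of Python. This is only a rough illustration that draws the points on a text grid; the function name text_scatter, the grid size, and the revision data are all made up for the example, and it assumes the values in each list are not all identical.

```python
def text_scatter(xs, ys, width=20, height=10):
    """Return a rough text scattergraph: x runs left to right, y runs bottom to top."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column (x) and row (y) position on the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        # The first grid row is printed at the top, so flip the row index
        # to put the largest y values at the top of the graph.
        grid[height - 1 - row][col] = "*"
    rows = ["|" + "".join(line) for line in grid]
    return "\n".join(rows) + "\n+" + "-" * width


# Hypothetical data: hours revised (independent, X) and test score (dependent, Y).
hours = [1, 2, 3, 4, 5, 6]
scores = [35, 42, 50, 55, 66, 70]

plot = text_scatter(hours, scores)
print(plot)
print("X-axis: hours revised, Y-axis: test score")
```

Each star marks one observation. A text grid cannot show a proper scale, so on paper or in a charting tool you would still label both axes and mark the scale as described above.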
Understanding Scattergraphs
- Scattergraphs give a visual representation of the correlation between two variables.
- A positive correlation exists when one variable tends to increase as the other increases. The points tend to run from the bottom left of the graph to the top right.
- A negative correlation exists when one variable tends to decrease as the other increases. The points tend to run from the top left of the graph to the bottom right.
- There is no correlation when the variables show no relationship. The points are scattered randomly across the graph.
- Scattergraphs cannot prove causation, only correlation. Just because two variables move together, it doesn’t mean that one causes the other to change.
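The direction of a correlation can also be checked with a number. One common measure, which these notes do not name, is Pearson's correlation coefficient r: it lies between -1 and 1, is positive for a positive correlation, negative for a negative one, and close to 0 when there is little or no linear correlation. The sketch below is a minimal illustration; the data and the 0.3 cut-off in describe are arbitrary assumptions, not standard values.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def describe(r, threshold=0.3):
    """Label the direction of r; the threshold is an illustrative assumption."""
    if r > threshold:
        return "positive correlation"
    if r < -threshold:
        return "negative correlation"
    return "little or no linear correlation"


# Hypothetical data: daily temperature against two kinds of sales.
temperature = [10, 14, 18, 22, 26, 30]
ice_cream_sales = [20, 28, 41, 50, 63, 70]
hot_drink_sales = [80, 71, 60, 52, 39, 30]

r_ice = pearson_r(temperature, ice_cream_sales)
r_hot = pearson_r(temperature, hot_drink_sales)
print(round(r_ice, 2), describe(r_ice))
print(round(r_hot, 2), describe(r_hot))
```

A value of r near 1 or -1 still says nothing about cause: hot weather and ice cream sales move together here, but the number alone cannot tell you why.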
Lines of Best Fit
- A line of best fit can be drawn on a scattergraph to show the direction of the correlation, if one exists. This is typically a straight line, but could be a curve if the data suggests a nonlinear relationship.
- When drawing a line of best fit, try to have as many points above the line as below it, and try to minimise the total distance of all points from the line.
- The closer the data points fall to the line of best fit, the stronger the correlation.
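- You can use your line of best fit to make predictions. Predicting within the range of your data is called interpolation. Predicting outside that range is called extrapolation, and it is less reliable because the pattern may not continue beyond the data you have.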
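When a line of best fit is calculated rather than drawn by eye, a common choice is the least-squares line, which minimises the sum of the squared vertical distances from the points to the line. That is a specific technique swapped in for the by-eye method described above, not something these notes prescribe. In the sketch below, the function names and the data are made up for illustration, and predict simply reports whether a prediction falls inside or outside the range of the data.

```python
def line_of_best_fit(xs, ys):
    """Return (slope, intercept) of the least-squares straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through the mean point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, xs, slope, intercept):
    """Estimate y at x and say whether x lies inside the range of the data."""
    estimate = slope * x + intercept
    inside = min(xs) <= x <= max(xs)
    return estimate, "interpolation" if inside else "extrapolation"


# Hypothetical data: hours revised and test score.
hours = [1, 2, 3, 4, 5, 6]
scores = [35, 42, 50, 55, 66, 70]

slope, intercept = line_of_best_fit(hours, scores)
print(predict(3.5, hours, slope, intercept))  # inside the data range
print(predict(10, hours, slope, intercept))   # outside the data range
```

The second prediction shows why extrapolation needs care: nothing in the data confirms that the same trend continues out to 10 hours.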
Cautions with Scattergraphs
- Beware of outliers, which are points that do not fit the overall pattern of the data. They can greatly affect the correlation and your line of best fit.
- Scattergraphs can sometimes be misleading. For instance, an apparently random collection of points can suggest that the variables are unrelated when in reality a non-linear or more complex relationship exists.
- Finally, correlation does not imply causation. Always consider other factors that may affect the variables in the scattergraph.
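Two of these cautions can be illustrated numerically. The sketch below uses Pearson's correlation coefficient r as the measure of linear correlation, which is an assumption because the notes do not name a measure, and both data sets are made up. The first case adds a single outlier to perfectly linear data; the second uses points that lie exactly on a curve.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Caution 1: one outlier can change the correlation dramatically.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 8, 10, 12]             # every point lies on y = 2x
r_clean = pearson_r(xs, ys)
r_outlier = pearson_r(xs + [7], ys + [0])  # add the single outlier (7, 0)
print(r_clean, r_outlier)

# Caution 2: a strong non-linear relationship can have no linear correlation.
curve_x = [-3, -2, -1, 0, 1, 2, 3]
curve_y = [x ** 2 for x in curve_x]   # every point lies on y = x squared
r_curve = pearson_r(curve_x, curve_y)
print(r_curve)
```

With these numbers the outlier pulls r from 1 down to 0.25, and the curved data gives an r of 0 even though y is completely determined by x. Looking at the scattergraph itself would reveal both situations, which is why the plot should be inspected alongside any calculated value.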