Measures of Dispersion
Measures of dispersion provide a way to understand the spread or variability in a data set.
Range
- The range shows the difference between the highest and the lowest values in a data set.
- It provides a simple measure of the full spread of values.
- However, it is sensitive to outliers and doesn’t give any information about the distribution of values within the set.
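As a minimal sketch of the points above, the range can be computed in one line, and a single extreme value is enough to change it dramatically. The function name `value_range` and the sample numbers are invented for illustration.

```python
def value_range(values):
    """Return the difference between the largest and smallest value."""
    return max(values) - min(values)

# Hypothetical sample data, chosen only to illustrate the idea.
data = [4, 7, 9, 10, 12]
with_outlier = data + [95]  # the same data plus one extreme value

print(value_range(data))          # 12 - 4 = 8
print(value_range(with_outlier))  # 95 - 4 = 91
```

The two data sets share five of their values, yet the range alone cannot tell us whether the values are clustered or evenly spread between the extremes.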
Interquartile Range (IQR)
- The interquartile range measures the spread of the middle 50% of values.
- It is calculated by subtracting the lower quartile (Q1) from the upper quartile (Q3).
- Because it ignores the lowest 25% and the highest 25% of values, the IQR is largely unaffected by outliers, making it more reliable than the range for data with extreme values.
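The calculation can be sketched with Python's standard `statistics` module. Note that there are several conventions for locating quartiles, so other tools or textbooks may give slightly different values for the same data; this sketch simply uses the default method of `statistics.quantiles`. The helper name `iqr` and the sample data are assumptions made for illustration.

```python
import statistics

def iqr(values):
    """Return Q3 - Q1 using the default quartile method of statistics.quantiles."""
    # With n=4 the function returns the three cut points Q1, Q2 (median), Q3.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

# Hypothetical sample data; the second list replaces the largest value
# with an extreme one.
data = [2, 4, 5, 7, 8, 10, 13]
with_outlier = [2, 4, 5, 7, 8, 10, 130]

print(iqr(data), max(data) - min(data))
print(iqr(with_outlier), max(with_outlier) - min(with_outlier))
```

Here the range jumps from 11 to 128 when the extreme value is introduced, while the IQR stays the same because the middle 50% of values has not changed.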
Variance
- Variance is the average of the squared deviations from the mean, so it summarises how far the values typically lie from the mean.
- Squaring the deviations prevents positive and negative deviations from cancelling each other out.
- Because it is expressed in squared units, variance can be difficult to interpret on its own, which is why the standard deviation is often used instead.
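The steps above can be sketched as follows. This example assumes the population form of variance (dividing by the number of values); the sample form divides by one less than the number of values, and the notes do not say which is intended. The function name and the data are invented for illustration.

```python
def population_variance(values):
    """Average of the squared deviations from the mean (population form)."""
    mean = sum(values) / len(values)
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / len(values)

# Hypothetical sample data with a mean of 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
mean = sum(scores) / len(scores)

# Without squaring, the deviations cancel out and tell us nothing about spread.
raw_deviation_total = sum(x - mean for x in scores)

print(raw_deviation_total)          # 0.0
print(population_variance(scores))  # 4.0
```

If the scores were measured in centimetres, the variance of 4.0 would be in square centimetres, which is the interpretation problem noted above.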
Standard Deviation
- The standard deviation is the square root of variance and so expresses the spread of data in the same units as the data itself.
- A low standard deviation means that the values are close to the mean, while a high standard deviation shows that the values are spread out over a wider range.
- It takes more work to calculate than the range or IQR, but because it uses every value in the data set it gives a fuller picture of variability. Like the range, it is affected by outliers.
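A short sketch using Python's standard `statistics` module shows both ideas: the standard deviation is the square root of the variance, and data clustered near the mean gives a smaller value than data spread further from it. The population forms (`pvariance`, `pstdev`) are used here as an assumption, and the two data sets are made up for illustration; both have a mean of 10.

```python
import math
import statistics

# Two hypothetical data sets with the same mean (10) but different spread.
tight = [9, 10, 10, 11]
spread = [2, 6, 14, 18]

for name, values in [("tight", tight), ("spread", spread)]:
    variance = statistics.pvariance(values)  # in squared units
    std_dev = statistics.pstdev(values)      # in the original units
    # The standard deviation equals the square root of the variance.
    assert math.isclose(std_dev, math.sqrt(variance))
    print(name, round(std_dev, 2))
```

The first data set produces a standard deviation below 1, while the second produces one above 6, even though a comparison of the means alone would suggest the two sets are alike.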
Choosing the appropriate measure
- Range is best used for small, simple datasets or for a rough estimate of spread.
- IQR is useful when we have skewed data or outliers.
- Variance is useful for comparing the spread of different data sets that are measured in the same units.
- The standard deviation is most useful when we are describing roughly symmetric data such as normal distributions, or when we need a measure that takes every value into account.
- As with measures of central tendency, the choice of measure will depend on the nature of the data and what we are trying to find out from it.