Measures of Dispersion

Measures of dispersion provide a way to understand the spread or variability in a data set.

Range

  • The range is the difference between the highest and the lowest values in a data set.
  • It provides a simple measure of the full spread of values.
  • However, because it depends only on the two extreme values, it is sensitive to outliers and gives no information about how the values in between are distributed.
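
As a rough illustration of the points above, the sketch below computes the range in Python and shows how one extreme value changes it. The function name value_range and the sample numbers are made up for this example.

```python
def value_range(values):
    """Return the difference between the largest and smallest value."""
    return max(values) - min(values)


# Hypothetical sample data, chosen only for illustration.
scores = [4, 7, 8, 9, 12]
scores_with_outlier = scores + [40]

range_plain = value_range(scores)                 # 12 - 4 = 8
range_outlier = value_range(scores_with_outlier)  # 40 - 4 = 36

print("Range without outlier:", range_plain)
print("Range with outlier:   ", range_outlier)
```

A single added value more than quadruples the range here, even though the other five values are unchanged.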

Interquartile Range (IQR)

  • The interquartile range measures the spread of the middle 50% of values.
  • It is calculated by subtracting the lower quartile (Q1) from the upper quartile (Q3).
  • Because it ignores the lowest and highest quarters of the data, the IQR is largely unaffected by outliers, making it more reliable than the range for data with extreme values.
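
The calculation Q3 - Q1 can be sketched with Python's standard statistics module. Note that there are several conventions for locating quartiles, so textbooks and software may give slightly different values for the same data; this sketch simply uses the default method of statistics.quantiles. The helper name iqr and the sample data are assumptions for illustration.

```python
import statistics


def iqr(values):
    """Return Q3 - Q1 using the default quartile method of statistics.quantiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


# Hypothetical data: the second list replaces the largest value with an outlier.
data = [1, 3, 5, 7, 9, 11, 13, 15]
data_with_outlier = [1, 3, 5, 7, 9, 11, 13, 150]

print("IQR without outlier:", iqr(data))
print("IQR with outlier:   ", iqr(data_with_outlier))
print("Range without outlier:", max(data) - min(data))
print("Range with outlier:   ", max(data_with_outlier) - min(data_with_outlier))
```

In this example the IQR is the same for both lists, while the range grows with the outlier.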

Variance

  • Variance is the average of the squared deviations of the values from the mean.
  • Squaring the deviations prevents positive and negative deviations from cancelling each other out.
  • Because it is expressed in squared units (for example, cm² for data measured in cm), it is difficult to interpret on its own, which is why the standard deviation is usually reported as well.
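
The steps above can be sketched as follows. The notes do not say whether the population or the sample form is intended, so this example assumes the population form (dividing by the number of values n); the sample form divides by n - 1 instead. The function name and data are made up for illustration.

```python
import statistics


def population_variance(values):
    """Average of the squared deviations from the mean (divides by n)."""
    mean = sum(values) / len(values)
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / len(values)


# Hypothetical data with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
mean = sum(data) / len(data)

# Without squaring, the deviations cancel out and sum to zero.
raw_deviation_sum = sum(x - mean for x in data)

print("Sum of raw deviations:", raw_deviation_sum)         # 0.0
print("Population variance:  ", population_variance(data))  # 4.0
print("Library check:        ", statistics.pvariance(data))
print("Sample variance:      ", statistics.variance(data))  # divides by n - 1
```

The sample variance is slightly larger than the population variance because it divides the same total by a smaller number.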

Standard Deviation

  • The standard deviation is the square root of variance and so expresses the spread of data in the same units as the data itself.
  • A low standard deviation means that the values are close to the mean, while a high standard deviation shows that the values are spread out over a wider range.
  • It takes more calculation than the range or the IQR, but it uses every value in the data set. Like the variance, it is affected by outliers.
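
To illustrate low versus high standard deviation, the sketch below compares two made-up data sets that share the same mean. It assumes the population form of the variance; the names tight and wide are hypothetical.

```python
import math
import statistics

# Two hypothetical data sets with the same mean (10) but different spread.
tight = [9, 10, 10, 11]
wide = [2, 6, 14, 18]

std_devs = {}
for name, values in (("tight", tight), ("wide", wide)):
    variance = statistics.pvariance(values)  # in squared units
    std_devs[name] = math.sqrt(variance)     # back in the original units
    print(f"{name}: mean={statistics.mean(values)}, "
          f"variance={variance}, std dev={std_devs[name]:.2f}")
```

Both sets have the same mean, so only a measure of dispersion distinguishes them.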

Choosing the appropriate measure

  • Range is best used for small, simple datasets or for a rough estimate of spread.
  • IQR is useful when we have skewed data or outliers.
  • Variance is useful for comparing the spread of different data sets, provided they are measured in the same units.
  • The standard deviation is most useful in cases where we are describing normal distributions, or where precision is important.
  • As with measures of central tendency, the choice of measure will depend on the nature of the data and what we are trying to find out from it.
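
One way to compare the measures when choosing between them is to compute all four side by side. The sketch below does this for a roughly symmetric data set and a skewed one; the function describe_spread and both data sets are made up for illustration, and the population forms of variance and standard deviation are assumed.

```python
import statistics


def describe_spread(values):
    """Return the four measures of dispersion for a list of numbers."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "variance": statistics.pvariance(values),
        "std_dev": statistics.pstdev(values),
    }


# Hypothetical data: the second list has one extreme value.
symmetric = [10, 12, 13, 14, 15, 16, 18]
skewed = [10, 12, 13, 14, 15, 16, 90]

for name, values in (("symmetric", symmetric), ("skewed", skewed)):
    print(name, describe_spread(values))
```

In this example the range, variance, and standard deviation all increase sharply for the skewed list, while the IQR stays the same.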