Box Plots

Understanding Box Plots

  • A box plot, also known as a box and whisker plot, is a statistical graph that summarises the distribution of a data set.
  • It displays the five-number summary: minimum, lower quartile, median, upper quartile, and maximum.
  • A box plot uses a rectangular box to represent the central 50% of the data, whose width is the interquartile range (IQR), and lines, referred to as whiskers, to show how far the data extend beyond the box.

Constructing a Box Plot

  • To construct a box plot, first arrange the data in ascending order, then find the median, which divides the ordered data into two halves.
  • The lower quartile (Q1) is the median of the lower half and the upper quartile (Q3) is the median of the upper half.
  • The difference between the upper and lower quartiles, Q3 - Q1, is the interquartile range (IQR). It measures the spread of the middle 50% of the values when they are ordered from lowest to highest.
  • Draw a box from Q1 to Q3. Inside this box, draw a line at the median (Q2).
  • The minimum value (lowest observation) and the maximum value (highest observation) are represented by lines extending from the box, known as whiskers.
  • A whisker shows only how far the data extend beyond the box; it does not show how many data points lie along it. The end of each whisker marks an actual observation (here, the minimum or the maximum).
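
The construction steps above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the function name five_number_summary is made up for this example, the data are hypothetical, and the sketch assumes the convention that the median is left out of both halves when the number of values is odd (other quartile conventions exist and can give slightly different results).

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves method."""
    data = sorted(values)          # quartiles are defined on ordered data
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + n % 2:]    # skips the middle value when n is odd (assumed convention)
    return data[0], median(lower), median(data), median(upper), data[-1]


# Hypothetical data, chosen only for illustration.
summary = five_number_summary([7, 1, 12, 4, 8, 3, 10, 6])
print(summary)  # (1, 3.5, 6.5, 9.0, 12)
```

The box would then be drawn from the second value to the fourth, with the median line at the third and the whiskers reaching the first and last.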

Interpreting a Box Plot

  • The length of the box represents the IQR, which shows the spread of the central 50% of the data. A longer box indicates greater variability in the central values.
  • The line through the box marks the median, a measure of the data's central tendency.
  • Outliers, if any, are usually plotted as individual points in line with the whiskers but beyond their ends. In that case each whisker stops at the most extreme value that is not an outlier.
  • The position and length of the whiskers can give an impression of the symmetry of the data distribution.
  • A noticeably longer whisker on one side of the box suggests an asymmetrical distribution that is skewed towards that side; for example, a longer upper whisker suggests positive (right) skew.
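
As a rough sketch of this reading of a plot, the box length and the two whisker lengths can be compared directly from a five-number summary. The function name describe_shape and the input values are hypothetical, and the hint it returns is only a first impression of skew, not a formal test.

```python
def describe_shape(minimum, q1, med, q3, maximum):
    """Summarise box and whisker lengths from a five-number summary."""
    lower_whisker = q1 - minimum
    upper_whisker = maximum - q3
    if upper_whisker > lower_whisker:
        hint = "longer upper whisker: possible positive (right) skew"
    elif lower_whisker > upper_whisker:
        hint = "longer lower whisker: possible negative (left) skew"
    else:
        hint = "whiskers equal: no skew suggested by the whiskers"
    return {
        "iqr": q3 - q1,                      # length of the box
        "median_offset": med - q1,           # where the median line sits inside the box
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
        "hint": hint,
    }


# Hypothetical five-number summary, for illustration only.
shape = describe_shape(2, 5, 7, 10, 24)
print(shape["iqr"], shape["lower_whisker"], shape["upper_whisker"])  # 5 3 14
print(shape["hint"])
```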

Example of a Box Plot

  • Consider a data set for test scores: 34, 56, 57, 58, 60, 61, 62, 63, 85.
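  • Here, the minimum value is 34, the median is 60 (the fifth of the nine values), and the maximum value is 85. Leaving the median out of both halves, the lower quartile (Q1) is 56.5, the median of 34, 56, 57, 58, and the upper quartile (Q3) is 62.5, the median of 61, 62, 63, 85. The IQR is 62.5 - 56.5 = 6.
  • Plotting these five values gives a box from 56.5 to 62.5 with a line at 60 and whiskers reaching 34 and 85. Students can use this graph to visualise the spread of test scores, where the median score lies, and whether there are any potential outliers; here 34 and 85 lie far from the box and would be worth checking.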
  • The construction and interpretation of box plots are important skills in the statistics section of higher-level mathematics.
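
The test-score example can be checked with Python's standard library. This is a sketch under two assumptions: statistics.quantiles with method="exclusive" is used because, for this data set, it agrees with taking the median of each half with the overall median left out (other quartile conventions give slightly different values), and potential outliers are flagged with the common 1.5 × IQR convention, which is not stated in these notes.

```python
from statistics import quantiles

scores = [34, 56, 57, 58, 60, 61, 62, 63, 85]

# method="exclusive" matches the median-of-halves quartiles for this data set.
q1, q2, q3 = quantiles(scores, n=4, method="exclusive")
iqr = q3 - q1

# Assumed convention: a value is a potential outlier if it lies
# more than 1.5 * IQR below Q1 or above Q3.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in scores if x < lower_fence or x > upper_fence]

print(q1, q2, q3)                # 56.5 60.0 62.5
print(iqr)                       # 6.0
print(lower_fence, upper_fence)  # 47.5 71.5
print(outliers)                  # [34, 85]
```

Under this convention both 34 and 85 would be drawn as individual points, and the whiskers would stop at 56 and 63, the most extreme scores inside the fences.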