Histograms

Understanding Histograms

  • A histogram is a graphical display of data that uses rectangles to represent the frequency of data items in successive numerical intervals (classes), which may be of equal or unequal width.
  • The horizontal axis shows the values of the variable, divided into class intervals. The width of each class forms the ‘base’ of its rectangle.
  • The vertical axis shows the frequency density, not the raw frequency. The area of each rectangle is proportional to the frequency of its class.
  • The height of each rectangle is the frequency density, calculated as Frequency ÷ Class Width.
  • Class widths in a histogram do not have to be equal. When they differ, plotting frequency density keeps the area of each rectangle proportional to its frequency.
  • The ‘bins’ or ‘class intervals’ are the ranges into which the values on the x-axis are grouped.
  • No gaps are left between the bars because a histogram displays continuous data.
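
The frequency density rule above can be checked with a short calculation. The sketch below is illustrative only: the function name, the class boundaries and the frequencies are made-up values, not data from these notes.

```python
def frequency_densities(boundaries, frequencies):
    """Return frequency / class width for each class.

    boundaries  -- class edges in ascending order, e.g. [0, 10, 20, 40]
    frequencies -- one count per class, so one fewer than boundaries
    """
    if len(boundaries) != len(frequencies) + 1:
        raise ValueError("need exactly one more boundary than frequencies")
    densities = []
    for lower, upper, freq in zip(boundaries, boundaries[1:], frequencies):
        width = upper - lower
        if width <= 0:
            raise ValueError("boundaries must be strictly increasing")
        densities.append(freq / width)
    return densities


# Hypothetical grouped data with unequal class widths (10, 10, 20, 30).
boundaries = [0, 10, 20, 40, 70]
frequencies = [5, 12, 18, 6]
densities = frequency_densities(boundaries, frequencies)

for lower, upper, freq, fd in zip(boundaries, boundaries[1:], frequencies, densities):
    print(f"{lower}-{upper}: frequency {freq}, frequency density {fd:.2f}")
```

In this example the 20-40 class has the largest frequency (18) but not the tallest rectangle, because its width is twice that of the 10-20 class.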

Analyzing and Interpreting Histograms

  • Histograms are useful for interpreting and analyzing data distribution.
  • Look for symmetry, peaks (modes), skewness and outliers in a histogram to describe the shape of the distribution.
  • The peak of a histogram (the tallest rectangle) shows the modal class, the interval in which values are most densely concentrated.
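  • A symmetrical, bell-shaped histogram with a single peak is consistent with a normal distribution, but symmetry alone does not prove it: a uniform or a two-peaked distribution can also be symmetrical.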
  • The area of the rectangles over a given range, as a fraction of the total area, estimates the proportion of the data (and hence the probability) that falls within that range.
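
The use of area to estimate a probability can be sketched as follows. This is a minimal example with made-up data and a hypothetical function name. It assumes that values are spread evenly within each class, which is the usual assumption when only part of a rectangle lies inside the range.

```python
def estimated_proportion(boundaries, frequencies, low, high):
    """Estimate the proportion of data between low and high from grouped data.

    Assumes values are spread evenly within each class, so a partly covered
    class contributes the matching fraction of its frequency (its area).
    """
    total = sum(frequencies)
    covered = 0.0
    for lower, upper, freq in zip(boundaries, boundaries[1:], frequencies):
        overlap = min(upper, high) - max(lower, low)
        if overlap > 0:
            covered += freq * overlap / (upper - lower)
    return covered / total


# Hypothetical grouped data (same layout as a frequency table).
boundaries = [0, 10, 20, 40, 70]
frequencies = [5, 12, 18, 6]

# Half of the 10-20 class (6 items) plus half of the 20-40 class (9 items),
# out of 41 items in total.
proportion = estimated_proportion(boundaries, frequencies, 15, 30)
print(f"Estimated proportion between 15 and 30: {proportion:.3f}")
```

The result is an estimate only, since the individual values inside each class are not known.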

Constructing Histograms

  • When constructing a histogram, find the range of the data (maximum − minimum) and choose the class intervals and their widths.
  • Calculate the frequency density for each class interval: Frequency Density = Frequency ÷ Class Width.
  • Mark the class boundaries on the x-axis and the frequency density on the y-axis.
  • Draw rectangles with width equal to the class interval and height equal to the frequency density.
  • It’s useful to include a title, labels and a key when drawing histograms.
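
The construction steps above can be sketched as a small table-building routine. The data values, class boundaries and function name here are hypothetical. The sketch also assumes the convention that each class includes its lower boundary but not its upper one, with the largest boundary counted in the last class.

```python
from bisect import bisect_right


def build_histogram_table(values, boundaries):
    """Return (lower, upper, frequency, frequency density) for each class."""
    counts = [0] * (len(boundaries) - 1)
    for v in values:
        if v < boundaries[0] or v > boundaries[-1]:
            raise ValueError(f"{v} lies outside the class boundaries")
        # Assumed convention: lower <= v < upper; the top boundary
        # itself is placed in the last class.
        index = min(bisect_right(boundaries, v) - 1, len(counts) - 1)
        counts[index] += 1
    return [
        (lower, upper, freq, freq / (upper - lower))
        for lower, upper, freq in zip(boundaries, boundaries[1:], counts)
    ]


# Hypothetical raw data and chosen class boundaries (widths 10, 10, 30).
values = [2, 4, 7, 11, 12, 15, 18, 19, 23, 27, 35, 48]
boundaries = [0, 10, 20, 50]

data_range = max(values) - min(values)
rows = build_histogram_table(values, boundaries)

print(f"Range: {data_range}")
for lower, upper, freq, fd in rows:
    # Each row gives the base (lower to upper) and height (fd) of one rectangle.
    print(f"{lower}-{upper}: frequency {freq}, height {fd:.3f}")
```

Each row supplies the base and height of one rectangle, ready to be drawn on labelled axes.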

Limitations of Histograms

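  • Histograms give a general idea of the distribution of data, but individual values are lost once the data is grouped, so detailed comparisons are difficult.
  • Outliers may be hard to see when the range of the data is large, because a class holding only one or two values produces a very short rectangle.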
  • The appearance of a histogram can be affected greatly by the number of rectangles or class intervals used.
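
The effect of the number of classes can be seen by grouping the same data in two different ways. The sketch below uses made-up values and a hypothetical helper that splits the range into equal-width classes.

```python
def equal_width_counts(values, num_classes):
    """Count values in num_classes equal-width classes spanning the data range."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError("all values are equal, so the range is zero")
    width = (highest - lowest) / num_classes
    counts = [0] * num_classes
    for v in values:
        # The maximum value is placed in the last class.
        index = min(int((v - lowest) / width), num_classes - 1)
        counts[index] += 1
    return counts


# Hypothetical data with two clusters, around 3 and around 11.
values = [1, 2, 2, 3, 3, 3, 4, 9, 10, 10, 11, 11, 11, 12]

two_classes = equal_width_counts(values, 2)
four_classes = equal_width_counts(values, 4)

print("2 classes:", two_classes)   # [7, 7]
print("4 classes:", four_classes)  # [6, 1, 1, 6]
```

With two classes the data looks evenly spread, while four classes reveal two separate clusters, so the choice of class intervals can change the apparent shape of the distribution.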