Descriptive Statistics
Basics of Descriptive Statistics
- Descriptive statistics provides a summary and analysis of the data collected from an experiment or study.
- It is divided into two broad categories: measures of central tendency and measures of dispersion or variability.
Measures of Central Tendency
- Mean: The mean is the arithmetic average of a data set, calculated by adding all data points and dividing by the number of data points.
- Median: The median is the middle value that separates the higher half from the lower half of a data set. If the data set has an even number of observations, the median is the average of the two middle numbers.
- Mode: The mode is the value that appears most frequently in a data set. A data set can have more than one mode if several values share the highest frequency.
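The three measures above can be sketched with Python's standard `statistics` module. This is a minimal illustration, not a prescribed method: the `scores` list is a made-up sample, and the variable names are assumptions chosen for this example.

```python
import statistics

# Hypothetical sample: exam scores for nine students (illustrative values only).
scores = [70, 85, 85, 90, 60, 75, 85, 95, 75]

mean_score = statistics.mean(scores)      # sum of all values / number of values
median_score = statistics.median(scores)  # middle value of the sorted data
mode_score = statistics.mode(scores)      # most frequent value
all_modes = statistics.multimode(scores)  # list of every value tied for most frequent

# With an even number of observations, the median is the average
# of the two middle values (here 3 and 5).
even_median = statistics.median([1, 3, 5, 7])

print("mean:", mean_score)
print("median:", median_score)
print("mode:", mode_score, "all modes:", all_modes)
print("median of [1, 3, 5, 7]:", even_median)
```

Here the mean (80) and the median (85) differ because the lowest score pulls the average down, while the median depends only on the order of the values.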
Measures of Dispersion or Variability
- Range: The range is the difference between the highest and lowest values in a data set.
- Variance: Variance is the average of the squared differences between each value and the mean. For a sample, the sum of squared differences is usually divided by n - 1 rather than n.
- Standard deviation: The standard deviation is the square root of the variance and provides a measure of the amount of variation or dispersion of a set of values.
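As a rough sketch of how these measures are computed, the example below works out the range, variance and standard deviation by hand and then compares them with the standard `statistics` module. The `values` list is invented for illustration, and the choice to show both the population (divide by n) and sample (divide by n - 1) versions is an assumption of this example.

```python
import math
import statistics

# Hypothetical data set (illustrative values only).
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: highest value minus lowest value.
data_range = max(values) - min(values)

# Population variance by hand: average squared distance from the mean.
mean_value = statistics.mean(values)
pop_variance = sum((x - mean_value) ** 2 for x in values) / len(values)

# Standard deviation: square root of the variance.
pop_std = math.sqrt(pop_variance)

# Library equivalents. pvariance/pstdev divide by n (population);
# variance/stdev divide by n - 1 (sample).
lib_pop_variance = statistics.pvariance(values)
sample_variance = statistics.variance(values)
sample_std = statistics.stdev(values)

print("range:", data_range)
print("population variance:", pop_variance, "std:", pop_std)
print("sample variance:", sample_variance, "std:", sample_std)
```

For this data the mean is 5, the population variance is 4 and the population standard deviation is 2. The sample versions are slightly larger because they divide by 7 instead of 8.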
Visual Tools for Descriptive Statistics
- Histograms and bar graphs: A histogram shows the distribution of numerical data grouped into bins or classes, while a bar graph shows a count or value for each separate category.
- Box-and-whisker plots: These plots visually show the median, lower and upper quartiles, and any possible outliers in the data.
- Scatter plots: These plots display paired values of two variables and can reveal a possible correlation between them.
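The numbers behind two of these plots can be computed without drawing anything. The sketch below is illustrative only: the `data` list is invented, the bin width of 10 is an arbitrary choice, and the 1.5 x IQR rule for flagging outliers is a common convention assumed here rather than something these notes specify. Quartile values also depend on the method used; this example relies on the default method of `statistics.quantiles`.

```python
import statistics
from collections import Counter

# Hypothetical measurements (illustrative values only).
data = [12, 15, 17, 18, 19, 20, 21, 22, 24, 25, 60]

# Box-and-whisker plot: lower quartile, median, upper quartile.
q1, median, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1  # interquartile range

# Assumed convention: points beyond 1.5 * IQR from the quartiles are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

# Histogram: count how many values fall into each bin of width 10.
# Each key is the lower edge of a bin, e.g. 10 covers 10 up to (not including) 20.
bin_width = 10
bin_counts = Counter((x // bin_width) * bin_width for x in data)

print("Q1, median, Q3:", q1, median, q3)
print("outliers:", outliers)
for lower_edge in sorted(bin_counts):
    print(f"{lower_edge}-{lower_edge + bin_width - 1}: {'#' * bin_counts[lower_edge]}")
```

The final loop prints a simple text histogram, with one `#` per observation in each bin.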
Rules & Principles
- Use descriptive statistics to summarize and interpret data. On their own, they do not allow you to draw conclusions about the population from which the data are taken; that requires inferential statistics.
- Outliers can significantly affect the mean and the standard deviation, but have less impact on the median or mode.
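The effect of an outlier can be checked with a small experiment. In this sketch the salary figures and the `summarize` helper are hypothetical, introduced only for illustration.

```python
import statistics

def summarize(data):
    """Return the mean, median and sample standard deviation of data."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "stdev": statistics.stdev(data),
    }

# Hypothetical salaries in thousands (illustrative values only).
typical = [30, 32, 35, 36, 38]
with_outlier = typical + [250]  # add one extreme value

before = summarize(typical)
after = summarize(with_outlier)

for name in ("mean", "median", "stdev"):
    print(f"{name}: {before[name]:.2f} -> {after[name]:.2f}")
```

Adding the single extreme value roughly doubles the mean (34.2 to about 70.2) and inflates the standard deviation, while the median moves only from 35 to 35.5.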
Importance & Relevance
- Descriptive statistics bring clarity to large amounts of data by reducing them to a simpler summary.
- These statistics help us understand and describe the features of a specific data set by giving an informative summary of its central tendency and variability.