Measures of Central Tendency and Dispersion
Measures of Central Tendency
- Mean: The mean is the sum of all values divided by the number of values. It summarizes the data in a single figure but can be heavily influenced by outliers. For example, to calculate the mean age of a group of people, add together all their ages and divide by the number of people.
- Median: The median is the middle value when all values are arranged in ascending or descending order. If there is an even number of values, the median is the mean of the two middle values. The median is a useful measure for skewed or unbalanced distributions because it is far less sensitive to outliers than the mean.
- Mode: The mode is the value that appears most frequently in a data set. A set may have one mode, more than one mode, or no mode at all. The mode can offer valuable information about the most common occurrence within a data set.
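The three measures above can be sketched with Python's standard `statistics` module. The list of ages is hypothetical and chosen only for illustration; the value 90 is included as an outlier to show how it pulls the mean but barely moves the median.

```python
import statistics

# Hypothetical ages (illustration only); 90 acts as an outlier.
ages = [22, 25, 25, 28, 30, 90]

# Mean: sum of all values divided by the number of values.
mean_age = statistics.mean(ages)

# Median: middle value of the sorted data; with an even count,
# the mean of the two middle values.
median_age = statistics.median(ages)

# Mode(s): multimode returns a list of every most-frequent value,
# so it also covers data sets with more than one mode (Python 3.8+).
modes = statistics.multimode(ages)

# Recompute without the outlier to compare sensitivity.
ages_no_outlier = ages[:-1]
mean_no_outlier = statistics.mean(ages_no_outlier)
median_no_outlier = statistics.median(ages_no_outlier)

print("mean:", mean_age, "without outlier:", mean_no_outlier)
print("median:", median_age, "without outlier:", median_no_outlier)
print("mode(s):", modes)
```

With this data, removing the outlier shifts the mean by more than ten years, while the median changes by only 1.5. Note that when every value occurs equally often, `multimode` returns all of them; whether to report that as "no mode" is a matter of convention.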
Measures of Dispersion
- Range: The range is calculated as the difference between the highest and the lowest value in the set. The range is a simple measure of the total spread of values, but it doesn’t provide any information about the distribution of values between the extremes.
- Interquartile Range (IQR): The IQR is the range covered by the middle 50% of the data. It is calculated by subtracting the lower quartile (25th percentile) from the upper quartile (75th percentile). The IQR is useful for identifying outliers and is itself largely unaffected by them.
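- Variance: Variance measures the dispersion of data points around the mean. It is the average of the squared differences from the mean; when the variance is estimated from a sample, the sum of squared differences is usually divided by n - 1 rather than n. A high variance indicates that the data points are widely spread out from the mean and from one another.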
- Standard Deviation: The standard deviation is the square root of the variance, so it is expressed in the same units as the data. A low standard deviation indicates that data points tend to be close to the mean, while a high standard deviation suggests that the data are spread out over a wider range.
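The dispersion measures above can be sketched in the same way with the standard `statistics` module. The data set is hypothetical. Two assumptions are worth flagging: quartile values depend on the interpolation method (this sketch uses the module's default), and the 1.5 × IQR fences are a common rule of thumb for flagging outliers rather than something defined in these notes.

```python
import statistics

# Hypothetical data set (illustration only).
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: highest value minus lowest value.
value_range = max(data) - min(data)

# Quartiles: n=4 returns the three cut points Q1, Q2 (median), Q3.
# Uses the module's default method; other tools may interpolate
# differently and give slightly different quartiles.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Assumption: the 1.5 * IQR rule of thumb for flagging outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

# Population variance: average squared difference from the mean (divide by n).
pop_variance = statistics.pvariance(data)
# Sample variance: divide by n - 1 instead.
sample_variance = statistics.variance(data)

# Standard deviation: square root of the (population) variance.
pop_std = statistics.pstdev(data)

print("range:", value_range)
print("IQR:", iqr, "outliers:", outliers)
print("population variance:", pop_variance, "sample variance:", sample_variance)
print("population standard deviation:", pop_std)
```

For this data the mean is 5 and the squared differences sum to 32, so the population variance is 32 / 8 = 4 and the standard deviation is 2, while the sample variance is 32 / 7.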
Always remember that measures of central tendency and dispersion provide different insights into your data set and are most informative when used together.