Unbiased estimates of population mean and variance

Unbiased Estimates

An unbiased estimator is a statistic, used to estimate a population parameter, whose expected value equals that parameter. The value it takes for a particular sample is called an unbiased estimate.
In simple terms, an estimator is unbiased if it doesn’t systematically overestimate or underestimate the parameter it is estimating.

Estimating the Population Mean

Suppose that a random sample of size n is taken from a population. The sample mean, denoted by x-bar (x̄), is an unbiased estimate of the population mean, μ.
Formally, this means that the expected value of x̄ is μ: E(x̄) = μ.
This follows from the linearity of expectation. Since x̄ = (x₁ + x₂ + ... + xₙ)/n and each observation has expected value μ, E(x̄) = (μ + μ + ... + μ)/n = nμ/n = μ.
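One way to check E(x̄) = μ by hand is to list every possible sample from a very small population and average the resulting sample means. The sketch below assumes a made-up population of four values and samples of size 2 drawn with replacement; the names `population`, `samples` and `expected_sample_mean` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population, chosen only for illustration.
population = [2, 4, 6, 8]
n = 2  # sample size

# Population mean, mu.
mu = Fraction(sum(population), len(population))

# Every equally likely ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

# Sample mean of each possible sample.
sample_means = [Fraction(sum(s), n) for s in samples]

# Expected value of the sample mean: the average over all equally likely samples.
expected_sample_mean = sum(sample_means) / len(sample_means)

print(mu, expected_sample_mean)
```

Individual sample means range from 2 to 8, but their average over all 16 possible samples is 5, the same as μ for this population.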

Estimating the Population Variance

The sample variance, denoted by s², is a commonly used estimate for the population variance, σ².
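However, the naive calculation of the sample variance (the sum of squared differences from the sample mean divided by the number of observations, n) is a biased estimate. Because the differences are measured from x̄, which is itself fitted to the sample, rather than from μ, the result underestimates the population variance on average: its expected value is (n - 1)σ²/n.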
To make it unbiased, we divide by n-1 instead of n. This practice is known as Bessel’s correction.
Thus, the unbiased estimator of the population variance, s², is given by s² = Σ(xi - x̄)²/(n - 1).
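The effect of Bessel's correction can be checked exactly on a small example. The sketch below reuses an assumed four-value population, lists every sample of size 2 drawn with replacement, and averages the two versions of the estimator over all samples. The helper `sum_sq_dev` and the variable names are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product

population = [2, 4, 6, 8]  # hypothetical population, for illustration only
n = 2                      # sample size

mu = Fraction(sum(population), len(population))
# Population variance: mean squared deviation from mu.
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)


def sum_sq_dev(sample):
    """Sum of squared deviations of the sample values from the sample mean."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample)


# Every equally likely ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

# Average each estimator over all equally likely samples.
mean_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)
mean_div_n_minus_1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)

print(sigma2, mean_div_n, mean_div_n_minus_1)
```

For this population σ² = 5. Dividing by n gives an average of 5/2, which matches (n - 1)σ²/n, while dividing by n - 1 gives an average of exactly 5.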

Properties of Unbiased Estimates

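A property often discussed alongside unbiasedness is consistency. A consistent estimator is one whose value tends to get closer to the parameter being estimated as the sample size increases. The two properties are distinct: an unbiased estimator is not automatically consistent, and a consistent estimator can be biased.
For a population with finite variance, both the sample mean (x̄) and the corrected sample variance (s²) are consistent estimators, meaning they tend to become more accurate as more data is collected.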
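To see why the sample mean settles around μ as n grows, the exact variance of x̄ can be computed for a few sample sizes. This is a minimal sketch on an assumed four-value population with sampling with replacement; `variance_of_sample_mean` is a hypothetical helper, not a library function.

```python
from fractions import Fraction
from itertools import product

population = [2, 4, 6, 8]  # hypothetical population, for illustration only

mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)


def variance_of_sample_mean(n):
    """Exact variance of the sample mean over all samples of size n (with replacement)."""
    means = [Fraction(sum(s), n) for s in product(population, repeat=n)]
    return sum((m - mu) ** 2 for m in means) / len(means)


# Spread of the sample mean around mu for increasing sample sizes.
spread_by_n = {n: variance_of_sample_mean(n) for n in (1, 2, 4)}

for n, v in spread_by_n.items():
    print(n, v, sigma2 / n)
```

The variance of x̄ comes out as 5, 5/2 and 5/4 for n = 1, 2 and 4, that is σ²/n each time. An unbiased estimator whose variance shrinks towards zero in this way is consistent.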
While unbiasedness is a valuable property, it is not the only important feature of a good estimator. The precision of the estimator, often measured by its variance or standard error, is equally important.
A trade-off often exists between bias and variance in estimator selection, known as the bias-variance trade-off.
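The two variance estimators give a small concrete instance of this trade-off. The sketch below computes the bias, variance and mean squared error (MSE = bias² + variance) of each estimator exactly, for an assumed four-value population and samples of size 2 drawn with replacement. The function `bias_variance_mse` is a hypothetical helper, and the numbers apply to this toy population only.

```python
from fractions import Fraction
from itertools import product

population = [2, 4, 6, 8]  # hypothetical population, for illustration only
n = 2                      # sample size

mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)


def sum_sq_dev(sample):
    """Sum of squared deviations of the sample values from the sample mean."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample)


# Every equally likely ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))


def bias_variance_mse(divisor):
    """Exact bias, variance and MSE of sum_sq_dev / divisor as an estimator of sigma2."""
    estimates = [sum_sq_dev(s) / divisor for s in samples]
    mean_est = sum(estimates) / len(estimates)
    bias = mean_est - sigma2
    variance = sum((e - mean_est) ** 2 for e in estimates) / len(estimates)
    mse = sum((e - sigma2) ** 2 for e in estimates) / len(estimates)
    return bias, variance, mse


bias_n, var_n, mse_n = bias_variance_mse(n)         # divide by n
bias_n1, var_n1, mse_n1 = bias_variance_mse(n - 1)  # divide by n - 1

print(bias_n, var_n, mse_n)
print(bias_n1, var_n1, mse_n1)
```

Here the divide-by-n estimator has bias -5/2 but variance 33/4, for an MSE of 29/2, while the unbiased estimator has zero bias but variance 33, so its MSE is 33. In this example, accepting some bias buys a large reduction in variance.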

The Central Limit Theorem (CLT) states that if a population has mean μ and finite standard deviation σ, and independent random samples of a sufficiently large size n are taken from it, then the sample means are approximately normally distributed with mean μ and standard deviation σ/√n, whatever the shape of the population distribution.
The CLT is not what makes the sample mean unbiased: E(x̄) = μ holds for any sample size. What the CLT adds is the approximate shape of the sampling distribution of x̄, which is the basis for confidence intervals and hypothesis tests about μ. The separate fact that x̄ converges on the true population mean as sample size increases is the law of large numbers.
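A short simulation can illustrate the CLT statement above. This is a sketch under stated assumptions: the population is taken to be exponential with rate 1 (so μ = 1 and σ = 1, and the distribution is clearly not normal), each sample has size 50, and 5000 samples are drawn. These choices, the seed and the variable names are arbitrary and not part of the theorem.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n = 50       # size of each sample
reps = 5000  # number of samples drawn

# Assumed population: exponential with rate 1, so mu = 1 and sigma = 1.
mu, sigma = 1.0, 1.0

# One sample mean per simulated sample.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(reps)
]

centre = statistics.mean(sample_means)   # should be close to mu
spread = statistics.stdev(sample_means)  # should be close to sigma / sqrt(n)
predicted_spread = sigma / math.sqrt(n)

# Share of sample means within 1.96 predicted standard errors of mu;
# a normal distribution would put about 95% of them there.
within = sum(abs(m - mu) <= 1.96 * predicted_spread for m in sample_means) / reps

print(round(centre, 3), round(spread, 3), round(predicted_spread, 3), round(within, 3))
```

If the normal approximation is working, the centre should sit near 1, the spread near σ/√n ≈ 0.141, and roughly 95% of the sample means should fall within 1.96 standard errors of μ.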

Review Questions

After getting familiar with these concepts, try to solve practice questions on unbiased estimates of the population mean and variance to check your understanding. Assess your ability to calculate the sample mean and sample variance and to interpret those values in the context of the problem. Practice problems can be found in your statistics textbook and in online resources.