Confidence intervals

Confidence Intervals: Understanding the Basics

  • A confidence interval is a range of values used to estimate an unknown population parameter. It expresses the uncertainty around the point estimate of that parameter.
  • The confidence level is the long-run proportion of intervals, constructed by the same method from repeated samples, that would contain the true value of the parameter. It is usually expressed as a percentage (such as 95% or 99%).
  • The margin of error is the amount added to and subtracted from the point estimate to form the interval, so it is half the width of the confidence interval. A larger margin of error means more uncertainty around the point estimate.
  • A point estimate is a single value estimate for a population parameter. The best point estimate of the population mean (\mu) is the sample mean (\overline{x}).
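
To make the relationship between these terms concrete, here is a minimal sketch that recovers the point estimate and the margin of error from the two endpoints of a symmetric interval. The function name and the numbers are hypothetical, chosen only for illustration.

```python
def summarize_interval(lower, upper):
    """Return (point_estimate, margin_of_error) for a symmetric interval."""
    point_estimate = (lower + upper) / 2   # centre of the interval
    margin_of_error = (upper - lower) / 2  # half the interval width
    return point_estimate, margin_of_error

# Hypothetical interval from 48.0 to 52.0.
estimate, margin = summarize_interval(48.0, 52.0)
print(estimate, margin)  # 50.0 2.0
```

This only applies to intervals of the form "estimate plus or minus margin"; asymmetric intervals do not have a single margin of error.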

Calculating Confidence Intervals

  • When the population standard deviation (σ) is known, or when the sample is large (n≥30) so that the sample standard deviation s is a good approximation of σ, the z-distribution can be used to calculate confidence intervals, using the formula:

    \[ CI = \overline{x} \pm z \cdot \frac{\sigma}{\sqrt{n}} \]

    where \overline{x} is the sample mean, z is the z-score (critical value) for the chosen level of confidence (about 1.96 for 95%), σ is the population standard deviation, and n is the sample size.

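  • For small sample sizes (n<30) when the population standard deviation is unknown, the t-distribution is used, provided the population is approximately normally distributed. The formula has the same form as the one above, but the z-score is replaced with a t-score and σ with the sample standard deviation s.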

    \[ CI = \overline{x} \pm t \cdot \frac{s}{\sqrt{n}} \]

    Here, s is the sample standard deviation and t is the t-score for the chosen level of confidence and degrees of freedom (df = n-1).
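
The two formulas above can be sketched in Python using only the standard library. This is an illustrative sketch, not a reference implementation: the function names, the sample data, and the summary statistics are hypothetical. The standard library has no t-distribution, so the t-score is passed in as a value looked up from a t-table (2.262 for 95% confidence and df = 9); a statistics package could compute it instead.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def z_interval(sample_mean, sigma, n, confidence=0.95):
    """z-based interval: x_bar +/- z * sigma / sqrt(n), with sigma assumed known."""
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

def t_interval(sample, t_crit):
    """t-based interval: x_bar +/- t * s / sqrt(n), with df = n - 1.

    t_crit must be looked up for the chosen confidence level and df.
    """
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical large-sample case: mean 50, known sigma 10, n = 100.
z_lower, z_upper = z_interval(50.0, 10.0, 100)
print(f"z-based 95% CI: ({z_lower:.2f}, {z_upper:.2f})")  # (48.04, 51.96)

# Hypothetical small sample (n = 10, so df = 9).
sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7, 12.5, 12.1]
T_CRIT_95_DF9 = 2.262  # t-table value: 95% confidence, df = 9
t_lower, t_upper = t_interval(sample, T_CRIT_95_DF9)
print(f"t-based 95% CI: ({t_lower:.3f}, {t_upper:.3f})")
```

For the same data and confidence level, the t-score is larger than the z-score, so the t-based interval is somewhat wider; the gap shrinks as the degrees of freedom grow.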

Interpreting Confidence Intervals

  • If a confidence interval includes the hypothesized value, we fail to reject the null hypothesis in a two-sided test at the corresponding significance level (α = 1 − confidence level, so α = 0.05 for a 95% interval). In contrast, if the confidence interval does not contain the hypothesized value, we reject the null hypothesis at that significance level.
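  • Narrower confidence intervals provide more precise estimates of the population parameter. An interval made narrower by lowering the confidence level is more likely to exclude the true value of the parameter; an interval made narrower by increasing the sample size is not.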
  • A higher confidence level leads to a wider confidence interval, providing a more conservative estimate of the population parameter.
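
The points above can be checked numerically. The following sketch assumes a z-based interval with known σ; the helper names and all numbers (sample mean 50, σ = 10, hypothesized mean 52) are hypothetical. It shows how the width responds to the confidence level and the sample size, and how a 95% interval can stand in for a two-sided test at α = 0.05.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence):
    """Two-sided z-based interval for the mean; sigma is assumed known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

def width(interval):
    lower, upper = interval
    return upper - lower

# Hypothetical summary statistics.
x_bar, sigma = 50.0, 10.0

# Higher confidence level gives a wider interval (n fixed at 100).
widths_by_level = {c: width(z_interval(x_bar, sigma, 100, c))
                   for c in (0.90, 0.95, 0.99)}
for level, w in widths_by_level.items():
    print(f"confidence {level:.2f}: width {w:.2f}")

# Larger sample gives a narrower interval at the same 95% level.
widths_by_n = {n: width(z_interval(x_bar, sigma, n, 0.95))
               for n in (25, 100, 400)}
for n, w in widths_by_n.items():
    print(f"n = {n}: width {w:.2f}")

# Two-sided test of H0: mu = 52 at alpha = 0.05, using the 95% interval.
mu_0 = 52.0
lower, upper = z_interval(x_bar, sigma, 100, 0.95)
reject_h0 = not (lower <= mu_0 <= upper)
print("reject H0:", reject_h0)
```

Because the width scales with 1/√n, quadrupling the sample size halves the width of the interval at a fixed confidence level.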

Limitations of Confidence Intervals

  • A common misconception is that a 95% confidence interval implies a 95% chance that the true value lies within that particular interval. This is incorrect: the true value is fixed, so a given interval either contains it or does not. The confidence level describes the method used to construct the interval, which captures the true value in about 95% of repeated samples.
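  • The z- and t-based intervals assume that the sample mean is approximately normally distributed, which holds when the population is normal or the sample is large enough for the central limit theorem to apply. If neither condition is met (for example, a small sample from a strongly skewed population), the intervals may not achieve their stated confidence level.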
  • They also assume the sample is a simple random sample representative of the population. If there are biases in how the sample was collected, the confidence interval will not be accurate.
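
The "method, not interval" reading of the confidence level can be illustrated with a small simulation. The sketch below is a hypothetical setup, not a prescribed procedure: it draws repeated simple random samples from a normal population with known mean and standard deviation, builds a 95% z-based interval from each, and counts how often the interval contains the true mean. The function name, parameters, and seed are all assumptions made for the example.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage_rate(mu=0.0, sigma=1.0, n=30, trials=2000,
                  confidence=0.95, seed=0):
    """Fraction of simulated z-based intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # One simple random sample of size n from a normal population.
        x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials

rate = coverage_rate()
print(f"share of intervals containing the true mean: {rate:.3f}")
```

Each individual interval either contains the true mean or misses it; only the share across many repetitions is expected to be close to 0.95. Replacing the normal draws with a biased or strongly skewed sampling scheme is one way to see the stated coverage break down.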