Estimation, Confidence intervals and tests using a normal distribution

Estimation, Confidence intervals and tests using a normal distribution

  • Estimation refers to the process used to guess the population parameters, such as mean and standard deviation, based on a sample. The two main types of estimators in statistics are point estimators and interval estimators.

  • Point estimation gives a single value as the estimate of population parameter. The value of a point estimator is a sample statistic. The common point estimators are sample mean, sample variance, sample standard deviation, sample proportion etc.

  • Interval estimation, on the other hand, gives a range of values (an interval) as the estimate of population parameter. The interval is likely to include the population parameter with a certain level of confidence.

  • Confidence intervals: These are a range of values so defined that there is a specific probability that the value of a parameter lies within it. It is widely used in inferential statistics where it’s impossible to get the complete information about a population.

  • A confidence interval consist of three parts: a confidence level, a statistic, and a margin of error. The confidence level describes the uncertainty of a sampling method — often defined as a 95% level. The statistic and margin of error define the range for the confidence interval.

  • Normal distribution tests are based on the assumption that the data used follows a normal distribution. This test is used to determine if data set is well-modelled by a normal distribution and to compute how likely it is for a random variable underlying the data set to be normally distributed.

  • The three tests of normality often used are the Anderson-Darling Test, the Shapiro-Wilk Test, and the Lilliefors Test.

  • One of the more common statistical tests involving a normal distribution is a z-test. A z-test is any statistical test for which the distribution of the test statistic under the null hypothesis can be approximated by a normal distribution.

  • Another test might include t-test, a type of inferential statistic which is used to determine if there is a significant difference between the means of two groups which may be related in certain features. The t-test assumes that the data follows a normal distribution and that the variances are equal.

  • It’s important to understand the assumptions behind these tests and to check whether the assumptions hold for a specific data set before using them. Violating these assumptions can lead to inaccurate or misleading results.