# The distribution of X and the central limit theorem

## Distribution of X

• A random variable assigns a numerical value to each outcome of a random experiment. The distribution of X describes the possible values of the random variable X, along with their associated probabilities.

• The probability distribution of a variable is specified by a probability mass function for discrete variables, which gives the probability of each outcome, or by a probability density function for continuous variables.

• A probability mass function (pmf) lists the exact probability of each discrete outcome. A probability density function (pdf) does not give probabilities directly: for a continuous variable, the probability of a range of values is the area under the pdf over that range, and the probability of any single exact value is 0.

• The expected value, E(X), of a random variable X is the probability-weighted average of its values. For a discrete variable it is the sum of each outcome multiplied by its probability; for a continuous variable it is the integral of each value multiplied by the density, taken over the range of X.

• The variance, Var(X), measures the dispersion of the random variable around its expected value. It is the expected value of the squared deviation from the mean: Var(X) = E[(X - E(X))^2].

• The standard deviation is the square root of the variance, providing a measure of dispersion in the same units as the random variable.
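The definitions above can be checked on a small discrete distribution. The sketch below is a minimal illustration that assumes a fair six-sided die as the random variable; the die and the helper names (`expected_value`, `variance`, `standard_deviation`) are chosen for this example and do not come from the notes.

```python
from math import sqrt

# Assumed example: pmf of a fair six-sided die, stored as {outcome: probability}.
pmf = {face: 1 / 6 for face in range(1, 7)}


def expected_value(pmf):
    """E(X): sum of each outcome multiplied by its probability."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Var(X): expected squared deviation from the mean."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


def standard_deviation(pmf):
    """Square root of the variance, in the same units as X."""
    return sqrt(variance(pmf))


total_probability = sum(pmf.values())  # a valid pmf sums to 1
mean = expected_value(pmf)             # 3.5 for a fair die
var = variance(pmf)                    # 35/12 for a fair die
sd = standard_deviation(pmf)

print(f"E(X) = {mean:.4f}, Var(X) = {var:.4f}, SD(X) = {sd:.4f}")
```

For a continuous variable the same quantities would be integrals over the density rather than sums over a pmf.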

## Central Limit Theorem

• The Central Limit Theorem (CLT) is a fundamental theorem in probability theory and statistics. It states that the suitably standardized sum (or average) of a large number of independent, identically distributed random variables with finite variance approaches a normal distribution, regardless of the shape of the original distribution.

• The theorem matters because it helps account for how often the normal distribution appears in applied science, engineering, mathematics and natural science.

• The central limit theorem explains why many observed distributions tend to be close to the normal distribution. The key factor is that the average of a large number of independent variables with finite variance is approximately normally distributed, irrespective of the original distribution, and the approximation improves as the number of variables grows.

• The classical form of the theorem requires three conditions: the random variables must be independent of each other, they must be identically distributed, and they must have a finite mean and variance.

• The standard normal distribution, also known as the z-distribution, is a special case of the normal distribution where the mean is 0 and the standard deviation is 1.

• The central limit theorem is the logical foundation for many statistical procedures, including hypothesis tests and confidence intervals that rely on the sampling distribution of the mean being approximately normal.

• Practically, it allows us to draw inferences about population parameters, such as the mean, from sample data, even when the population itself is not normally distributed.
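A small simulation can make the theorem concrete. The sketch below is an illustration under stated assumptions, not a proof: it draws repeated samples from an exponential distribution with rate 1 (a strongly skewed distribution with mean 1 and standard deviation 1), standardizes each sample mean, and checks how often the result falls within 1.96 of zero, which is about 95% of the time for a standard normal variable. The sample size `n`, the number of `trials`, and the seed are arbitrary choices made for this example.

```python
import random
import statistics

random.seed(0)  # fixed seed so repeated runs give the same numbers

n = 50           # observations per sample (assumed for illustration)
trials = 10_000  # number of repeated samples (assumed for illustration)

# Exponential distribution with rate 1: population mean 1, standard deviation 1.
mu, sigma = 1.0, 1.0


def sample_mean(size):
    """Mean of `size` independent draws from the exponential(1) distribution."""
    return statistics.fmean([random.expovariate(1.0) for _ in range(size)])


means = [sample_mean(n) for _ in range(trials)]

# The sample mean has standard deviation sigma / sqrt(n) (the standard error).
standard_error = sigma / n ** 0.5

# Standardize each sample mean to a z-score.
z_scores = [(m - mu) / standard_error for m in means]

# Share of z-scores inside [-1.96, 1.96]; close to 0.95 if the
# normal approximation is reasonable at this sample size.
share_within = sum(abs(z) <= 1.96 for z in z_scores) / trials

print(f"mean of sample means: {statistics.fmean(means):.3f}")
print(f"sd of sample means:   {statistics.stdev(means):.3f}")
print(f"theoretical SE:       {standard_error:.3f}")
print(f"share within 1.96:    {share_within:.3f}")
```

Lowering `n` to a small value such as 2 or 3 should make the skew of the exponential distribution visible again in the sample means, which shows why the theorem is a statement about large samples.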