Probability Generating Functions
-
Probability Generating Functions (PGFs) are a convenient way of representing a probability distribution, particularly for discrete random variables. They provide a compact way to identify and manipulate the probabilities.
-
The PGF of a discrete random variable X is defined as Gx(s) = E(s^X), the expected value of s raised to the power of X. When X takes non-negative integer values, this expectation is guaranteed to exist for -1 ≤ s ≤ 1.
-
If X takes the values x1, x2, …, xn with probabilities p1, p2, …, pn respectively, then Gx(s) = p1 s^(x1) + p2 s^(x2) + … + pn s^(xn).
-
For a random variable taking non-negative integer values, the coefficient of s^k in the expansion of the PGF is the probability that the random variable equals k, P(X = k).
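As an illustration of reading probabilities off the coefficients, here is a minimal Python sketch. The fair six-sided die, the list representation, and the names `die_pgf` and `evaluate_pgf` are assumptions made for this example only.

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die.
# die_pgf[k] holds the coefficient of s^k, which is P(X = k).
die_pgf = [Fraction(0)] + [Fraction(1, 6)] * 6


def evaluate_pgf(coeffs, s):
    """Evaluate G(s) as the sum of coeffs[k] * s**k over all k."""
    return sum(p * s**k for k, p in enumerate(coeffs))


prob_three = die_pgf[3]           # coefficient of s^3, i.e. P(X = 3)
total = evaluate_pgf(die_pgf, 1)  # G(1) adds up every probability
```

Here `prob_three` is 1/6, and `total` is 1, since the probabilities of any distribution sum to 1.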
-
PGFs can be used to find expected values and variances of random variables. The expected value is the first derivative of the PGF evaluated at s = 1, E(X) = Gx'(1), and the second derivative evaluated at s = 1 gives E(X(X - 1)) = Gx''(1).
-
The variance Var(X) of a random variable X can be found using the formula Var(X) = E(X^2) - [E(X)]^2. The second term is the square of the expected value, [Gx'(1)]^2. The first term is E(X^2) = E(X(X - 1)) + E(X) = Gx''(1) + Gx'(1). Combining these gives Var(X) = Gx''(1) + Gx'(1) - [Gx'(1)]^2.
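The derivative method can be sketched in Python for a PGF that is a polynomial. This assumes the PGF is stored as a list of coefficients (index k holds the coefficient of s^k); the helper names and the fair-die example are hypothetical.

```python
from fractions import Fraction


def derivative(coeffs):
    """Coefficients of G'(s), given the coefficients of G(s)."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, s):
    """Evaluate the polynomial with the given coefficients at s."""
    return sum(c * s**k for k, c in enumerate(coeffs))


def mean_and_variance(coeffs):
    first = evaluate(derivative(coeffs), 1)               # G'(1) = E(X)
    second = evaluate(derivative(derivative(coeffs)), 1)  # G''(1) = E(X(X - 1))
    # Var(X) = E(X(X - 1)) + E(X) - [E(X)]^2
    return first, second + first - first**2


# Hypothetical example: a fair six-sided die.
die_pgf = [Fraction(0)] + [Fraction(1, 6)] * 6
mean, variance = mean_and_variance(die_pgf)
```

For the die this gives a mean of 7/2 and a variance of 35/12, matching the values obtained directly from the definition.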
-
The PGF of the sum of independent random variables is the product of their individual PGFs. If X and Y are independent random variables, then E(s^(X + Y)) = E(s^X) E(s^Y), so the PGF of Z = X + Y is Gz(s) = Gx(s) · Gy(s).
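For polynomial PGFs, multiplying the functions amounts to multiplying out the two polynomials. The following sketch assumes coefficient lists indexed by the power of s; `multiply_pgfs` and the two-dice example are hypothetical names chosen for illustration.

```python
from fractions import Fraction


def multiply_pgfs(a, b):
    """Multiply two PGFs given as coefficient lists (index = power of s)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, pa in enumerate(a):
        for j, pb in enumerate(b):
            # s^i times s^j contributes to the coefficient of s^(i + j)
            out[i + j] += pa * pb
    return out


# Hypothetical example: the total score on two independent fair dice.
die = [Fraction(0)] + [Fraction(1, 6)] * 6
two_dice = multiply_pgfs(die, die)

prob_seven = two_dice[7]  # P(total = 7)
```

The coefficient of s^7 in the product is 6/36 = 1/6, the familiar probability of a total of 7.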
-
Each standard distribution has a PGF of a characteristic form. For example, the PGF of a binomial random variable with parameters n and p is Gx(s) = (ps + 1 - p)^n, which is the PGF of a single Bernoulli trial, ps + 1 - p, raised to the power n.
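One way to check the binomial form is to expand (ps + 1 - p)^n by repeated multiplication and compare the coefficients with the usual binomial probabilities. This is a sketch only: the values n = 4 and p = 1/3 and the function names are assumptions for the example.

```python
from fractions import Fraction
from math import comb


def multiply(a, b):
    """Multiply two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ca in enumerate(a):
        for j, cb in enumerate(b):
            out[i + j] += ca * cb
    return out


def binomial_pgf_coeffs(n, p):
    """Expand (p*s + 1 - p)^n into a list of coefficients."""
    single_trial = [1 - p, p]  # PGF of one trial: (1 - p) + p*s
    result = [Fraction(1)]
    for _ in range(n):
        result = multiply(result, single_trial)
    return result


# Hypothetical parameter values.
n, p = 4, Fraction(1, 3)
coeffs = binomial_pgf_coeffs(n, p)

# Binomial probabilities computed directly from the usual formula.
direct = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
```

The two lists `coeffs` and `direct` agree term by term, which is the binomial theorem applied to the PGF.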
-
The PGF of the geometric distribution with parameter p (the number of trials up to and including the first success, so X = 1, 2, 3, …) is Gx(s) = ps/(1 - (1 - p)s), valid for |s| < 1/(1 - p).
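The closed form can be checked numerically against a partial sum of the defining series. The sketch below assumes the geometric distribution on 1, 2, 3, … with P(X = k) = p(1 - p)^(k - 1), and uses values of s for which (1 - p)|s| < 1 so that the series converges; p = 0.25 and the function names are hypothetical choices.

```python
def geometric_pgf_closed(p, s):
    """Closed form p*s / (1 - (1 - p)*s)."""
    return p * s / (1 - (1 - p) * s)


def geometric_pgf_series(p, s, terms):
    """Partial sum of P(X = k) * s^k for k = 1, ..., terms."""
    # P(X = k) * s^k = p * s * ((1 - p) * s)^(k - 1)
    return sum(p * s * ((1 - p) * s) ** (k - 1) for k in range(1, terms + 1))


p = 0.25  # hypothetical parameter value

# A value of s below 1, and one above 1 that still satisfies (1 - p)*s < 1.
gap_inside = abs(geometric_pgf_series(p, 0.9, 500) - geometric_pgf_closed(p, 0.9))
gap_beyond_one = abs(geometric_pgf_series(p, 1.2, 2000) - geometric_pgf_closed(p, 1.2))
```

Both gaps are negligible in floating-point arithmetic. The second case shows that the formula remains valid for some s greater than 1, as long as (1 - p)s stays below 1.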
-
Because a PGF determines its distribution uniquely, you can identify the distribution of a random variable by comparing its PGF with known forms. For example, a PGF of the form (ps + 1 - p)^n shows that the variable is binomial with parameters n and p.
-
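Be aware of common manipulations of PGFs. Shifting the variable by a constant a multiplies the PGF by s^a: the PGF of X + a is s^a Gx(s). Stretching or shrinking the variable by a constant factor a replaces s with s^a: the PGF of aX is Gx(s^a). Multiplying by another PGF corresponds to adding an independent random variable. Each manipulation gives the PGF of a different distribution.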
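Shifting and stretching can be sketched on coefficient lists as follows. This assumes a is a positive integer and uses the standard results that X + a has PGF s^a G(s) and aX has PGF G(s^a); the function names and the fair-die example are hypothetical.

```python
from fractions import Fraction


def shift(coeffs, a):
    """PGF of X + a: multiply G(s) by s^a, moving every coefficient up by a."""
    return [Fraction(0)] * a + list(coeffs)


def scale(coeffs, a):
    """PGF of a*X: replace s by s^a, so the coefficient of s^k moves to s^(a*k)."""
    out = [Fraction(0)] * (a * (len(coeffs) - 1) + 1)
    for k, p in enumerate(coeffs):
        out[a * k] = p
    return out


# Hypothetical example: a fair six-sided die.
die = [Fraction(0)] + [Fraction(1, 6)] * 6

die_plus_two = shift(die, 2)  # distribution of X + 2, values 3 to 8
double_die = scale(die, 2)    # distribution of 2X, values 2, 4, ..., 12
```

Note that `double_die` differs from the PGF of the sum of two independent dice: 2X can only take even values, whereas X + Y can take any value from 2 to 12.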
-
Importance: PGFs make the calculation of expected values and variances systematic, reducing it to differentiation. They are also a useful tool for solving problems involving sums of independent random variables.