## The Gamma Distribution

In this section we will study a family of distributions that has special importance in probability and statistics. In particular, the arrival times in the Poisson process have gamma distributions, and the chi-square distribution is a special case of the gamma distribution.

#### The Gamma Function

The gamma function, first introduced by Leonhard Euler, is defined as follows

$\Gamma(k) = \int_0^\infty s^{k-1} e^{-s} \, ds, \quad k \in (0, \infty)$

The gamma function is well defined, that is, the integral in the gamma function converges for any $$k \gt 0$$.

Proof:

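Write the integral as the sum of the integral over $$(0, 1]$$ and the integral over $$(1, \infty)$$. For the first integral, $$s^{k-1}$$ is the important factor: since $$e^{-s} \le 1$$, the integral is bounded by $$\int_0^1 s^{k-1} \, ds = 1 / k$$, which is finite for $$k \gt 0$$. For the second integral, $$e^{-s}$$ is the important factor: it decays faster than any power of $$s$$ grows, so the integral converges for every $$k$$.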

[Figure: graph of the gamma function on the interval $$(0, 5)$$]

$$\Gamma(k + 1) = k \, \Gamma(k)$$ for any $$k \gt 0$$.

Proof:

Integrate by parts.

$$\Gamma(k) = (k - 1)!$$ if $$k \in \N_+$$.

Proof:

Use the recursion $$\Gamma(k + 1) = k \, \Gamma(k)$$ from the previous result and the fact that $$\Gamma(1) = 1$$.

$$\Gamma(\frac{1}{2}) = \sqrt{\pi}$$.

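Proof:

Substitute $$s = u^2$$ in the integral that defines $$\Gamma(\frac{1}{2})$$. This gives $$\Gamma(\frac{1}{2}) = 2 \int_0^\infty e^{-u^2} \, du$$, and the Gaussian integral on the right equals $$\sqrt{\pi} / 2$$.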
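The properties of the gamma function stated so far can be checked numerically. The following is a minimal sketch that uses only `math.gamma` and `math.factorial` from the Python standard library; the test points and tolerances are arbitrary choices made for illustration.

```python
import math

# Recursion: Gamma(k + 1) = k * Gamma(k), checked at a few arbitrary points.
for k in (0.3, 1.0, 2.5, 7.2):
    assert math.isclose(math.gamma(k + 1), k * math.gamma(k), rel_tol=1e-10)

# Factorial identity: Gamma(n) = (n - 1)! for positive integers n.
for n in range(1, 11):
    assert math.isclose(math.gamma(n), math.factorial(n - 1), rel_tol=1e-10)

# Special value: Gamma(1/2) = sqrt(pi).
gamma_half = math.gamma(0.5)
sqrt_pi = math.sqrt(math.pi)
print(gamma_half, sqrt_pi)
```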

One of the most famous asymptotic formulas is Stirling's formula, named for James Stirling:

$\Gamma(x + 1) \approx \left( \frac{x}{e} \right)^x \sqrt{2 \, \pi \, x} \text{ as } x \to \infty$

Thus, in particular, it follows that

$n! \approx \left( \frac{n}{e} \right)^n \sqrt{2 \, \pi \, n} \text{ as } n \to \infty$
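To get a feel for how quickly Stirling's formula becomes accurate, one can tabulate the ratio of $$n!$$ to the approximation. The sketch below does this for a few values of $$n$$; the helper name `stirling` is introduced here only for illustration.

```python
import math

def stirling(n: float) -> float:
    """Stirling's approximation to Gamma(n + 1), that is, to n! for integer n."""
    return (n / math.e) ** n * math.sqrt(2 * math.pi * n)

# Ratio of the exact factorial to the approximation; it should approach 1.
ratios = {n: math.factorial(n) / stirling(n) for n in (1, 5, 10, 50, 100)}
for n, ratio in ratios.items():
    print(n, ratio)
```

The ratio stays slightly above 1 and shrinks toward 1 as $$n$$ grows, which is what the asymptotic statement asserts.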

The function $$f$$ below is a probability density function for any $$k \gt 0$$.

$f(x) = \frac{1}{\Gamma(k)} x^{k-1} e^{-x}, \quad 0 \lt x \lt \infty$

A random variable $$X$$ with this probability density function is said to have the gamma distribution with shape parameter $$k$$. The following exercise shows that the family of densities has a rich variety of shapes, and shows why $$k$$ is called the shape parameter.

The gamma probability density function satisfies the following properties:

1. If $$0 \lt k \lt 1$$ then $$f$$ is decreasing with $$f(x) \to \infty$$ as $$x \downarrow 0$$.
2. If $$k = 1$$ then $$f$$ is decreasing with $$f(0) = 1$$.
3. If $$k \gt 1$$ then $$f$$ increases on the interval $$(0, k - 1)$$ and decreases on the interval $$(k - 1, \infty)$$.

The special case $$k = 1$$ gives the standard exponential distribution. When $$k \ge 1$$, the distribution is unimodal with mode $$k - 1$$.
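The shape properties can be explored numerically. The sketch below evaluates the basic gamma density on the log scale (to avoid overflow in $$\Gamma(k)$$) and locates the mode by a simple grid search. The function names, the grid bounds, and the grid resolution are assumptions chosen for illustration.

```python
import math

def gamma_pdf(x: float, k: float) -> float:
    """Density of the basic gamma distribution with shape k, for x > 0."""
    return math.exp((k - 1) * math.log(x) - x - math.lgamma(k))

def grid_mode(k: float, upper: float = 20.0, steps: int = 20000) -> float:
    """Grid point in (0, upper] at which the density is largest."""
    grid = [upper * i / steps for i in range(1, steps + 1)]
    return max(grid, key=lambda x: gamma_pdf(x, k))

# For k > 1 the grid maximum should sit at (or next to) k - 1.
for k in (2.0, 3.5, 6.0):
    print(k, grid_mode(k), k - 1)
```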

In the simulation of the special distribution simulator, select the gamma distribution. Vary the shape parameter and note the shape of the density function. For various values of $$k$$, run the simulation 1000 times and watch the apparent convergence of the empirical density function to the true probability density function.

Suppose that the lifetime of a device (in 100 hour units) has the gamma distribution with shape parameter $$k = 3$$. Find the probability that the device will last more than 300 hours.

$$\P(X \gt 3) = \frac{17}{2} e^{-3} \approx 0.423$$
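When the shape parameter is a positive integer, repeated integration by parts gives the closed form $$\P(X \gt x) = e^{-x} \sum_{j=0}^{k-1} x^j / j!$$. The sketch below uses this formula to reproduce the device lifetime probability; the function name `erlang_sf` is hypothetical, and the formula applies only to integer $$k$$.

```python
import math

def erlang_sf(x: float, k: int) -> float:
    """P(X > x) for the basic gamma distribution with integer shape k."""
    return math.exp(-x) * sum(x**j / math.factorial(j) for j in range(k))

# Lifetime in units of 100 hours, shape k = 3: probability of exceeding 300 hours.
p = erlang_sf(3.0, 3)
print(p)
```

With $$k = 3$$ and $$x = 3$$ the sum is $$1 + 3 + \frac{9}{2} = \frac{17}{2}$$, which matches the exact expression in the answer.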

The distribution function and the quantile function do not have simple, closed representations. Approximate values of these functions can be obtained from the special distribution calculator, and from most mathematical and statistical software packages.

Using the special distribution calculator, find the median, the first and third quartiles, and the interquartile range in each of the following cases:

1. $$k = 1$$
2. $$k = 2$$
3. $$k = 3$$
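If the special distribution calculator is not at hand, the quartiles for integer shape parameters can be approximated by inverting the distribution function numerically. The following sketch assumes the integer-shape closed form $$\P(X \le x) = 1 - e^{-x} \sum_{j=0}^{k-1} x^j / j!$$ and uses plain bisection; the function names are introduced here for illustration only.

```python
import math

def gamma_cdf_int(x: float, k: int) -> float:
    """P(X <= x) for the basic gamma distribution with integer shape k."""
    return 1.0 - math.exp(-x) * sum(x**j / math.factorial(j) for j in range(k))

def quantile(p: float, k: int) -> float:
    """Solve gamma_cdf_int(x, k) = p for x by bisection."""
    lo, hi = 0.0, 1.0
    while gamma_cdf_int(hi, k) < p:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if gamma_cdf_int(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

for k in (1, 2, 3):
    q1, median, q3 = (quantile(p, k) for p in (0.25, 0.5, 0.75))
    print(k, q1, median, q3, q3 - q1)
```

For $$k = 1$$ the output can be checked against the exponential quantile function $$-\ln(1 - p)$$.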

The following exercise gives the mean and variance of the gamma distribution.

If $$X$$ has the gamma distribution with shape parameter $$k$$ then

1. $$\E(X) = k$$
2. $$\var(X) = k$$

More generally, the moments can be expressed easily in terms of the gamma function:

If $$X$$ has the gamma distribution with shape parameter $$k$$ then

1. $$\E(X^n) = \Gamma(n + k) / \Gamma(k)$$ for $$n \ge 0$$
2. $$\E(X^n) = k (k + 1) \cdots (k + n - 1)$$ if $$n \in \N$$
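For integer $$n$$, applying the recursion $$\Gamma(x + 1) = x \, \Gamma(x)$$ repeatedly reduces the ratio $$\Gamma(n + k) / \Gamma(k)$$ to the product $$k (k + 1) \cdots (k + n - 1)$$. The sketch below compares the two expressions numerically; the function names and the choice $$k = 2.5$$ are assumptions made for illustration.

```python
import math

def moment_gamma_ratio(n: float, k: float) -> float:
    """E(X^n) = Gamma(n + k) / Gamma(k), computed on the log scale."""
    return math.exp(math.lgamma(n + k) - math.lgamma(k))

def moment_rising_product(n: int, k: float) -> float:
    """k (k + 1) ... (k + n - 1); the empty product (n = 0) is 1."""
    return math.prod(k + j for j in range(n))

k = 2.5
for n in range(5):
    print(n, moment_gamma_ratio(n, k), moment_rising_product(n, k))

# Mean and variance recovered from the first two moments; both should equal k.
mean = moment_rising_product(1, k)
variance = moment_rising_product(2, k) - mean**2
print(mean, variance)
```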

The following exercise gives the moment generating function.

If $$X$$ has the gamma distribution with shape parameter $$k$$ then

$\E(e^{t \, X}) = \frac{1}{(1 - t)^k}, \quad t \lt 1$
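One way to verify this formula, sketched here as a worked computation: for $$t \lt 1$$, combine the exponentials and substitute $$u = (1 - t) x$$, so that the remaining integral is $$\Gamma(k)$$.

$\E(e^{t \, X}) = \frac{1}{\Gamma(k)} \int_0^\infty x^{k-1} e^{-(1 - t) x} \, dx = \frac{1}{\Gamma(k) \, (1 - t)^k} \int_0^\infty u^{k-1} e^{-u} \, du = \frac{1}{(1 - t)^k}$

For $$t \ge 1$$ the integrand does not decay as $$x \to \infty$$, so the integral diverges.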

In the simulation of the special distribution simulator, select the gamma distribution. Vary the shape parameter and note the size and location of the mean/standard deviation bar. For selected values of $$k$$, run the simulation 1000 times and note the apparent convergence of the empirical moments to the distribution moments.

Suppose that the length of a petal on a certain type of flower (in cm) has the gamma distribution with shape parameter $$k = 4$$. Give the mean and standard deviation of the petal length.

Let $$X$$ denote the petal length in centimeters.

1. $$\E(X) = 4$$
2. $$\sd(X) = 2$$

#### The General Gamma Distribution

The gamma distribution is usually generalized by adding a scale parameter. Thus, if $$Z$$ has the basic gamma distribution with shape parameter $$k$$, as defined above, then for $$b \gt 0$$, $$X = b \, Z$$ has the gamma distribution with shape parameter $$k$$ and scale parameter $$b$$. The reciprocal of the scale parameter, $$r = 1 / b$$, is known as the rate parameter, particularly in the context of the Poisson process. The gamma distribution with parameters $$k = 1$$ and $$b$$ is called the exponential distribution with scale parameter $$b$$ (or rate parameter $$r = 1 / b$$). More generally, when the shape parameter $$k$$ is a positive integer, the gamma distribution is known as the Erlang distribution, named for the Danish mathematician Agner Erlang. The exponential distribution governs the time between arrivals in the Poisson model, while the Erlang distribution governs the actual arrival times.

Analogues of the results given above follow easily from basic properties of the scale transformation.

If $$X$$ has the gamma distribution with shape parameter $$k$$ and scale parameter $$b$$ then $$X$$ has probability density function

$f(x) = \frac{1}{\Gamma(k) \, b^k} x^{k-1} e^{-x/b}, \quad 0 \lt x \lt \infty$

Recall that the inclusion of a scale parameter does not change the shape of the density function, but simply scales the graph horizontally and vertically. In particular, we have the same basic shapes as described above for the basic gamma density.

The gamma distribution with shape parameter $$k \ge 1$$ and scale parameter $$b$$ is unimodal with mode at $$x = (k - 1) b$$.

Suppose that $$X$$ has the gamma distribution with shape parameter $$k$$ and scale parameter $$b$$. Then

1. $$\E(X) = k \, b$$
2. $$\var(X) = k \, b^2$$
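The mean and variance formulas can be checked by simulation. The following sketch relies on `random.gammavariate(alpha, beta)` from the Python standard library, whose two arguments are the shape and the scale. The parameter values, sample size, and seed are arbitrary choices made for illustration.

```python
import random

def simulate_gamma(k: float, b: float, n: int, seed: int = 12345) -> list:
    """Draw n independent values from the gamma distribution with shape k and scale b."""
    rng = random.Random(seed)
    return [rng.gammavariate(k, b) for _ in range(n)]

k, b = 3.0, 2.0
sample = simulate_gamma(k, b, 100_000)

sample_mean = sum(sample) / len(sample)
sample_var = sum((x - sample_mean) ** 2 for x in sample) / len(sample)

print(sample_mean, k * b)      # empirical mean next to k b
print(sample_var, k * b**2)    # empirical variance next to k b^2
```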

Suppose that $$X$$ has the gamma distribution with shape parameter $$k$$ and scale parameter $$b$$. Then

1. $$\E(X^n) = b^n \Gamma(n + k) / \Gamma(k)$$ for $$n \gt 0$$
2. $$\E(X^n) = b^n k (k + 1) \cdots (k + n - 1)$$ if $$n \in \N$$

Suppose that $$X$$ has the gamma distribution with shape parameter $$k$$ and scale parameter $$b$$. The moment generating function of $$X$$ is given by

$\E(e^{t \, X}) = \frac{1}{(1 - b \, t)^k}, \quad t \lt \frac{1}{b}$

In the special distribution simulator, select the gamma distribution. Vary the parameters and note the shape and location of the density function and the mean/standard deviation bar. For selected values of the parameters, run the simulation 1000 times and watch the apparent convergence of the empirical density and moments to the true probability density and moments.

Suppose that the lifetime of a device (in hours) has the gamma distribution with shape parameter $$k = 4$$ and scale parameter $$b = 100$$.

1. Find the probability that the device will last more than 300 hours.
2. Find the mean and standard deviation of the lifetime.

Let $$X$$ denote the lifetime in hours.

1. $$\P(X \gt 300) = 13 e^{-3} \approx 0.6472$$
2. $$\E(X) = 400$$, $$\sd(X) = 200$$
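The numbers in this answer can be reproduced with a few lines of code. The sketch below assumes the integer-shape closed form $$\P(Z \gt z) = e^{-z} \sum_{j=0}^{k-1} z^j / j!$$ for the basic gamma variable $$Z = X / b$$; the function name `gamma_sf_int` is hypothetical.

```python
import math

def gamma_sf_int(x: float, k: int, b: float) -> float:
    """P(X > x) for integer shape k and scale b, using X = b Z."""
    z = x / b
    return math.exp(-z) * sum(z**j / math.factorial(j) for j in range(k))

k, b = 4, 100.0
p_survive = gamma_sf_int(300.0, k, b)  # probability of lasting more than 300 hours
mean = k * b                           # E(X) = k b
sd = math.sqrt(k) * b                  # sd(X) = sqrt(k) b
print(p_survive, mean, sd)
```

Here $$z = 3$$ and the sum is $$1 + 3 + \frac{9}{2} + \frac{9}{2} = 13$$, in agreement with the exact expression $$13 e^{-3}$$.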

#### Transformations

Our first transformation is simply a restatement of the meaning of the term scale parameter.

Suppose that $$X$$ has the gamma distribution with shape parameter $$k$$ and scale parameter $$b$$. If $$c \gt 0$$, then $$c \, X$$ has the gamma distribution with shape parameter $$k$$ and scale parameter $$b \, c$$.

More importantly, if the scale parameter is fixed, the gamma family is closed with respect to sums of independent variables.

Suppose that $$X_1$$ and $$X_2$$ are independent random variables, and that $$X_i$$ has the gamma distribution with shape parameter $$k_i$$ and scale parameter $$b$$ for $$i \in \{1, 2\}$$. Then $$X_1 + X_2$$ has the gamma distribution with shape parameter $$k_1 + k_2$$ and scale parameter $$b$$.

Proof:

Use moment generating functions.
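In more detail, independence lets the moment generating function of the sum factor into a product, and the common scale parameter lets the factors combine:

$\E\left(e^{t (X_1 + X_2)}\right) = \E\left(e^{t X_1}\right) \E\left(e^{t X_2}\right) = \frac{1}{(1 - b \, t)^{k_1}} \cdot \frac{1}{(1 - b \, t)^{k_2}} = \frac{1}{(1 - b \, t)^{k_1 + k_2}}, \quad t \lt \frac{1}{b}$

The right side is the moment generating function of the gamma distribution with shape parameter $$k_1 + k_2$$ and scale parameter $$b$$. Since a moment generating function that is finite in an interval about 0 determines the distribution, the result follows.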

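Suppose that $$X$$ has the gamma distribution with shape parameter $$k$$ and scale parameter $$b$$. Then the distribution is a two-parameter exponential family with natural parameters $$(k - 1, -1 / b)$$ and natural statistics $$(\ln(X), X)$$. This follows from writing the probability density function in the form

$f(x) = \frac{1}{\Gamma(k) \, b^k} \exp\left[ (k - 1) \ln(x) - \frac{x}{b} \right], \quad 0 \lt x \lt \infty$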

#### Normal Approximation

From the result above on sums of independent gamma variables, it follows that if $$Y_k$$ has the gamma distribution with shape parameter $$k \in \N_+$$ and fixed scale parameter $$b$$, then

$Y_k = \sum_{i=1}^k X_i$

where $$(X_1, X_2, \ldots)$$ is a sequence of independent random variables, each with the exponential distribution with scale parameter $$b$$. It follows from the central limit theorem that if $$k$$ is large (and not necessarily an integer), the gamma distribution can be approximated by the normal distribution with mean $$k \, b$$ and variance $$k \, b^2$$. More precisely, the distribution of the standardized variable below converges to the standard normal distribution as $$k \to \infty$$.

$Z_k = \frac{Y_k - k \, b}{\sqrt{k} \, b}$

In the special distribution simulator, select the gamma distribution. Vary $$k$$ and $$b$$ and note the shape of the density function. For selected values of the parameters, run the experiment 1000 times and note the apparent convergence of the empirical density function to the true probability density function.

Suppose that $$Y$$ has the gamma distribution with parameters $$k = 10$$ and $$b = 2$$. For each of the following, compute the true value using the special distribution calculator and then compute the normal approximation. Compare the results.

1. $$\P(18 \lt Y \lt 25)$$
2. The 80th percentile of $$Y$$

In each case, the true value is given first, followed by the normal approximation.

1. $$\P(18 \lt Y \lt 25) = 0.3860$$, $$\P(18 \lt Y \lt 25) \approx 0.4095$$
2. $$y_{0.8} = 25.038$$, $$y_{0.8} \approx 25.325$$
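The comparison can also be carried out without the special distribution calculator. The sketch below is one possible approach, not the calculator's method: because $$k = 10$$ is an integer, the gamma distribution function has the closed form $$1 - e^{-y/b} \sum_{j=0}^{k-1} (y/b)^j / j!$$, and the percentile is found by bisection. The normal approximation uses `statistics.NormalDist` from the Python standard library; the function names and the bisection bracket are assumptions.

```python
import math
from statistics import NormalDist

K, B = 10, 2.0  # shape and scale from the exercise

def gamma_cdf(y: float) -> float:
    """P(Y <= y) for integer shape K and scale B."""
    z = y / B
    return 1.0 - math.exp(-z) * sum(z**j / math.factorial(j) for j in range(K))

def gamma_quantile(p: float) -> float:
    """Solve gamma_cdf(y) = p by bisection on a bracket far into the right tail."""
    lo, hi = 0.0, 200.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if gamma_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Normal distribution with the same mean k b and standard deviation sqrt(k) b.
normal = NormalDist(mu=K * B, sigma=math.sqrt(K) * B)

p_true = gamma_cdf(25.0) - gamma_cdf(18.0)
p_normal = normal.cdf(25.0) - normal.cdf(18.0)
q_true = gamma_quantile(0.8)
q_normal = normal.inv_cdf(0.8)

print(p_true, p_normal)
print(q_true, q_normal)
```

The normal approximation overstates the interval probability slightly here, which reflects the right skew of the gamma distribution that is still visible at $$k = 10$$.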