\(\newcommand{\R}{\mathbb{R}}\)
\(\newcommand{\N}{\mathbb{N}}\)
\(\newcommand{\E}{\mathbb{E}}\)
\(\newcommand{\P}{\mathbb{P}}\)
\(\newcommand{\var}{\text{var}}\)
\(\newcommand{\sd}{\text{sd}}\)
\(\newcommand{\cov}{\text{cov}}\)
\(\newcommand{\cor}{\text{cor}}\)
\(\newcommand{\skew}{\text{skew}}\)
\(\newcommand{\kurt}{\text{kurt}}\)

The Lognormal Distribution

Random variable \(X\) has the lognormal distribution with parameters \(\mu \in \R\) and \(\sigma \in (0, \infty)\) if \(\ln(X)\) has the normal distribution with mean \(\mu\) and standard deviation \(\sigma\). The parameter \( \sigma \) is the shape parameter of \( X \) while \( e^\mu \) is the scale parameter of \( X \).

Equivalently, \(X = e^{Y}\) where \(Y\) is normally distributed with mean \(\mu\) and standard deviation \(\sigma\). We can write \( Y = \mu + \sigma Z \) where \( Z \) has the standard normal distribution. Hence we can write \[ X = e^{\mu + \sigma Z} = e^\mu \left(e^Z\right)^\sigma \] Random variable \( e^Z \) has the lognormal distribution with parameters 0 and 1, and naturally enough, this is the standard lognormal distribution. The lognormal distribution is used to model continuous random quantities when the distribution is believed to be skewed, such as certain income and lifetime variables.
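The representation \( X = e^{\mu + \sigma Z} \) suggests a direct way to simulate lognormal values. The following is a minimal sketch using only the Python standard library; the function name `sample_lognormal`, the seed, and the parameter values are illustrative assumptions, not part of the text.

```python
import math
import random

def sample_lognormal(mu, sigma, rng):
    """Draw one value X = exp(mu + sigma * Z), where Z is standard normal."""
    z = rng.gauss(0.0, 1.0)
    return math.exp(mu + sigma * z)

rng = random.Random(12345)   # arbitrary fixed seed, for repeatability only
mu, sigma = 0.5, 0.75        # illustrative parameter values
sample = [sample_lognormal(mu, sigma, rng) for _ in range(10_000)]

# The logarithms of the sample should look normal with mean mu and sd sigma.
logs = [math.log(x) for x in sample]
log_mean = sum(logs) / len(logs)
log_sd = math.sqrt(sum((v - log_mean) ** 2 for v in logs) / (len(logs) - 1))
print(log_mean, log_sd)
```

The standard library function `random.lognormvariate(mu, sigma)` draws from the same distribution directly; the explicit version above is shown only because it mirrors the definition.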

The probability density function of the lognormal distribution with parameters \(\mu\) and \(\sigma\) is given by \[ f(x) = \frac{1}{\sqrt{2 \pi} \sigma x} \exp \left(-\frac{\left[\ln(x) - \mu\right]^2}{2 \sigma^2} \right), \quad x \in (0, \infty) \]

- \( f \) increases and then decreases with mode at \( x = \exp\left(\mu - \sigma^2\right) \).
- \( f \) is concave upward then downward then upward again, with inflection points at \( x = \exp\left(\mu - \frac{3}{2} \sigma^2 \pm \frac{1}{2} \sigma \sqrt{\sigma^2 + 4}\right) \).
- \( f(x) \to 0 \) as \( x \downarrow 0 \) and as \( x \to \infty \).
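As a rough numerical check of the density and its mode, the sketch below evaluates \( f \) on a grid and compares the grid maximizer with \( \exp\left(\mu - \sigma^2\right) \). The function name `lognormal_pdf`, the grid, and the parameter values are assumptions chosen for illustration.

```python
import math

def lognormal_pdf(x, mu, sigma):
    """Lognormal density f(x); taken to be 0 for x <= 0."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2 * math.pi) * sigma * x)

mu, sigma = 1.0, 0.5                      # illustrative parameter values
mode = math.exp(mu - sigma ** 2)          # mode from the formula above

# Crude grid search for the maximizer of f on (0, 10] with step 0.001.
grid = [k / 1000 for k in range(1, 10_001)]
grid_mode = max(grid, key=lambda x: lognormal_pdf(x, mu, sigma))
print(mode, grid_mode)
```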

The form of the PDF follows from the change of variables theorem. Let \( g \) denote the PDF of the normal distribution with mean \( \mu \) and standard deviation \( \sigma \), so that \[ g(y) = \frac{1}{\sqrt{2 \pi} \sigma} \exp\left[-\frac{1}{2}\left(\frac{y - \mu}{\sigma}\right)^2\right], \quad y \in \R \] The mapping \( x = e^y \) maps \( \R \) one-to-one onto \( (0, \infty) \) with inverse \( y = \ln(x) \). Hence the PDF \( f \) of \( X = e^Y \) is \[ f(x) = g(y) \frac{dy}{dx} = g\left[\ln(x)\right] \frac{1}{x} \] Substituting gives the result. The three properties listed above follow from standard calculus.

In the special distribution simulator, select the lognormal distribution. Vary the parameters and note the shape and location of the probability density function. For selected values of the parameters, run the simulation 1000 times and compare the empirical density function to the true probability density function.

Let \(\Phi\) denote the standard normal distribution function. Recall that values of \(\Phi\) can be obtained from the special distribution calculator, as well as standard mathematical and statistical software packages. The following results show how to compute the lognormal distribution function and quantiles in terms of the standard normal distribution function and quantiles.

The lognormal distribution function \(F\) is given by \[ F(x) = \Phi \left[ \frac{\ln(x) - \mu}{\sigma} \right], \quad x \in (0, \infty) \]

Once again, write \( X = e^{\mu + \sigma Z} \) where \( Z \) has the standard normal distribution. For \( x \gt 0 \), \[ F(x) = \P(X \le x) = \P\left[Z \le \frac{\ln(x) - \mu}{\sigma}\right] = \Phi \left[ \frac{\ln(x) - \mu}{\sigma} \right] \]
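The formula for \( F \) translates directly into code once \( \Phi \) is available. A minimal sketch, assuming Python's standard `statistics.NormalDist` for \( \Phi \); the name `lognormal_cdf` is an illustrative choice.

```python
import math
from statistics import NormalDist

def lognormal_cdf(x, mu, sigma):
    """F(x) = Phi((ln(x) - mu) / sigma) for x > 0, and 0 otherwise."""
    if x <= 0:
        return 0.0
    return NormalDist().cdf((math.log(x) - mu) / sigma)

# At x = e^mu the standardized value is 0, so F(e^mu) should be 1/2.
mu, sigma = 1.5, 0.7   # illustrative parameter values
half = lognormal_cdf(math.exp(mu), mu, sigma)
print(half)
```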

The lognormal quantile function is given by \[ F^{-1}(p) = \exp\left[\mu + \sigma \Phi^{-1}(p)\right], \quad p \in (0, 1) \]

This follows by solving \( p = F(x) \) for \( x \) in terms of \( p \).

In the special distribution calculator, select the lognormal distribution. Vary the parameters and note the shape and location of the probability density function and the distribution function. With \(\mu = 0\) and \(\sigma = 1\), find the median and the first and third quartiles.
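For readers without access to the calculator, the quartiles can also be obtained from the quantile formula. The sketch below assumes `statistics.NormalDist().inv_cdf` as a stand-in for \( \Phi^{-1} \); the name `lognormal_quantile` is hypothetical.

```python
import math
from statistics import NormalDist

def lognormal_quantile(p, mu, sigma):
    """F^{-1}(p) = exp(mu + sigma * Phi^{-1}(p)) for 0 < p < 1."""
    return math.exp(mu + sigma * NormalDist().inv_cdf(p))

# Quartiles of the standard lognormal distribution (mu = 0, sigma = 1).
q1, median, q3 = (lognormal_quantile(p, 0.0, 1.0) for p in (0.25, 0.5, 0.75))
print(q1, median, q3)
```

Because \( \Phi^{-1}(1 - p) = -\Phi^{-1}(p) \), the first and third quartiles of the standard lognormal distribution are reciprocals of each other, which gives a quick sanity check on the output.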

The moments of the lognormal distribution can be computed from the moment generating function of the normal distribution.

If \(X\) has the lognormal distribution with parameters \(\mu\) and \(\sigma\) then for \( t \in \R \), \[ \E\left(X^t\right) = \exp \left( \mu t + \frac{1}{2} \sigma^2 t^2 \right) \]

Recall that if \( Y \) has the normal distribution with mean \( \mu \in \R \) and standard deviation \( \sigma \in (0, \infty) \), then \( Y \) has moment generating function given by \[ \E\left(e^{t Y}\right) = \exp\left(\mu t + \frac{1}{2} \sigma^2 t^2\right), \quad t \in \R \] Hence the result follows immediately since \( \E\left(X^t\right) = \E\left(e^{t Y}\right) \).

In particular, the mean and variance of \(X\) are

- \(\E(X) = \exp\left(\mu + \frac{1}{2} \sigma^2\right)\)
- \(\var(X) = \exp\left[2 (\mu + \sigma^2)\right] - \exp\left(2 \mu + \sigma^2\right)\)
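The mean and variance can be cross-checked against the general moment formula, since \( \var(X) = \E\left(X^2\right) - \left[\E(X)\right]^2 \). A small sketch; the function name and parameter values are illustrative assumptions.

```python
import math

def lognormal_moment(t, mu, sigma):
    """E(X^t) = exp(mu * t + sigma^2 * t^2 / 2), valid for any real t."""
    return math.exp(mu * t + 0.5 * sigma ** 2 * t ** 2)

mu, sigma = 0.4, 0.9   # illustrative parameter values

mean = lognormal_moment(1, mu, sigma)
var_from_moments = lognormal_moment(2, mu, sigma) - mean ** 2

# Closed forms stated in the text.
mean_closed = math.exp(mu + 0.5 * sigma ** 2)
var_closed = math.exp(2 * (mu + sigma ** 2)) - math.exp(2 * mu + sigma ** 2)
print(mean, var_from_moments)
```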

In the special distribution simulator, select the lognormal distribution. Vary the parameters and note the shape and location of the mean\( \pm \)standard deviation bar. For selected values of the parameters, run the simulation 1000 times and compare the empirical moments to the true moments.

From the general formula for the moments, we can also compute the skewness and kurtosis of the lognormal distribution.

Suppose that \( X \) has the lognormal distribution with parameters \( \mu \in \R \) and \( \sigma \in (0, \infty) \). Then

- \( \skew(X) = \left(e^{\sigma^2} + 2\right) \sqrt{e^{\sigma^2} - 1} \)
- \(\kurt(X) = e^{4 \sigma^2} + 2 e^{3 \sigma^2} + 3 e^{2 \sigma^2} - 3\)

These results follow from the first 4 moments of the lognormal distribution and the standard computational formulas for skewness and kurtosis.
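The computation can be sketched numerically: build the third and fourth central moments from the raw moments \( \E\left(X^k\right) \) for \( k \in \{1, 2, 3, 4\} \), standardize, and compare with the closed forms. The function names and parameter values below are assumptions made for illustration.

```python
import math

def raw_moment(k, mu, sigma):
    """E(X^k) from the general moment formula."""
    return math.exp(mu * k + 0.5 * sigma ** 2 * k ** 2)

def skew_kurt_from_moments(mu, sigma):
    """Skewness and kurtosis via the standard computational formulas."""
    m1, m2, m3, m4 = (raw_moment(k, mu, sigma) for k in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    c3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3                      # third central moment
    c4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4   # fourth central moment
    return c3 / var ** 1.5, c4 / var ** 2

def skew_kurt_closed_form(sigma):
    """Closed forms from the text, written in terms of w = exp(sigma^2)."""
    w = math.exp(sigma ** 2)
    return (w + 2) * math.sqrt(w - 1), w ** 4 + 2 * w ** 3 + 3 * w ** 2 - 3

mu, sigma = 1.3, 0.5   # illustrative parameter values
skew_num, kurt_num = skew_kurt_from_moments(mu, sigma)
skew_cf, kurt_cf = skew_kurt_closed_form(sigma)
print(skew_num, kurt_num)
```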

The skewness and kurtosis do not depend on \( \mu \) because \( e^\mu \) is a scale parameter. Recall that skewness and kurtosis are defined in terms of the standard score and so are independent of location and scale parameters. Naturally, the lognormal distribution is positively skewed. Finally, note that the excess kurtosis is \[ \kurt(X) - 3 = e^{4 \sigma^2} + 2 e^{3 \sigma^2} + 3 e^{2 \sigma^2} - 6 \]

Even though the lognormal distribution has finite moments of all orders, the moment generating function is infinite at any positive number. This property is one of the reasons for the fame of the lognormal distribution.

\(\E\left(e^{t X}\right) = \infty\) for every \(t \gt 0\).
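One way to see why the moment generating function diverges is to write \( \E\left(e^{t X}\right) = \int_{-\infty}^\infty \exp\left(t e^y\right) g(y) \, dy \), where \( g \) is the normal PDF of \( Y = \ln(X) \). For \( t \gt 0 \) the term \( t e^y \) eventually dominates the quadratic term in \( \ln g(y) \), so the integrand grows without bound as \( y \to \infty \). The sketch below is a numerical illustration of that growth, not a proof; the function name and the values of \( t \), \( \mu \), \( \sigma \) are arbitrary assumptions.

```python
import math

def log_mgf_integrand(y, t, mu, sigma):
    """Natural log of exp(t * e^y) * g(y), with g the normal(mu, sigma) PDF."""
    log_g = -0.5 * ((y - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))
    return t * math.exp(y) + log_g

t, mu, sigma = 0.01, 0.0, 1.0   # even a small positive t leads to divergence
ys = (10, 15, 20, 25)
values = [log_mgf_integrand(y, t, mu, sigma) for y in ys]
for y, v in zip(ys, values):
    print(y, v)   # the log of the integrand keeps increasing with y
```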

The most important relations are the ones between the lognormal and normal distributions in the definition: if \(X\) has a lognormal distribution then \(\ln(X)\) has a normal distribution; conversely if \(Y\) has a normal distribution then \(e^Y\) has a lognormal distribution. The lognormal distribution is also a scale family.

Suppose that \( X \) has the lognormal distribution with parameters \( \mu \in \R \) and \( \sigma \in (0, \infty) \) and that \( c \in (0, \infty) \). Then \( c X \) has the lognormal distribution with parameters \( \mu + \ln(c) \) and \( \sigma \).

From the definition, we can write \( X = e^Y \) where \( Y \) has the normal distribution with mean \( \mu \) and standard deviation \( \sigma \). Hence \[ c X = c e^Y = e^{\ln(c)} e^Y = e^{\ln(c) + Y} \] But \( \ln(c) + Y \) has the normal distribution with mean \( \ln(c) + \mu \) and standard deviation \( \sigma \).
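The scaling property can be checked numerically through distribution functions, since \( \P(c X \le x) = \P(X \le x / c) \). A minimal sketch, assuming `statistics.NormalDist` for \( \Phi \); the helper name and the values of \( \mu \), \( \sigma \), \( c \) are illustrative.

```python
import math
from statistics import NormalDist

def lognormal_cdf(x, mu, sigma):
    """F(x) = Phi((ln(x) - mu) / sigma) for x > 0, and 0 otherwise."""
    if x <= 0:
        return 0.0
    return NormalDist().cdf((math.log(x) - mu) / sigma)

mu, sigma, c = 0.3, 0.8, 2.5   # illustrative values

# Largest discrepancy between P(X <= x/c) and the CDF with parameters
# (mu + ln(c), sigma), over a few test points.
max_gap = max(
    abs(lognormal_cdf(x / c, mu, sigma) - lognormal_cdf(x, mu + math.log(c), sigma))
    for x in (0.5, 1.0, 4.0, 10.0)
)
print(max_gap)
```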

The lognormal distribution also belongs to the family of general exponential distributions.

Suppose that \( X \) has the lognormal distribution with parameters \( \mu \in \R \) and \( \sigma \in (0, \infty) \). The distribution of \( X \) is a 2-parameter exponential family with natural parameters and natural statistics, respectively, given by

- \(\left( -\frac{1}{2 \sigma^2}, \frac{\mu}{\sigma^2} \right)\)
- \(\left(\ln^2(X), \ln(X)\right)\)

This follows from the definition of the general exponential family, since we can write the PDF in the form \[ f(x) = \frac{1}{\sqrt{2 \pi} \sigma} \exp\left(-\frac{\mu^2}{2 \sigma^2}\right) \frac{1}{x} \exp\left[-\frac{1}{2 \sigma^2} \ln^2(x) + \frac{\mu}{\sigma^2} \ln(x)\right], \quad x \in (0, \infty) \]
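The factorization can be verified numerically by evaluating the PDF both in its original form and in exponential family form. In the sketch below, the function names and the test values are assumptions made for illustration.

```python
import math

def pdf_direct(x, mu, sigma):
    """Lognormal PDF in its usual form, for x > 0."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2 * math.pi) * sigma * x)

def pdf_exponential_family(x, mu, sigma):
    """Same PDF written as normalizer * (1/x) * exp(natural params . statistics)."""
    eta = (-1 / (2 * sigma ** 2), mu / sigma ** 2)      # natural parameters
    stats = (math.log(x) ** 2, math.log(x))             # natural statistics
    normalizer = math.exp(-mu ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)
    return normalizer * (1 / x) * math.exp(eta[0] * stats[0] + eta[1] * stats[1])

mu, sigma = 0.7, 1.2   # illustrative parameter values
pairs = [(pdf_direct(x, mu, sigma), pdf_exponential_family(x, mu, sigma))
         for x in (0.1, 1.0, 3.0, 25.0)]
print(pairs)
```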

Suppose that the income \(X\) of a randomly chosen person in a certain population (in $1000 units) has the lognormal distribution with parameters \(\mu = 2\) and \(\sigma = 1\). Find \(\P(X \gt 20)\).

\(\P(X \gt 20) = 1 - \Phi\left[\ln(20) - 2\right] \approx 0.1597\)
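This probability can be computed from the distribution function as \( 1 - \Phi\left[(\ln(20) - \mu) / \sigma\right] \). A short sketch, assuming `statistics.NormalDist` for \( \Phi \); the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma = 2.0, 1.0                      # parameters from the exercise
z = (math.log(20) - mu) / sigma           # standardized value of ln(20)
prob = 1 - NormalDist().cdf(z)            # P(X > 20) = 1 - F(20)
print(round(prob, 4))
```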

Suppose that the income \(X\) of a randomly chosen person in a certain population (in $1000 units) has the lognormal distribution with parameters \(\mu = 2\) and \(\sigma = 1\). Find each of the following:

- \(\E(X)\)
- \(\var(X)\)

- \(\E(X) = e^{5/2} \approx 12.1825\)
- \(\var(X) = e^6 - e^5 \approx 255.0156\), so \(\sd(X) = \sqrt{e^6 - e^5} \approx 15.9692\)
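These values follow from the mean and variance formulas with \( \mu = 2 \) and \( \sigma = 1 \). A brief sketch of the arithmetic; the variable names are illustrative.

```python
import math

mu, sigma = 2.0, 1.0   # parameters from the exercise

mean = math.exp(mu + 0.5 * sigma ** 2)                                       # e^{5/2}
variance = math.exp(2 * (mu + sigma ** 2)) - math.exp(2 * mu + sigma ** 2)   # e^6 - e^5
sd = math.sqrt(variance)
print(mean, variance, sd)
```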