$$\newcommand{\P}{\mathbb{P}}$$ $$\newcommand{\E}{\mathbb{E}}$$ $$\newcommand{\R}{\mathbb{R}}$$ $$\newcommand{\N}{\mathbb{N}}$$ $$\newcommand{\bs}{\boldsymbol}$$ $$\newcommand{\var}{\text{var}}$$ $$\newcommand{\cov}{\text{cov}}$$ $$\newcommand{\cor}{\text{cor}}$$

8. Pólya's Urn Process

Basic Theory

The Model

Pólya's urn scheme is a dichotomous sampling model that generalizes the hypergeometric model (sampling without replacement) and the Bernoulli model (sampling with replacement). Pólya's urn proccess leads to a famous example of a sequence of random variables that is exchangeable, but not independent, and has deep conections with the beta-Bernoulli process. Indeed, for certain values of the parameters, the two processes are equivalent, an interesting and surprising result.

Suppose that we have an urn (what else!) that initially contains $$a$$ red and $$b$$ green balls, where $$a$$ and $$b$$ are positive integers. At each discrete time (trial), we select a ball from the urn and then return the ball to the urn along with $$c$$ new balls of the same color. Ordinarily, the parameter $$c$$ is a nonnegative integer. However, the model actually makes sense if $$c$$ is a negative integer, if we interpret this to mean that we remove the balls rather than add them, and assuming that there are enough balls of the proper color in the urn to perform this action. In any case, the random process is known as Pólya's urn process, named for George Pólya.

In terms of the colors of the selected balls, Pólya's urn scheme generalizes the standard models of sampling with and without replacement.

1. $$c = 0$$ corresponds to sampling with replacement.
2. $$c = -1$$ corresponds to sampling without replacement.

For the most part, we will assume that $$c$$ is nonnegative so that the process can be continued indefinitely. Occasionally we consider the case $$c = -1$$ so that we can interpret the results in terms of sampling without replacement.

The Outcome Variables

Let $$X_i$$ denote the color of the ball selected at time $$i$$, where 0 denotes green and 1 denotes red. Mathematically, our basic random process is the sequence of indicator variables $$\bs{X} = (X_1, X_2, \ldots)$$. As with any random process, our first goal is to compute the finite dimensional distributions of $$\bs{X}$$. That is, we want to compute the joint distribution of $$(X_1, X_2, \ldots, X_n)$$ for each $$n \in \N_+$$. Some additional notation will really help. Recall the generalized permutation formula in our study of combinatorial structures: for $$r, \, s \in \R$$ and $$j \in \N$$, we defined $r^{(s,j)} = r (r + s) (r + 2 s) \cdots [r + (j-1)s]$ Note that the expression has $$j$$ factors, starting with $$r$$, and with each factor obtained by adding $$s$$ to the previous factor. As usual, we adopt the convention that a product over an empty index set is 1. Hence $$r^{(s,0)} = 1$$ for every $$r$$ and $$s$$.

Recall that

1. $$r^{(0,j)} = r^j$$, an ordinary power
2. $$r^{(-1,j)} = r^{(j)} = r (r - 1) \cdots (r - j + 1)$$, a descending power
3. $$r^{(1,j)} = r^{[j]} = r (r + 1) \cdots (r + j - 1)$$, an ascending power
4. $$r^{(r,j)} = j! r^j$$
5. $$1^{(1,j)} = j!$$

The finite dimensional distributions are easy to compute using the multiplication rule of conditional probability. If we know the contents of the urn at any given time, then the probability of an outcome at the next time is all but trivial.

Let $$(x_1, x_2, \ldots, x_n) \in \{0,1\}^n$$ and let $$k = x_1 + x_2 + \cdots + x_n$$. Then $\P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) = \frac{a^{(c,k)} b^{(c, n - k)}}{(a + b)^{(c,n)}}$

Proof:

By the multiplication rule for conditional probability, $\P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) = \P(X_1 = x_1) \P(X_2 = x_2 \mid X_1 = x_1) \cdots \P(X_n = x_n \mid X_1 = x_1, \ldots, X_{n-1} = x_{n-1})$ Of course, if we know that the urn has, say, $$r$$ red and $$g$$ green balls at a particular time, then the probability of a red ball on the next draw is $$r / (r + g)$$ while the probability of a green ball is $$g / (r + g)$$. The right side of the displayed equation above has $$n$$ factors. The denominators are the total number of balls at the $$n$$ times, and form the product $$(a + b) (a + b + c) \ldots [a + b + (n - 1) c] = (a + b)^{(c,n)}$$. In the numerators, $$k$$ of the factors correspond to probabilities of selecting red balls; these factors form the product $$a (a + c) \cdots [a + (k - 1)c] = a^{(c,k)}$$. The remaining $$n - k$$ factors in the numerators correspond to selecting green balls; these factors form the product $$b (b + c) \cdots [b + (n - k - 1)c] = b^{(c, n-k)}$$.

The joint probability in the previous exercise depends on $$(x_1, x_2, \ldots, x_n)$$ only through the number of red balls $$k = \sum_{i=1}^n x_i$$ in the sample. Thus, the joint distribution is invariant under a permutation of $$(x_1, x_2, \ldots, x_n)$$, and hence $$\bs{X}$$ is an exchangeable sequence of random variables. This means that for each $$n$$, all permutations of $$(X_1, X_2, \ldots, X_n)$$ have the same distribution. Of course the joint distribution reduces to the formulas we have obtained earlier in the special cases of sampling with replacement ($$c = 0$$) or sampling without replacement ($$c = - 1$$), although in the latter case we must have $$n \le a + b$$.

For each $$i \in \N_+$$

1. $$\E(X_i) = \frac{a}{a + b}$$
2. $$\var(X_i) = \frac{a}{a + b} \frac{b}{a + b}$$
Proof:

Since the sequence is exchangeable, $$X_i$$ has the same distribution as $$X_1$$, so $$\P(X_i = 1) = \frac{a}{a + b}$$. The mean and variance now follow from standard results for indicator variables.

Thus $$\bs{X}$$ is a sequence of identically distributed variables, quite surprising at first but of course inevitable for any exchangeable sequence. Compare the joint and marginal distributions. Note that $$\bs{X}$$ is an independent sequence if and only if $$c = 0$$, when we have simple sampling with replacement. Pólya's urn is one of the most famous examples of a random process in which the outcome variables are exchangeable, but dependent (in general).

Next, let's compute the covariance and correlation of a pair of outcome variables.

Suppose that $$i, \, j \in \N_+$$ are distinct. Then

1. $$\cov\left(X_i, X_j\right) = \frac{a b c}{(a + b)^2 (a + b + c)}$$
2. $$\cor\left(X_i, X_j\right) = \frac{c}{a + b + c}$$
Proof:

Since the variables are exchangeable, $$\P(X_i = 1, X_j = 1) = \P(X_1 = 1, X_2 = 1) = \frac{a}{a + b} \frac{a + c}{a + b + c}$$. The results now follow from standard formulas for covariance and correlation.

Thus, the variables are positively correlated if $$c \gt 0$$, negatively correlated if $$c \lt 0$$, and uncorrelated (in fact, independent), if $$c = 0$$. These results certainly make sense when we recall the dynamics of Pólya's urn. It turns out that in any infinite sequence of exchangeable variables, the variables must be nonnegatively correlated.

Pólya's urn is described by a sequence of indicator variables. We can study the same derived random processes that we studied with Bernoulli trials: the number of red balls in the first $$n$$ trials, the trial number of the $$k$$th red ball, and so forth.

The Number of Red Balls

For $$n \in \N$$, the number of red balls selected in the first $$n$$ trials is $Y_n = \sum_{i=1}^n X_i$ so that $$\bs{Y} = (Y_0, Y_1, \ldots)$$ is the partial sum process associated with $$\bs{X} = (X_1, X_2, \ldots)$$.

Note that

1. The number of green balls selected in the first $$n$$ trials is $$n - Y_n$$.
2. The number of red balls in the urn after the first $$n$$ trials is $$a + c \, Y_n$$.
3. The number of green balls in the urn after the first $$n$$ trials is $$b + c (n - Y_n)$$.
4. The number of balls in the urn after the first $$n$$ trials is $$a + b + c \, n$$.

The basic analysis of $$\bs{Y}$$ follows easily from our work with $$\bs{X}$$.

The probability density function of $$Y_n$$ is given by $\P(Y_n = k) = \binom{n}{k} \frac{a^{(c,k)} b^{(c, n-k)}}{(a + b)^{(c,n)}}, \quad k \in \{0, 1, \ldots, n\}$

Proof:

$$\P(Y_n = y)$$ is the sum of $$\P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n)$$ over all $$(x_1, x_2, \ldots, x_n) \in \{0, 1\}^n$$ with $$\sum_{i=1}^n x_i = k$$. There are $$\binom{n}{k}$$ such sequences, and each has the probability given above.

The distribution defined by this probability density function is known, appropriately enough, as the Pólya distribution. Of course, the distribution reduces to the binomial distribution in the case of sampling with replacement ($$c = 0$$) and to the hypergeometric distribution in the case of sampling without replacement ($$c = - 1$$), although again in this case we need $$n \le a + b$$. The case where all three parameters are equal is particularly interesting.

If $$a = b = c$$ then $$Y_n$$ is uniformly distributed on $$\{0, 1, \ldots, n\}$$.

Proof:

For the general PDF given above, $$a^{(a, k)} = k! a^k$$, and $$a^{(a, n - k)} = (n - k)! a^{n-k}$$. Hence the numerator becomes $$k! (n - k)! a^n$$. In the denominator we have $$(2 a)^{(a,n)} = (2 a)(2 a + a) + \cdots + [2 a + (n - 1)a] = (2 a) (3 a) \cdots (n + 1) a = (n + 1)! a^n$$. Substituting we have $\P(Y_n = k) = \frac{n!}{k!(n - k)!} \frac{k! (n - k)!}{(n + 1)!} = \frac{1}{n + 1}, \quad k \in \{0, 1, \ldots, n\}$

In general, the Pólya family of distributions has a diverse collection of shapes.

Start the simulation of the Pólya Urn Experiment. Vary the parameters and note the shape of the probability density function. In particular, note when the function is skewed, when the function is symmetric, when the function is unimodal, when the function is monotone, and when the function is U-shaped. For various values of the parameters, run the simulation 1000 times and compare the empirical density function to the probability density function.

The Pólya probability density function is

1. unimodal if $$a \gt b \gt c$$ and $$n \gt \frac{a - c}{b - c}$$
2. unimodal if $$b \gt a \gt c$$ and $$n \gt \frac{b - c}{a - c}$$
3. U-shaped if $$c \gt a \gt b$$ and $$n \gt \frac{c - b}{c - a}$$
4. U-shaped if $$c \gt b \gt a$$ and $$n \gt \frac{c - a}{c - b}$$
5. increasing if $$b \lt c \lt a$$
6. decreasing if $$a \lt c \lt b$$
Proof:

These results follow from solving the inequality $$\P(Y_n = k) \gt \P(Y_n = k - 1)$$.

Next, let's find the mean and variance. Curiously, the mean does not depend on the parameter $$c$$.

The mean and variance of the number of red balls selected are

1. $$\E(Y_n) = n \frac{a}{a + b}$$
2. $$\var(Y_n) = n \frac{a b}{(a + b)^2} \left[1 + (n - 1) \frac{c}{a + b + c}\right]$$
Proof:

These results follow from the mean and covariance of the indicator variables given above, and basic properties of expected value and variance.

1. $$\E(Y_n) = \sum_{i=1}^n \E(X_i)$$
2. $$\var(Y_n) = \sum_{i=1}^n \sum_{j=1}^n \cov\left(X_i, X_j\right)$$

Start the simulation of the Pólya Urn Experiment. Vary the parameters and note the shape and location of the mean $$\pm$$ standard deviation bar. For various values of the parameters, run the simulation 1000 times and compare the empirical mean and standard deviation to the distribution mean and standard deviation.

Explicitly compute the probability density function, mean, and variance of $$Y_5$$ when $$a = 6$$, $$b = 4$$, and for the values of $$c \in \{-1, 0, 1, 2, 3, 10\}$$. Sketch the graph of the density function in each case.

Fix $$a$$, $$b$$, and $$n$$, and let $$c \to \infty$$. Then

1. $$\P(Y_n = 0) \to \frac{b}{a + b}$$
2. $$\P(Y_n = n) \to \frac{a}{a + b}$$
3. $$\P\left(Y_n \in \{1, 2, \ldots, n - 1\}\right) \to 0$$
Proof:

Note that $$\P(Y_n = 0) = \frac{b^{(c, n)}}{(a + b)^{(c, n)}}$$. The numerator and denominator each have $$n$$ factors. If these factors are grouped into a product of $$n$$ fractions, then the first is $$\frac{b}{a + b}$$. The rest have the form $$\frac{a + j \, c}{a + b + j \, c}$$ where $$j \in \{1, 2, \ldots n - 1\}$$ Each of these converges to 1 as $$c \to \infty$$. Part (b) follows by a similar argument. Part (c) follows from (a) and (b) and the complement rule.

Thus, the limiting distribution of $$Y_n$$ is concentrated on 0 and $$n$$. The limiting probabilities are just the initial proportion of green and red balls, respectively. Interpret this result in terms of the dynamics of Pólya's urn scheme.

The Proportion of Red Balls

Suppose that $$c \in \N$$, so that the process continues indefinitely. For $$n \in \N$$, the proportion of red balls selected in the first $$n$$ trials is $M_n = \frac{Y_n}{n} = \frac{1}{n} \sum_{i=1}^n X_i$ This is an interesting variable, since a little reflection suggests that it may have a limit as $$n$$ increases. Indeed, if $$c = 0$$, then $$M_n$$ is just the sample mean corresponding to $$n$$ Bernoulli trials. Thus, by the law of large numbers, $$M_n$$ converges to the success parameter $$\frac{a}{a + b}$$ as $$n \to \infty$$ with probability 1. On the other hand, the proportion of red balls in the urn after $$n$$ trials is $Z_n = \frac{a + c Y_n}{a + b + c n}$ When $$c = 0$$, of course, $$Z_n = \frac{a}{a + b}$$ so that in this case, $$Z_n$$ and $$M_n$$ have the same limiting behavior. This turns out to be true more generally

For $$c \in \N_+$$, $$M_n$$ has a limit if and only if $$Z_n$$ has a limit, and in this case, the limits are the same,

1. if the limits are with probability 1.
2. if the limits are in distribution.
Proof:

This follows since $$Z_n = \frac{a}{a + b + c n} + \frac{c n}{a + b + c n} M_n$$. The constant term converges to 0 as $$n \to \infty$$ and the coefficient of $$M_n$$ converges to 1 as $$n \to \infty$$.

If $$a = b = c$$ then the distribution of $$M_n$$ converges to the uniform distribution on the interval $$(0, 1)$$ as $$n \to \infty$$.

Proof:

Recall from above that $$Y_n$$ has the uniform distribution on $$\{0, 1, \ldots n\}$$ if $$a = b = c$$. It then follows from an example in the section on convergence in distribution that $$Y_n / n$$ converges to the uniform distribution in $$(0, 1)$$.

More generally, it turns out that when $$c \in \N_+$$, $$M_n$$ and $$Z_n$$ converge with probability 1 to a random variable $$U$$ that has the beta distribution with left parameter $$a / c$$ and right parameter $$b / c$$. We need the theory of martingales to derive and understand this result.

The Trial Number of the $$k$$th Red Ball

Suppose again that $$c \in \N$$, so that the process continues indefinitely. For $$k \in \N_+$$ let $$V_k$$ denote the trial number of the $$k$$th red ball selected. Thus $V_k = \min\{n \in \N_+: Y_n = k\}$ Note that $$V_k$$ takes values in $$\{k, k + 1, \ldots\}$$. The random processes $$\bs{V} = (V_1, V_2, \ldots)$$ and $$\bs{Y} = (Y_1, Y_2, \ldots)$$ are inverses of each other in a sense.

For $$k, \, n \in \N_+$$ with $$k \le n$$,

1. $$V_k \le n$$ if and only if $$Y_n \ge k$$
2. $$V_k = n$$ if and only if $$Y_{n-1} = k - 1$$ and $$X_n = 1$$

Suppose that $$n \in \N_+$$ and $$k \in \{0, 1, \ldots, n - 1\}$$. Then $\P(X_n = 1 \mid Y_{n-1} = k) = \frac{a + k\,c}{a + b + (n-1) c}$ In particular, if $$a = b = c$$ then $$\P(X_n = 1 \mid Y_{n-1} = k) = \frac{k + 1}{n + 1}$$.

The last probability satisfies Laplace's rule of succession, another interesting connection. The rule is named for Pierre Simon Laplace, and is studied from a different point of view in the section on Independence.

The probability denisty function of $$V_k$$ is given by $\P(V_k = n) = \binom{n - 1}{k - 1} \frac{a^{(c,k)} b^{(c,n-k)}}{(a + b)^{(c,n)}}, \quad n \in \{k, k + 1, \ldots\}$

Proof:

This follows from thePDF of $$Y_n$$ given above, the inverse relation, the previous result on conditional probability, and the multiplication rule for probability.

Of course this probability density function reduces to the negative binomial density function with trial parameter $$k$$ and success parameter $$p = \frac{a}{a+b}$$ when $$c = 0$$ (sampling with replacement).

If $$a = b = c$$ then $\P(V_k = n) = \frac{k}{n (n + 1)}, \quad n \in \{k, k + 1, k + 2, \ldots\}$

Proof:

As in the corresponding proof for the number of red balls, the fraction in the PDF of $$V_k$$ in the previous result reduces to $$\frac{k! (n - k)!}{(n + 1)!}$$, while the binomial coefficient is $$\frac{(n - 1)!}{(k - 1)! (n - k)!}$$.

Fix $$a$$, $$b$$, and $$k$$, and let $$c \to \infty$$. Then

1. $$\P(V_k = k) \to \frac{a}{a + b}$$
2. $$\P(V_k \in \{k + 1, k + 2, \ldots\}) \to 0$$

Thus, the limiting distribution of $$V_k$$ is concentrated on $$k$$ and $$\infty$$. The limiting probabilities at these two points are just the initial proportion of red and green balls, respectively. Interpret this result in terms of the dynamics of Pólya's urn scheme.