As usual, suppose that we have a random experiment with sample space \(S\) and probability measure \(\P\). In this section, we will discuss independence, one of the fundamental concepts in probability theory. Independence is frequently invoked as a modeling assumption, and moreover, (classical) probability itself is based on the idea of independent replications of the experiment.
We will define independence on increasingly complex structures, from two events, to collections of events, and then to collections of random variables. In each case, the basic idea is the same.
Two events \(A\) and \(B\) are independent if
\[\P(A \cap B) = \P(A) \P(B)\]If both of the events have positive probability, then independence is equivalent to the statement that the conditional probability of one event given the other is the same as the unconditional probability of the event:
\[\P(A \mid B) = \P(A) \iff \P(B \mid A) = \P(B) \iff \P(A \cap B) = \P(A) \P(B)\]This is how you should think of independence: knowledge that one event has occurred does not change the probability assigned to the other event.
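As a minimal sketch of this definition, the following Python snippet checks the product rule by exact enumeration. The experiment (one fair six-sided die) and the events `A` (even score) and `B` (score at most 4) are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction

# Assumed example: one fair six-sided die, all outcomes equally likely.
S = frozenset(range(1, 7))

def prob(event):
    """Probability of an event under the uniform measure on S."""
    return Fraction(len(event & S), len(S))

A = frozenset({2, 4, 6})      # score is even
B = frozenset({1, 2, 3, 4})   # score is at most 4

p_a, p_b, p_ab = prob(A), prob(B), prob(A & B)
independent = (p_ab == p_a * p_b)   # product rule
p_a_given_b = p_ab / p_b            # conditional probability P(A | B)

print(independent, p_a, p_a_given_b)
```

Here knowing that the score is at most 4 leaves the chance of an even score unchanged, which is exactly the interpretation described above.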
The terms independent and disjoint sound vaguely similar but they are actually very different. First, note that disjointness is purely a set-theory concept while independence is a probability (measure-theoretic) concept. Indeed, two events can be independent relative to one probability measure and dependent relative to another. But most importantly, two disjoint events can never be independent, except in the trivial case that one of the events is null.
Suppose that \(A\) and \(B\) are disjoint events for an experiment, each with positive probability. Then \(A\) and \(B\) are dependent, and in fact are negatively correlated.
Note that \(\P(A \cap B) = 0\) but \(\P(A) \P(B) \gt 0\).
If \(A\) and \(B\) are independent events in an experiment, it seems clear that any event that can be constructed from \(A\) should be independent of any event that can be constructed from \(B\). This is the case, as the next exercise shows. Moreover, this basic idea is essential for the generalization of independence that we will consider shortly.
If \(A\) and \(B\) are independent events in an experiment, then each of the following pairs of events is independent:
Suppose that \( A \) and \( B \) are independent. Then by the difference rule,
\[ \P(A^c \cap B) = \P(B) - \P(A \cap B) = \P(B) - \P(A) \, \P(B) = \P(B)[1 - \P(A)] = \P(B) \P(A^c) \]Hence \( A^c \) and \( B \) are independent. Parts (b) and (c) are logically equivalent to (a).
An event that is essentially deterministic, that is, has probability 0 or 1, is independent of any other event, even itself.
Suppose that \(A\) and \(B\) are events in a random experiment.
For part (a), recall that if \( \P(A) = 0 \) then \( \P(A \cap B) = 0 \), and if \( \P(A) = 1 \) then \( \P(A \cap B) = \P(B) \). In either case we have \( \P(A \cap B) = \P(A) \P(B) \). For part (b), the independence of \( A \) with itself gives \( \P(A) = [\P(A)]^2 \) and hence either \( \P(A) = 0 \) or \( \P(A) = 1 \).
To extend the definition of independence to more than two events, we might think that we could just require pairwise independence, the independence of each pair of events. However, this is not sufficient for the strong type of independence that we have in mind. For example, suppose that we have three events \(A\), \(B\), and \(C\). Mutual independence of these events should not only mean that each pair is independent, but also that an event that can be constructed from \(A\) and \(B\) (for example \(A \cup B^c\)) should be independent of \(C\). Pairwise independence does not achieve this; Exercise 25 gives an example of three events that are pairwise independent, but the intersection of two of the events is related to the third event in the strongest possible sense.
Another possible generalization would be to simply require the probability of the intersection of the events to be the product of the probabilities of the events. However, this condition does not even guarantee pairwise independence. Exercise 27 gives an example.
However, the definition of independence for two events does generalize in a natural way to an arbitrary collection of events. Specifically, suppose that \( A_i \) is an event for each \( i \) in an index set \( I \). Then the collection \( \mathscr{A} = \{A_i: i \in I\} \) is independent if for every finite \( J \subseteq I \),
\[\P\left(\bigcap_{j \in J} A_j \right) = \prod_{j \in J} \P(A_j)\]Independence of a collection of events is much stronger than mere pairwise independence of the events in the collection. The basic inheritance property in the following exercise is essentially equivalent to the definition.
Suppose that \(\mathscr{A}\) is a collection of events
There are \(2^n - n - 1\) non-trivial conditions in the definition of the independence of \(n\) events.
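The count can be confirmed by listing the index subsets directly. The sketch below is only an illustration (the helper name `nontrivial_subsets` is hypothetical): subsets of size 0 or 1 give conditions that hold automatically, so only subsets of size at least 2 are counted.

```python
from itertools import combinations

def nontrivial_subsets(n):
    """Index subsets of size at least 2; each one gives a product condition."""
    return [J for k in range(2, n + 1) for J in combinations(range(n), k)]

# Compare the brute-force count with the closed form 2^n - n - 1.
for n in range(1, 7):
    assert len(nontrivial_subsets(n)) == 2**n - n - 1

print(nontrivial_subsets(3))
```

For three events this lists the three pairs and the single triple, so four conditions must be checked.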
If the events \( A_1, A_2, \ldots, A_n \) are independent, then it follows immediately from the definition that
\[\P\left(\bigcap_{i=1}^n A_i\right) = \prod_{i=1}^n \P(A_i)\]This is known as the multiplication rule for independent events. Compare this with the general multiplication rule for conditional probability.
The collection of essentially deterministic events \(\mathscr{D} = \{A \in \mathscr{S}: \P(A) = 0 \text{ or } \P(A) = 1\}\) is independent.
Suppose that \( \{A_1, A_2, \ldots, A_n\} \subseteq \mathscr{D} \). If \( \P(A_i) = 0 \) for some \( i \in \{1, 2, \ldots, n\} \) then \( \P(A_1 \cap A_2 \cap \cdots \cap A_n) = 0 \). If \( \P(A_i) = 1 \) for every \( i \in \{1, 2, \ldots, n\} \) then \( \P(A_1 \cap A_2 \cap \cdots \cap A_n) = 1 \). In either case, \( \P(A_1 \cap A_2 \cdots \cap A_n) = \P(A_1) \P(A_2) \cdots \P(A_n) \).
The following three exercises give examples of the type of strong independence that is guaranteed by our definition. Compare these exercises with Exercise 2.
If \(A\), \(B\), and \(C\) are independent events in an experiment, then
If \(A\), \(B\), \(C\), and \(D\) are independent events in an experiment, then
If \( \{A_i: i \in I\} \) is an independent collection of events, then \( \{A_i^c: i \in I\} \) is also independent.
Since the independence of a collection is defined in terms of finite subcollections, it suffices to show that if \( A_1, A_2, \ldots, A_n \) are independent then \( A_1^c, A_2^c, \ldots, A_n^c \) are independent. From the difference rule and the independence of the events,
\[ \begin{align} \P(A_1^c \cap A_2 \cap \cdots \cap A_n) & = \P(A_2 \cap \cdots \cap A_n) - \P(A_1 \cap A_2 \cap \cdots \cap A_n) \\ & = \P(A_2) \cdots \P(A_n) - \P(A_1) \P(A_2) \cdots \P(A_n) \\ & = [1 - \P(A_1)] \P(A_2) \cdots \P(A_n) = \P(A_1^c) \P(A_2) \cdots \P(A_n) \end{align} \]It follows that \( A_1^c, A_2, \ldots, A_n \) are independent. Repeating the argument \( n - 1 \) more times shows that \( A_1^c, A_2^c, \ldots, A_n^c \) are independent.
The complete generalization of these results is a bit complicated, but roughly means that if we start with a collection of independent events, and form new events from disjoint subcollections (using the set operations of union, intersection, and complement), then the new events are independent. For a precise statement, see the section on measure theory. The following exercise gives a formula for the probability of the union of a collection of independent events that is much nicer than the inclusion-exclusion formula.
If \( A_1, A_2, \ldots, A_n \) are independent events, then
\[\P\left(\bigcup_{i=1}^n A_i\right) = 1 - \prod_{i=1}^n [1 - \P(A_i)]\]From DeMorgan's law and the independence of \( A_1^c, A_2^c, \ldots, A_n^c \) we have
\[ \P\left(\bigcup_{i=1}^n A_i \right) = 1 - \P\left( \bigcap_{i=1}^n A_i^c \right) = 1 - \prod_{i=1}^n \P(A_i^c) = 1 - \prod_{i=1}^n [1 - \P(A_i)] \]Suppose now that \(X_i\) is a random variable taking values in a set \(T_i\) for each \(i\) in a nonempty index set \(I\). Intuitively, the random variables are independent if knowledge of the values of some of the variables tells us nothing about the values of the other variables. Mathematically, independence of a collection of random variables can be reduced to the independence of collections of events. Formally, \( \mathscr{X} = \{X_i: i \in I\} \) is independent if the collection of events \( \{\{X_i \in B_i\}: i \in I\} \) is independent for every choice of \( B_i \subseteq T_i \) for \( i \in I \). Equivalently then, \( \mathscr{X} \) is independent if for every finite \(J \subseteq I\), and for every choice of \(B_j \subseteq T_j\) for \(j \in J\) we have
\[\P\left(\bigcap_{j \in J} \{X_j \in B_j\} \right) = \prod_{j \in J} \P(X_j \in B_j)\]Suppose that \(\mathscr{X}\) is a collection of random variables.
Suppose again that \( X_i \) is a random variable taking values in \( T_i \) for each \( i \) in an index set \( I \), and suppose that \(g_i\) is a function from \(T_i\) into a set \(U_i\) for each \( i \in I \). If \( \{X_i: i \in I\} \) is independent, then \( \{g_i(X_i): i \in I\} \) is also independent.
Suppose that \( C_i \subseteq U_i\) for each \( i \in I \). Then \( \{g_i(X_i) \in C_i\} = \{X_i \in g_i^{-1}(C_i)\} \). The events \( \{X_i \in g_i^{-1}(C_i)\}, i \in I \) are independent.
In the discussion above, the spaces \( T_i \) and \( U_i \) are technically required to be measure spaces, and the various subsets and functions are then required to be measurable. If you are a new student of probability, you can ignore all of this. If you are interested, see the section on measure theory.
As with events, the (mutual) independence of random variables is a very strong property. If a collection of random variables is independent, then by Exercise 10, any subcollection is also independent. By Exercise 11, new random variables formed from disjoint subcollections are independent. For a simple example, suppose that \(X\), \(Y\), and \(Z\) are independent real-valued random variables. Then
In particular, note that statement 2 in the list above is much stronger than the conjunction of statements 4 and 5. Contrapositively, if \(X\) and \(Z\) are dependent, then \((X, Y)\) and \(Z\) are also dependent.
Independence of random variables subsumes independence of events. A collection of events \(\mathscr{A}\) is independent if and only if the corresponding collection of indicator variables \(\{\bs{1}_A: A \in \mathscr{A}\}\) is independent.
Many of the concepts that we have been using informally can now be made precise. A compound experiment that consists of independent stages is essentially just an experiment whose outcome is a sequence of independent random variables \(\bs{X} = (X_1, X_2, \ldots)\) where \(X_i\) is the outcome of the \(i\)th stage.
In particular, suppose that we have a basic experiment with outcome variable \(X\). By definition, the outcome of the experiment that consists of independent replications of the basic experiment is a sequence of independent random variables \(\bs{X} = (X_1, X_2, \ldots)\) each with the same probability distribution as \(X\). This is fundamental to the very concept of probability, as expressed in the law of large numbers. From a statistical point of view, suppose that we have a population of objects and a vector of measurements \(X\) of interest for the objects in the population. The sequence \(\bs{X}\) above corresponds to sampling from the distribution of \(X\); that is, \(X_i\) is the vector of measurements for the \(i\)th object drawn from the population. When we sample from a finite population, sampling with replacement generates independent random variables while sampling without replacement generates dependent random variables.
As noted at the beginning of our discussion, independence of events or random variables depends on the underlying probability measure. Thus, suppose that \(B\) is an event in a random experiment with positive probability. A collection of events or a collection of random variables is conditionally independent given \(B\) if the collection is independent relative to the conditional probability measure \(A \mapsto \P(A \mid B)\). For example, suppose that \( A_i \) is an event for each \( i \) in an index set \( I \). Then \( \{A_i: i \in I\} \) is conditionally independent given \( B \) if for every finite \( J \subseteq I \),
\[\P\left(\bigcap_{j \in J} A_j \biggm| B \right) = \prod_{j \in J} \P(A_j \mid B)\]Note that the definitions and theorems of this section would still be true, but with all probabilities conditioned on \(B\).
Conversely, conditional probability has a nice interpretation in terms of independent replications of the experiment. Thus, suppose that we start with a basic experiment that has sample space \(S\). We let \(X\) denote the outcome random variable, so that mathematically \(X\) is simply the identity function on \(S\). In particular, if \(A\) is an event then trivially, \(\P(X \in A) = \P(A)\).
Suppose now that we replicate the experiment independently. This results in a new, compound experiment with a sequence of independent random variables \((X_1, X_2, \ldots)\), each with the same distribution as \(X\). The new sample space is \( S^\infty \). Suppose now that \(A\) and \(B\) are events in the basic experiment (that is, subsets of \(S\)) with \(\P(B) \gt 0\).
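In the compound experiment, the event that "when \(B\) occurs for the first time, \(A\) also occurs" is
\[\bigcup_{n=1}^\infty \{X_1 \notin B, X_2 \notin B, \ldots, X_{n-1} \notin B, X_n \in A \cap B\}\]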
The probability of the event in the last exercise is
\[\frac{\P(A \cap B)}{\P(B)} = \P(A \mid B)\]The events in the union are disjoint. Also, since \( (X_1, X_2, \ldots) \) is a sequence of independent variables, each with the distribution of \( X \) we have
\[ \P(X_1 \notin B, X_2 \notin B, \ldots, X_{n-1} \notin B, X_n \in A \cap B) = [\P(B^c)]^{n-1} \P(A \cap B) = [1 - \P(B)]^{n-1} \P(A \cap B) \]Hence, using geometric series, the probability of the event in Exercise 13 is
\[ \sum_{n=1}^\infty [1 - \P(B)]^{n-1} \P(A \cap B) = \frac{\P(A \cap B)}{1 - [1 - \P(B)]} = \frac{\P(A \cap B)}{\P(B)} \]The result in the last exercise can be obtained directly. Specifically, suppose that we create a new experiment by repeating the basic experiment until \(B\) occurs for the first time, and then record the outcome of just the last repetition of the basic experiment. The appropriate probability measure on the new experiment is \(A \mapsto \P(A \mid B)\).
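A small simulation can make this interpretation concrete. The sketch below is an illustration under assumed choices (a fair die as the basic experiment, \(A\) the even scores, \(B = \{4, 5, 6\}\), and a fixed seed): it repeats the basic experiment until \(B\) occurs and records whether \(A\) occurred on that last repetition.

```python
import random

def first_outcome_in(B, rng):
    """Repeat a fair-die roll until the score lands in B; return that score."""
    while True:
        x = rng.randint(1, 6)
        if x in B:
            return x

rng = random.Random(2024)   # fixed seed so the run is reproducible
A = {2, 4, 6}               # assumed event: score is even
B = {4, 5, 6}               # assumed event: score is at least 4
runs = 20000

hits = sum(first_outcome_in(B, rng) in A for _ in range(runs))
estimate = hits / runs          # relative frequency of A at the first occurrence of B
exact = len(A & B) / len(B)     # P(A | B) under the uniform measure, here 2/3

print(estimate, exact)
```

The relative frequency should be close to \(\P(A \mid B)\), in line with the geometric series computation above.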
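Suppose that \(A\) and \(B\) are disjoint events in a basic experiment with \(\P(A) \gt 0\) and \(\P(B) \gt 0\). In the compound experiment obtained by replicating the basic experiment, the event that "\(A\) occurs before \(B\)" has probability
\[\frac{\P(A)}{\P(A) + \P(B)}\]Note that the event "\( A \) occurs before \( B \)" is the same as the event "when \( A \cup B \) occurs for the first time, \( A \) occurs". By the previous result, its probability is \(\P(A \mid A \cup B) = \P(A) / \P(A \cup B)\), and \(\P(A \cup B) = \P(A) + \P(B)\) since \(A\) and \(B\) are disjoint.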
Suppose that \(A\), \(B\), and \(C\) are independent events in an experiment with \(\P(A) = 0.3\), \(\P(B) = 0.5\), and \(\P(C) = 0.8\). Express each of the following events in set notation and find its probability:
Suppose that \(A\), \(B\), and \(C\) are independent events for an experiment with \(\P(A) = \frac{1}{2}\), \(\P(B) = \frac{1}{3}\), and \(\P(C) = \frac{1}{4}\). Find the probability of each of the following events:
A small company has 100 employees; 40 are men and 60 are women. There are 6 male executives. How many female executives should there be if gender and rank are independent? The underlying experiment is to choose an employee at random.
Suppose that a farm has four orchards that produce peaches, and that peaches are classified by size as small, medium, and large. The table below gives total number of peaches in a recent harvest by orchard and by size. Fill in the body of the table with counts for the various intersections, so that orchard and size are independent variables. The underlying experiment is to select a peach at random from the farm.
| orchard/size | small | medium | large | total |
|---|---|---|---|---|
| 1 | | | | 400 |
| 2 | | | | 600 |
| 3 | | | | 300 |
| 4 | | | | 700 |
| total | 400 | 1000 | 600 | 2000 |
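One way to carry out this kind of computation is sketched below. It reads the single number in each orchard row as that orchard's total (400, 600, 300, 700, which sum to the grand total 2000); that reading is an assumption about the table. Under independence, each cell count is the row total times the column total divided by the grand total.

```python
from fractions import Fraction

# Assumed reading of the table: one total per orchard, one total per size.
row_totals = {1: 400, 2: 600, 3: 300, 4: 700}
col_totals = {"small": 400, "medium": 1000, "large": 600}
N = 2000

# Independence: P(orchard, size) = P(orchard) P(size), so count = r * c / N.
table = {
    (orchard, size): Fraction(r * c, N)
    for orchard, r in row_totals.items()
    for size, c in col_totals.items()
}

for orchard in row_totals:
    print(orchard, [int(table[orchard, size]) for size in col_totals])
```

Every cell comes out as a whole number here, and each row and column of the filled table adds up to its stated total.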
Note from the last two exercises that you cannot see independence in a Venn diagram. Again, independence is a measure-theoretic concept, not a set-theoretic concept.
A Bernoulli trials sequence is a sequence \(\bs{X} = (X_1, X_2, \ldots)\) of independent, identically distributed indicator variables. Random variable \(X_i\) is the outcome of trial \(i\), where in the usual terminology of reliability theory, 1 denotes success and 0 denotes failure. The canonical example is the sequence of scores when a coin (not necessarily fair) is tossed repeatedly. Another basic example arises whenever we start with a basic experiment and an event \(A\) of interest, and then repeat the experiment. In this setting, \(X_i\) is the indicator variable for event \(A\) on the \(i\)th run of the experiment. The Bernoulli trials process is named for Jacob Bernoulli, and has a single basic parameter \(p = \P(X_i = 1)\). This random process is studied in detail in the chapter on Bernoulli trials.
For \((x_1, x_2, \ldots, x_n) \in \{0, 1\}^n\),
\[\P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) = p^{x_1 + x_2 + \cdots + x_n} (1 - p)^{n - (x_1 + x_2 + \cdots + x_n)} \]Note that the sequence of indicator random variables \(\bs{X}\) is exchangeable. That is, if the sequence \((x_1, x_2, \ldots, x_n)\) in Exercise 21 is permuted, the probability does not change. On the other hand, there are exchangeable sequences of indicator random variables that are dependent, as Pólya's urn model so dramatically illustrates.
Let \(Y\) denote the number of successes in the first \(n\) trials. Then
\[\P(Y = y) = \binom{n}{y} p^y (1 - p)^{n-y}, \quad y \in \{0, 1, \ldots, n\}\]The distribution of \(Y\) is called the binomial distribution with parameters \(n\) and \(p\). The binomial distribution is studied in more detail in the chapter on Bernoulli Trials.
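The binomial formula can be checked against the joint probability of the individual trials by summing over all bit strings with a given number of successes. The following sketch does this exactly; the values \(n = 5\) and \(p = \frac{1}{3}\) and the function names are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product
from math import comb

def joint_pmf(xs, p):
    """P(X_1 = x_1, ..., X_n = x_n) for Bernoulli trials with parameter p."""
    k = sum(xs)
    return p**k * (1 - p)**(len(xs) - k)

def binomial_pmf(y, n, p):
    """P(Y = y), where Y is the number of successes in n trials."""
    return comb(n, y) * p**y * (1 - p)**(n - y)

n, p = 5, Fraction(1, 3)   # assumed values for illustration
for y in range(n + 1):
    # Add the joint probabilities of all bit strings with exactly y ones.
    by_enumeration = sum(
        joint_pmf(xs, p) for xs in product((0, 1), repeat=n) if sum(xs) == y
    )
    assert by_enumeration == binomial_pmf(y, n, p)

print([binomial_pmf(y, n, p) for y in range(n + 1)])
```

The binomial coefficient simply counts the bit strings with \(y\) ones, each of which has the same probability by exchangeability.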
More generally, a multinomial trials sequence is a sequence \(\bs{X} = (X_1, X_2, \ldots)\) of independent, identically distributed random variables, each taking values in a finite set \(S\). The canonical example is the sequence of scores when a \(k\)-sided die (not necessarily fair) is thrown repeatedly. Multinomial trials are also studied in detail in the chapter on Bernoulli trials.
Consider the experiment that consists of dealing 2 cards from a standard deck and recording the sequence of cards dealt. For \(i \in \{1, 2\}\), let \(Q_i\) be the event that card \(i\) is a queen and \(H_i\) the event that card \(i\) is a heart. Compute the appropriate probabilities to verify the following results. Reflect on these results.
In the card experiment, set \(n = 2\). Run the simulation 500 times. For each pair of events in the previous exercise, compute the product of the empirical probabilities and the empirical probability of the intersection. Compare the results.
The following exercise gives three events that are pairwise independent, but not (mutually) independent.
Consider the dice experiment that consists of rolling 2 standard, fair dice and recording the sequence of scores. Let \(A\) denote the event that the first score is 3, \(B\) the event that the second score is 4, and \(C\) the event that the sum of the scores is 7. Then
In the dice experiment, set \(n = 2\). Run the experiment 500 times. For each pair of events in the previous exercise, compute the product of the empirical probabilities and the empirical probability of the intersection. Compare the results.
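Alongside the simulation, the three dice events can be checked exactly by enumerating the 36 equally likely outcomes. This is only a sketch; the helper `product_rule_holds` is a hypothetical name introduced for the illustration.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

S = list(product(range(1, 7), repeat=2))   # outcomes of two fair dice

def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for s in S if event(s)), len(S))

A = lambda s: s[0] == 3          # first score is 3
B = lambda s: s[1] == 4          # second score is 4
C = lambda s: s[0] + s[1] == 7   # sum of the scores is 7
events = [A, B, C]

def product_rule_holds(subset):
    """Check P(intersection) == product of probabilities for the subset."""
    joint = prob(lambda s: all(e(s) for e in subset))
    return joint == prod(prob(e) for e in subset)

pairwise = all(product_rule_holds(pair) for pair in combinations(events, 2))
triple = product_rule_holds(events)
print(pairwise, triple)
```

Each pair satisfies the product rule, but the triple does not: \(A \cap B\) forces the sum to be 7, so \(\P(A \cap B \cap C) = \frac{1}{36}\) rather than \(\frac{1}{216}\).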
The following exercise gives an example of three events with the property that the probability of the intersection is the product of the probabilities, but the events are not pairwise independent.
Suppose that we throw a standard, fair die one time. Let \(A = \{1, 2, 3, 4\}\), \(B = C = \{4, 5, 6\}\). Then
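The properties of these three events can be verified by direct computation. The sketch below uses exact fractions with the sets \(A\), \(B\), and \(C\) as defined in the exercise; the variable names are illustrative.

```python
from fractions import Fraction

A = {1, 2, 3, 4}
B = {4, 5, 6}
C = {4, 5, 6}

def prob(event):
    """Probability of an event for one throw of a fair six-sided die."""
    return Fraction(len(event), 6)

# The triple satisfies the product rule ...
triple_ok = prob(A & B & C) == prob(A) * prob(B) * prob(C)

# ... but none of the pairs does.
ab_ok = prob(A & B) == prob(A) * prob(B)
ac_ok = prob(A & C) == prob(A) * prob(C)
bc_ok = prob(B & C) == prob(B) * prob(C)

print(triple_ok, ab_ok, ac_ok, bc_ok)
```

This shows why the definition of independence requires the product rule for every finite subcollection, not just for the whole collection.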
Suppose that a standard, fair die is thrown 4 times. Find the probability of the following events.
Suppose that a pair of standard, fair dice are thrown 8 times. Find the probability of each of the following events.
Consider the dice experiment that consists of rolling \(n\) \(k\)-sided dice and recording the sequence of scores \(\bs{X} = (X_1, X_2, \ldots, X_n)\). The following conditions are equivalent (and correspond to the assumption that the dice are fair):
A pair of standard, fair dice are thrown repeatedly. Find the probability of each of the following events.
the hard way as \((4, 4)\).
Problems of the type in the last exercise are important in the game of craps. Craps is studied in more detail in the chapter on Games of Chance.
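For disjoint events \(A\) and \(B\), the event that \(A\) occurs before \(B\) in repeated trials is the event that \(A\) occurs when \(A \cup B\) first occurs, so its probability is \(\P(A \mid A \cup B) = \P(A) / [\P(A) + \P(B)]\). The sketch below applies this to sums of two fair dice; the function names and the choice of sums are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))   # outcomes of a pair of fair dice

def p_sum(s):
    """Probability that the two scores add up to s on a single throw."""
    return Fraction(sum(1 for a, b in rolls if a + b == s), len(rolls))

def before(s, t=7):
    """P(sum s occurs before sum t) = P(A) / (P(A) + P(B)) for disjoint A, B."""
    return p_sum(s) / (p_sum(s) + p_sum(t))

for s in (4, 5, 6):
    print(s, before(s))
```

Only the relative sizes of the two single-throw probabilities matter, because throws that produce neither sum are simply ignored.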
A biased coin with probability of heads \(\frac{1}{3}\) is tossed 5 times. Let \(\bs{X}\) denote the outcome of the tosses (encoded as a bit string) and let \(Y\) denote the number of heads. Find each of the following:
A box contains a fair coin and a two-headed coin. A coin is chosen at random from the box and tossed repeatedly. Let \(F\) denote the event that the fair coin is chosen, and let \(X_i\) denote the outcome of the \(i\)th toss (where 1 encodes heads and 0 encodes tails). Then
Consider again the box in the previous exercise, but we change the experiment as follows: a coin is chosen at random from the box and tossed and the result recorded. The coin is returned to the box and the process is repeated. As before, let \(X_i\) denote the outcome of toss \(i\). Then \((X_1, X_2, \ldots)\) is a Bernoulli trials sequence with parameter \(p = \frac{3}{4}\). Specifically,
Think carefully about the results in the previous two exercises, and the differences between the two models. Tossing a coin produces independent random variables if the probability of heads is fixed (that is, non-random even if unknown). Tossing a coin with a random probability of heads generally does not produce independent random variables; the result of a toss gives information about the probability of heads which in turn gives information about subsequent tosses.
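The difference between the two models shows up already in the first two tosses. The sketch below computes the relevant probabilities exactly, assuming (as in the first model) that tosses of a given coin are independent once the coin is known; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

half = Fraction(1, 2)   # each coin is chosen with probability 1/2
heads = {"fair": Fraction(1, 2), "two-headed": Fraction(1)}

# Model 1: one coin is chosen, then that same coin is tossed twice.
m1_first = sum(half * q for q in heads.values())      # P(X1 = 1)
m1_both = sum(half * q * q for q in heads.values())   # P(X1 = 1, X2 = 1)

# Model 2: a coin is chosen afresh (and returned) before each toss.
m2_first = m1_first
m2_both = sum(
    half * q1 * half * q2 for q1, q2 in product(heads.values(), repeat=2)
)

print(m1_first, m1_both, m1_first**2)   # joint probability differs from the product
print(m2_both == m2_first**2)           # product rule holds
```

In the first model a head on the first toss makes the two-headed coin more likely, which raises the chance of a head on the second toss; in the second model no such information carries over.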
Recall that Buffon's coin experiment consists of tossing a coin with radius \(r \le \frac{1}{2}\) randomly on a floor covered with square tiles of side length 1. The coordinates \((X, Y)\) of the center of the coin are recorded relative to axes through the center of the square in which the coin lands. The following conditions are equivalent:
In Buffon's coin experiment, set \(r = 0.3\). Run the simulation 500 times. For the events \(\{X \gt 0\}\) and \(\{Y \lt 0\}\), compute the product of the empirical probabilities and the empirical probability of the intersection. Compare the results.
The arrival time \(X\) of the \(A\) train is uniformly distributed on the interval \((0, 30)\), while the arrival time \(Y\) of the \(B\) train is uniformly distributed on the interval \((15, 30)\). (The arrival times are in minutes, after 8:00 AM). Moreover, the arrival times are independent. Find the probability of each of the following events:
Recall the simple model of structural reliability in which a system is composed of \(n\) components. Suppose in addition that the components operate independently of each other. As before, let \(X_i\) denote the state of component \(i\), where 1 means working and 0 means failure. Thus, our basic assumption is that the state vector \(\bs{X} = (X_1, X_2, \ldots, X_n)\) is a sequence of independent indicator random variables. We assume that the state of the system (either working or failed) depends only on the states of the components, according to a structure function. Thus, the state of the system is an indicator random variable
\[Y = Y(X_1, X_2, \ldots, X_n)\]Generally, the probability that a device is working is the reliability of the device. Thus, we will denote the reliability of component \(i\) by \(p_i = \P(X_i = 1)\) so that the vector of component reliabilities is \(\bs{p} = (p_1, p_2, \ldots, p_n)\). By independence, the system reliability \(r\) is a function of the component reliabilities:
\[r(p_1, p_2, \ldots, p_n) = \P(Y = 1)\]Appropriately enough, this function is known as the reliability function. Our challenge is usually to find the reliability function, given the structure function. When the components all have the same probability \(p\) then of course the system reliability \(r\) is just a function of \(p\). In this case, the state vector \(\bs{X} = (X_1, X_2, \ldots, X_n)\) forms a sequence of Bernoulli trials.
Comment on the independence assumption for real systems, such as your car or your computer.
Recall that a series system is working if and only if each component is working.
Recall that a parallel system is working if and only if at least one component is working.
Recall that a \(k\) out of \(n\) system is working if and only if at least \(k\) of the \(n\) components are working. Thus, a parallel system is a 1 out of \(n\) system and a series system is an \(n\) out of \(n\) system. A \(k\) out of \(2 k - 1\) system is a majority rules system. The reliability function of a general \(k\) out of \(n\) system is a mess. However, if the component reliabilities are the same, the function has a reasonably simple form.
For a \(k\) out of \(n\) system with common component reliability \(p\), the system reliability is
\[r(p) = \sum_{i = k}^n \binom{n}{i} p^i (1 - p)^{n - i}\]Consider a system of 3 independent components with common reliability \(p = 0.8\). Find the reliability of each of the following:
Consider a system of 3 independent components with reliabilities \(p_1 = 0.8\), \(p_2 = 0.8\), \(p_3 = 0.7\). Find the reliability of each of the following:
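For small systems, the reliability function can be evaluated by brute force: sum the probability of every state vector on which the structure function equals 1. The sketch below does this for series, parallel, and \(k\) out of \(n\) structures; the function names are hypothetical, and the choice of these three structures is an assumption about which systems are of interest.

```python
from itertools import product

def reliability(structure, p):
    """P(Y = 1), summing over all 2^n state vectors of independent components."""
    total = 0.0
    for x in product((0, 1), repeat=len(p)):
        weight = 1.0
        for xi, pi in zip(x, p):
            weight *= pi if xi == 1 else 1 - pi   # independence of components
        total += structure(x) * weight
    return total

def series(x):
    return int(all(x))       # works iff every component works

def parallel(x):
    return int(any(x))       # works iff at least one component works

def k_out_of_n(k):
    return lambda x: int(sum(x) >= k)

p = (0.8, 0.8, 0.7)
print(reliability(series, p), reliability(parallel, p), reliability(k_out_of_n(2), p))
```

The enumeration grows as \(2^n\), so this approach is only practical for small \(n\), but it works for any structure function, including ones with no simple closed form.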
Consider an airplane with an odd number of engines, each with reliability \(p\). Suppose that the airplane is a majority rules system, so that the airplane needs a majority of working engines in order to fly.
The graph below is known as the Wheatstone bridge network and is named for Charles Wheatstone. The edges represent components, and the system works if and only if there is a working path from vertex \(a\) to vertex \(b\).
A system consists of 3 components, connected in parallel. Because of environmental factors, the components do not operate independently, so our usual assumption does not hold. However, we will assume that under low stress conditions, the components are independent, each with reliability 0.9; under medium stress conditions, the components are independent with reliability 0.8; and under high stress conditions, the components are independent, each with reliability 0.7. The probability of low stress is 0.5, of medium stress is 0.3, and of high stress is 0.2.
Recall the discussion of diagnostic testing in the section on Conditional Probability. Thus, we have an event \(A\) for a random experiment whose occurrence or non-occurrence we cannot observe directly. Suppose now that we have \(n\) tests for the occurrence of \(A\), labeled from 1 to \(n\). We will let \(T_i\) denote the event that test \(i\) is positive for \(A\). The tests are independent in the following sense:
Note that unconditionally, it is not reasonable to assume that the tests are independent. For example, a positive result for a given test presumably is evidence that the condition \(A\) has occurred, which in turn is evidence that a subsequent test will be positive. In short, we expect that \(T_i\) and \(T_j\) should be positively correlated.
We can form a new, compound test by giving a decision rule in terms of the individual test results. In other words, the event \(T\) that the compound test is positive for \(A\) is a function of \((T_1, T_2, \ldots, T_n)\). The typical decision rules are very similar to the reliability structures discussed above. A special case of interest is when the \(n\) tests are independent applications of a given basic test. In this case, \(a_i = a\) and \(b_i = b\) for each \(i\).
Consider the compound test that is positive for \(A\) if and only if each of the \(n\) tests is positive for \(A\).
Consider the compound test that is positive for \(A\) if and only if at least one of the \(n\) tests is positive for \(A\).
More generally, we could define the compound \(k\) out of \(n\) test that is positive for \(A\) if and only if at least \(k\) of the individual tests are positive for \(A\). The test in Exercise 47 is the \(n\) out of \(n\) test, while the test in Exercise 48 is the 1 out of \(n\) test. The \(k\) out of \(2 k - 1\) test is the majority rules test.
Suppose that a woman initially believes that there is an even chance that she is pregnant or not pregnant. She buys three identical pregnancy tests with sensitivity 0.95 and specificity 0.90. Tests 1 and 3 are positive and test 2 is negative.
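Computations of this kind combine Bayes' theorem with the product rule for the test results. The sketch below assumes that the test results are conditionally independent given \(A\) and also given \(A^c\); the function name `posterior` and its parameters are hypothetical.

```python
def posterior(prior, sensitivity, specificity, results):
    """P(A | test results), assuming the tests are conditionally independent
    given A and given A^c. `results` is a list of booleans (True = positive)."""
    like_a = 1.0       # probability of the observed results given A
    like_not_a = 1.0   # probability of the observed results given A^c
    for positive in results:
        like_a *= sensitivity if positive else 1 - sensitivity
        like_not_a *= (1 - specificity) if positive else specificity
    numerator = prior * like_a
    return numerator / (numerator + (1 - prior) * like_not_a)

# Values from the exercise: prior 0.5, sensitivity 0.95, specificity 0.90,
# with tests 1 and 3 positive and test 2 negative.
post = posterior(0.5, 0.95, 0.90, [True, False, True])
print(round(post, 4))
```

Because of the conditional independence assumption, only the number of positive and negative results matters, not their order.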
Suppose that 3 independent, identical tests for an event \(A\) are applied, each with sensitivity \(a\) and specificity \(b\). Find the sensitivity and specificity of the following tests:
In a criminal trial, the defendant is convicted if and only if all 6 jurors vote guilty. Assume that if the defendant really is guilty, the jurors vote guilty, independently, with probability 0.95, while if the defendant is really innocent, the jurors vote not guilty, independently with probability 0.8. Suppose that 70% of defendants brought to trial are guilty.
Recall our discussions of genetics in the sections on Probability Measure and Conditional Probability. For a given genetic trait (such as eye color or the presence of a disorder), it's usually reasonable to assume that the genotypes of the children are conditionally independent, given the genotypes of the parents. Unconditionally, however, the state of a child (for the given trait) gives information about the states of the parents, which in turn give information about the states of other children.
In the following exercise, suppose that a certain type of pea plant has either green pods or yellow pods, and that the green-pod gene is dominant. Thus, a plant with genotype \(gg\) or \(gy\) has green pods, while a plant with genotype \(yy\) has yellow pods.
Suppose that 2 green-pod plants are bred together. Suppose further that each plant, independently, has the recessive yellow-pod gene with probability \(\frac{1}{4}\).
In the following exercise, consider a sex-linked hereditary disorder associated with a gene on the X chromosome. As before, let \(h\) denote the dominant healthy gene and \(d\) the recessive defective gene. Thus, a woman of genotype \(hh\) is normal; a woman of genotype \(hd\) is free of the disease, but is a carrier; and a woman of genotype \(dd\) has the disease. A man of genotype \(h\) is normal and a man of genotype \(d\) has the disease.
Suppose that a healthy woman initially has a \(\frac{1}{2}\) chance of being a carrier. (From our discussion above, this would be the case, for example, if her mother and father are healthy but she has a brother with the disorder, so that her mother must be a carrier).
Suppose that we have \(m + 1\) coins, labeled \(0, 1, \ldots, m\). Coin \(i\) lands heads with probability \(\frac{i}{m}\) for each \(i\). In particular, note that coin 0 is two-tailed and coin \(m\) is two-headed. Our experiment is to choose a coin at random (so that each coin is equally likely to be chosen) and then toss the chosen coin repeatedly.
The probability that the first \(n\) tosses are all heads is
\[p_{m,n} = \frac{1}{m+1} \sum_{i=0}^m \left(\frac{i}{m}\right)^n\]The conditional probability that toss \(n + 1\) is heads given that the previous \(n\) tosses were all heads is
\[\frac{p_{m,n+1}}{p_{m,n}}\]The probability \(p_{m,n}\) is an approximating sum for the integral \(\int_0^1 x^n dx\) and hence
\[p_{m,n} \to \frac{1}{n+1} \text{ as } m \to \infty\]The limiting conditional probability is
\[\frac{p_{m,n+1}}{p_{m,n}} \to \frac{n+1}{n+2} \text{ as } m \to \infty\]The limiting conditional probability in the last exercise is called Laplace's Rule of Succession, named after Pierre-Simon Laplace. This rule was used by Laplace and others as a general principle for estimating the conditional probability that an event will occur at time \(n + 1\), given that the event has occurred \(n\) times in succession.
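The convergence can be observed numerically. The sketch below evaluates \(p_{m,n}\) and the conditional probability exactly with rational arithmetic; the function names and the particular values of \(m\) and \(n\) are assumptions chosen for illustration.

```python
from fractions import Fraction

def p(m, n):
    """Probability that the first n tosses are all heads, with m + 1 coins."""
    return Fraction(1, m + 1) * sum(Fraction(i, m) ** n for i in range(m + 1))

def next_heads(m, n):
    """P(toss n + 1 is heads | first n tosses were heads) = p(m, n+1) / p(m, n)."""
    return p(m, n + 1) / p(m, n)

n = 10
for m in (1, 10, 100, 1000):
    print(m, float(p(m, n)), float(next_heads(m, n)))

print(1 / (n + 1), (n + 1) / (n + 2))   # the limiting values
```

With \(m = 1\) the only coin that can produce a head is the two-headed one, so the conditional probability is exactly 1; as \(m\) grows the values settle toward \(\frac{1}{n+1}\) and \(\frac{n+1}{n+2}\).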
Suppose that a missile has had 10 successful tests in a row. Compute Laplace's estimate that the 11th test will be successful. Does this make sense?