Suppose that we have a population $D$ of $m$ objects. The population could be a deck of cards, a set of people, an urn full of balls, or any number of other collections. In many cases, we simply label the objects from 1 to $m$, so that $D = \{1, 2, \ldots, m\}$. In other cases (such as the card experiment), it may be more natural to label the objects with vectors. In any case, $D$ is usually a subset of $\mathbb{R}^k$ for some $k$.
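Our basic experiment consists of selecting $n$ objects from the population $D$ at random and recording the sequence of objects chosen:

$$\boldsymbol{X} = (X_1, X_2, \ldots, X_n)$$

where $X_i$ is the $i$th object chosen. If the sampling is with replacement, the sample size $n$ can be any positive integer. In this case, the sample space is

$$S = D^n = \{(x_1, x_2, \ldots, x_n) : x_i \in D \text{ for each } i\}$$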
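If the sampling is without replacement, the sample size $n$ can be no larger than the population size $m$. In this case, the sample space $S$ consists of all permutations of size $n$ chosen from $D$:

$$S = D_n = \{(x_1, x_2, \ldots, x_n) : x_i \in D \text{ for each } i \text{ and } x_i \ne x_j \text{ for } i \ne j\}$$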
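Show that the sample space has the following number of elements:

1. $\#(S) = m^n$ if the sampling is with replacement.
2. $\#(S) = m^{(n)} = m (m - 1) \cdots (m - n + 1)$ if the sampling is without replacement.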
With either type of sampling, we assume that the samples are equally likely and thus that the outcome variable $\boldsymbol{X}$ is uniformly distributed on $S$; this is the meaning of the phrase random sample:

$$P(\boldsymbol{X} \in A) = \frac{\#(A)}{\#(S)}, \quad A \subseteq S$$
Suppose again that we select $n$ objects at random from the population $D$, either with or without replacement.
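Show that any permutation of $\boldsymbol{X}$ has the same distribution as $\boldsymbol{X}$ itself, namely the uniform distribution on the appropriate sample space $S$:

1. $S = D^n$ if the sampling is with replacement.
2. $S = D_n$ if the sampling is without replacement.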
A sequence of random variables with the property in the last exercise is said to be exchangeable. Although this property is very simple to understand, both intuitively and mathematically, it is nonetheless very important. We will use the exchangeable property often in this chapter.
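More generally, show that any sequence of $k$ of the $n$ outcome variables is uniformly distributed on the appropriate sample space:

1. $D^k$ if the sampling is with replacement.
2. $D_k$ if the sampling is without replacement.

In particular, for either sampling method, $X_i$ is uniformly distributed on $D$ for each $i$.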
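The claim about the marginal distributions can be checked by brute force for a small population. The following is a minimal sketch, not part of the original text; the function name `marginal_distribution` and the choice of exact fractions are illustrative assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations, product


def marginal_distribution(m, n, i, replace):
    """Exact distribution of the i-th draw (0-based) when n objects are
    drawn from {1, ..., m}, with all ordered samples equally likely."""
    population = range(1, m + 1)
    if replace:
        samples = list(product(population, repeat=n))   # all of D^n
    else:
        samples = list(permutations(population, n))     # all permutations of size n
    counts = Counter(sample[i] for sample in samples)
    return {x: Fraction(c, len(samples)) for x, c in counts.items()}


# Every coordinate has the same (uniform) marginal, with or without replacement.
for replace in (True, False):
    for i in range(3):
        print(replace, i, marginal_distribution(5, 3, i, replace))
```

Enumeration is only feasible for small $m$ and $n$, but it makes the symmetry behind exchangeability concrete: no draw position is favored.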
Show that if the sampling is with replacement, $\boldsymbol{X} = (X_1, X_2, \ldots, X_n)$ is a sequence of independent random variables.
Thus, when the sampling is with replacement, the sample variables form a random sample from the uniform distribution, in the technical sense.
Show that if the sampling is without replacement, then the conditional distribution of a sequence of $k$ of the outcome variables, given the values of a sequence of $j$ other outcome variables, is the uniform distribution on the set of permutations of size $k$ chosen from the population when the $j$ known values are removed (of course, $j + k \le n$).
In particular, $X_i$ and $X_j$ are dependent for any distinct $i$ and $j$ when the sampling is without replacement.
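In many cases, particularly when the sampling is without replacement, the order in which the objects are chosen is not important; all that matters is the (unordered) set of objects:

$$\boldsymbol{W} = \{X_1, X_2, \ldots, X_n\}$$

Suppose first that the sampling is without replacement. In this case, $\boldsymbol{W}$ takes values in the set $T$ of combinations of size $n$ chosen from $D$:

$$T = \{\{x_1, x_2, \ldots, x_n\} : x_i \in D \text{ for each } i \text{ and } x_i \ne x_j \text{ for } i \ne j\}$$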
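Show that $\#(T) = \binom{m}{n}$.
Show that $\boldsymbol{W}$ is uniformly distributed over $T$:

$$P(\boldsymbol{W} \in B) = \frac{\#(B)}{\#(T)} = \frac{\#(B)}{\binom{m}{n}}, \quad B \subseteq T$$

Hint: For any combination of size $n$ from $D$, there are $n!$ permutations of size $n$.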
If the sampling is with replacement, $\boldsymbol{W}$ takes values in the collection $T$ of subsets of $D$, of size from 1 to $n$:

$$T = \{\{x_1, x_2, \ldots, x_n\} : x_i \in D \text{ for each } i\}$$
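Show that $\#(T) = \binom{m + n - 1}{n}$ when unordered samples are counted with repetition (that is, as multisets of size $n$).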
Show that $\boldsymbol{W}$ is not uniformly distributed on $T$.
The following table summarizes the formulas for the number of samples of size $n$ chosen from a population of $m$ elements, based on the criteria of order and replacement.
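| Number of samples | With Order | Without Order |
|---|---|---|
| With Replacement | $m^n$ | $\binom{m + n - 1}{n}$ |
| Without Replacement | $m^{(n)} = m (m - 1) \cdots (m - n + 1)$ | $\binom{m}{n}$ |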
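The four counts in the table can be checked by enumeration for small cases. The sketch below is illustrative only; it assumes the standard counting formulas ($m^n$, the falling power $m^{(n)}$, $\binom{m+n-1}{n}$, and $\binom{m}{n}$), and the function names are hypothetical.

```python
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)
from math import comb, perm


def count_samples(m, n):
    """Count samples of size n from {1, ..., m} by brute-force enumeration."""
    pop = range(1, m + 1)
    return {
        ("ordered", "with"): sum(1 for _ in product(pop, repeat=n)),
        ("ordered", "without"): sum(1 for _ in permutations(pop, n)),
        ("unordered", "with"): sum(1 for _ in combinations_with_replacement(pop, n)),
        ("unordered", "without"): sum(1 for _ in combinations(pop, n)),
    }


def formula_counts(m, n):
    """The same four counts from the closed-form formulas."""
    return {
        ("ordered", "with"): m ** n,
        ("ordered", "without"): perm(m, n),       # m (m - 1) ... (m - n + 1)
        ("unordered", "with"): comb(m + n - 1, n),
        ("unordered", "without"): comb(m, n),
    }


print(count_samples(5, 3) == formula_counts(5, 3))
```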
Suppose that a sample of size 2 is chosen from the population . Explicitly list all samples in the following cases:
A dichotomous population consists of two types of objects.
Suppose that a batch of 100 components includes 10 that are defective. A random sample of 5 components is selected without replacement. Compute the probability that the sample contains at least one defective component.
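One way to approach this exercise is through the complement: the sample contains no defective component exactly when all 5 are drawn from the 90 good ones. The following is a minimal sketch under that reasoning; the function name is a hypothetical helper, not something from the text.

```python
from math import comb


def prob_at_least_one_defective(total, defective, sample_size):
    """P(at least one defective) when sampling without replacement,
    with all unordered samples equally likely."""
    none_defective = comb(total - defective, sample_size) / comb(total, sample_size)
    return 1 - none_defective


p = prob_at_least_one_defective(total=100, defective=10, sample_size=5)
print(f"{p:.4f}")
```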
An urn contains 50 balls, 30 red and 20 green. A sample of 15 balls is chosen at random. Find the probability that the sample contains exactly 10 red balls in each of the following cases:

1. The sampling is without replacement.
2. The sampling is with replacement.
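The two probabilities can be computed directly, assuming the two cases are sampling without replacement and sampling with replacement. This is a sketch with hypothetical function names: without replacement, count the favorable unordered samples; with replacement, the draws are independent and each is red with probability 30/50.

```python
from math import comb


def prob_red_without_replacement(red, green, n, k):
    """P(exactly k red) in n draws without replacement."""
    return comb(red, k) * comb(green, n - k) / comb(red + green, n)


def prob_red_with_replacement(red, green, n, k):
    """P(exactly k red) in n independent draws with replacement."""
    p = red / (red + green)
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


print(prob_red_without_replacement(30, 20, 15, 10))
print(prob_red_with_replacement(30, 20, 15, 10))
```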
In the ball and urn experiment, set the population to 50 balls, 30 of them red, and set the sample size to 15. Run the experiment 100 times, updating after each run. Compute the relative frequency of the event that the sample has 10 red balls in each of the following cases, and compare with the respective probability in the previous exercise:

1. The sampling is without replacement.
2. The sampling is with replacement.
Suppose that a club has 100 members, 40 men and 60 women. A committee of 10 members is selected at random (and without replacement, of course).
Suppose that a small pond contains 500 fish, 50 of them tagged. A fisherman catches 10 fish. Find the probability that the catch contains at least 2 tagged fish.
The basic distribution that arises from sampling without replacement from a dichotomous population is studied in the section on the hypergeometric distribution. More generally, a multi-type population consists of objects of $k$ different types.
Suppose that a legislative body consists of 60 Republicans, 40 Democrats, and 20 independents. A committee of 10 members is chosen at random. Find the probability that at least one party is not represented on the committee. Hint: Use the inclusion-exclusion law.
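The hint can be turned into a short computation. The sketch below assumes the committee is chosen without replacement with all unordered committees equally likely; the function name `prob_some_type_missing` is hypothetical. For each set of excluded groups, the committee must come entirely from the remaining members, and the terms alternate in sign.

```python
from itertools import combinations
from math import comb


def prob_some_type_missing(type_counts, sample_size):
    """P(at least one type is absent from the sample), by inclusion-exclusion."""
    total = sum(type_counts)
    favorable = 0
    for r in range(1, len(type_counts) + 1):
        sign = (-1) ** (r + 1)
        for excluded in combinations(type_counts, r):
            remaining = total - sum(excluded)
            # comb returns 0 when fewer than sample_size members remain
            favorable += sign * comb(remaining, sample_size)
    return favorable / comb(total, sample_size)


print(prob_some_type_missing((60, 40, 20), 10))
```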
The basic distribution that arises from sampling without replacement from a multi-type population is studied in the section on the multivariate hypergeometric distribution.
Recall that a standard card deck can be modeled by the product set

$$D = \{1, 2, \ldots, 13\} \times \{0, 1, 2, 3\}$$

where the first coordinate encodes the denomination or kind (ace, 2-10, jack, queen, king) and where the second coordinate encodes the suit (clubs, diamonds, hearts, spades). The general card experiment consists of drawing $n$ cards at random and without replacement from the deck $D$. Thus, the $i$th card is $X_i = (Y_i, Z_i)$ where $Y_i$ is the denomination and $Z_i$ is the suit. The special case $n = 5$ is the poker experiment and the special case $n = 13$ is the bridge experiment. Note that with respect to the denominations or with respect to the suits, a deck of cards is a multi-type population as discussed above.
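In the card experiment with $n = 5$ cards (poker), show that there are

1. $52^{(5)} = 311{,}875{,}200$ ordered hands
2. $\binom{52}{5} = 2{,}598{,}960$ unordered hands

In the card experiment with $n = 13$ cards (bridge), show that there are

1. $52^{(13)} = 52 \cdot 51 \cdots 40$ ordered hands
2. $\binom{52}{13} = 635{,}013{,}559{,}600$ unordered hands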
In the card experiment, set . Run the simulation 5 times and on each run, list all of the (ordered) sequences of cards that would give the same unordered hand as the one you observed.
In the card experiment, show that
In the card experiment, show that the denomination $Y_i$ of the $i$th card and the suit $Z_j$ of the $j$th card are independent for any $i$ and $j$.
In the card experiment, show that and are dependent. Compare this result with the previous exercise.
Suppose that a sequence of 5 cards is dealt.
Run the card experiment 500 times, updating after each run. Compute the relative frequency corresponding to each probability in the previous exercise.
Find the probability that a bridge hand will contain no honor cards, that is, no cards of denomination 10, jack, queen, king, or ace. Such a hand is called a Yarborough, in honor of the second Earl of Yarborough.
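Since there are 5 honor denominations in each of 4 suits, 20 of the 52 cards are honor cards, and a Yarborough must be drawn entirely from the other 32. A minimal sketch of the resulting computation follows; the function name and default arguments are illustrative assumptions.

```python
from math import comb


def prob_yarborough(hand_size=13, deck_size=52, honor_cards=20):
    """P(no honor cards) for a hand dealt without replacement."""
    return comb(deck_size - honor_cards, hand_size) / comb(deck_size, hand_size)


print(prob_yarborough())
```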
Rolling $n$ fair, six-sided dice is equivalent to choosing a random sample of size $n$ with replacement from the population $\{1, 2, 3, 4, 5, 6\}$. Generally, selecting a random sample of size $n$ with replacement from $D = \{1, 2, \ldots, m\}$ is equivalent to rolling $n$ fair, $m$-sided dice.
In the game of poker dice, 5 standard, fair dice are thrown.
Run the poker dice experiment 500 times, updating after each run. Compute the relative frequency of each event in the previous exercise and compare with the corresponding probability.
The game of poker dice is treated in more detail in the chapter on Games of Chance.
Suppose that we select $n$ persons at random and record their birthdays. If we assume that birthdays are uniformly distributed throughout the year, and if we ignore leap years, then this experiment is equivalent to selecting a sample of size $n$ with replacement from $D = \{1, 2, \ldots, 365\}$. Similarly, we could record birth months or birth weeks.
Suppose that a mathematics class has 30 students.
In the birthday experiment, set $m = 365$ and $n = 30$. Run the experiment 100 times, updating after each run. Compute the relative frequency of each event in the previous exercise and compare to the corresponding probability.
The birthday problem is treated in more detail later in this chapter.
Suppose that we distribute $n$ distinct balls into $m$ distinct cells at random. This experiment also fits the basic model, where $D$ is the population of cells and $X_i$ is the cell containing the $i$th ball. Sampling with replacement means that a cell may contain more than one ball; sampling without replacement means that a cell may contain at most one ball.
Suppose that 5 balls are distributed into 10 cells (with no restrictions).
Suppose that when we purchase a certain product (bubble gum or cereal, for example), we receive a coupon (a baseball card or small toy, for example), which is equally likely to be any one of $m$ types. We can think of this experiment as sampling with replacement from the population of coupon types; $X_i$ is the coupon that we receive on the $i$th purchase.
Suppose that a kid's meal at a fast food restaurant comes with a toy. The toy is equally likely to be any of 5 types. Suppose that a mom buys a kid's meal for each of her 3 kids.
The coupon collector problem is studied in more detail later in this chapter.
Suppose that a person has $n$ keys, only one of which opens a certain door. The person tries the keys at random. We will let $N$ denote the trial number when the person finds the correct key.
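Suppose that unsuccessful keys are discarded (the rational thing to do, of course). Show that $N$ is uniformly distributed on $\{1, 2, \ldots, n\}$:

$$P(N = i) = \frac{1}{n}, \quad i \in \{1, 2, \ldots, n\}$$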
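Suppose that unsuccessful keys are not discarded (perhaps the person has had a bit too much to drink). Show that $N$ has a geometric distribution:

$$P(N = i) = \left(\frac{n - 1}{n}\right)^{i - 1} \frac{1}{n}, \quad i \in \{1, 2, \ldots\}$$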
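The two versions of the key problem can be compared by simulation. The sketch below is illustrative only: it assumes that discarding keys amounts to trying them in a uniformly random order, and that not discarding them amounts to independent uniform picks. All function names are hypothetical.

```python
import random


def trial_of_success(n_keys, discard, rng):
    """Simulate the trial number on which the correct key is found."""
    correct = rng.randrange(n_keys)
    if discard:
        # Discarding failures is the same as trying keys in a random order.
        order = list(range(n_keys))
        rng.shuffle(order)
        return order.index(correct) + 1
    # Otherwise each trial is an independent uniform pick among all keys.
    trial = 1
    while rng.randrange(n_keys) != correct:
        trial += 1
    return trial


def pmf_discard(n_keys, i):
    """Uniform distribution on {1, ..., n_keys}."""
    return 1 / n_keys if 1 <= i <= n_keys else 0.0


def pmf_no_discard(n_keys, i):
    """Geometric distribution with success probability 1 / n_keys."""
    return ((n_keys - 1) / n_keys) ** (i - 1) / n_keys


rng = random.Random(2024)  # fixed seed so that runs are repeatable
runs = 10_000
with_discard = [trial_of_success(5, True, rng) for _ in range(runs)]
print(with_discard.count(3) / runs, pmf_discard(5, 3))
```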
It's very easy to simulate a random sample of size $n$, with replacement, from $D = \{1, 2, \ldots, m\}$. Recall that the ceiling function $\lceil x \rceil$ gives the smallest integer that is at least as large as $x$.
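Let $\boldsymbol{U} = (U_1, U_2, \ldots, U_n)$ be a sequence of $n$ random numbers. Recall that these are independent random variables, each uniformly distributed on the interval $(0, 1)$. Show that $X_i = \lceil m U_i \rceil$ for $i \in \{1, 2, \ldots, n\}$ simulates a random sample, with replacement, from $D = \{1, 2, \ldots, m\}$.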
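The ceiling construction translates almost directly into code. In this minimal sketch, `sample_with_replacement` is a hypothetical name. Python's `random.random()` returns values in $[0, 1)$, so the code uses `1 - random.random()`, which lies in $(0, 1]$ and keeps the ceiling inside $\{1, 2, \ldots, m\}$; that adjustment is an implementation assumption, not part of the exercise.

```python
import math
import random


def sample_with_replacement(m, n, rng=random):
    """Simulate n draws with replacement from {1, ..., m} using ceil(m * U)."""
    # 1 - rng.random() lies in (0, 1], so the ceiling is never 0.
    return [math.ceil(m * (1.0 - rng.random())) for _ in range(n)]


# Example: ten rolls of a fair six-sided die.
print(sample_with_replacement(6, 10))
```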
It's a bit harder to simulate a random sample of size $n$, without replacement, since we need to remove each sample value before the next draw.
Show that the following algorithm generates a random sample of size $n$, without replacement, from $D = \{1, 2, \ldots, m\}$.
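One standard way to remove each sample value before the next draw is a partial Fisher-Yates shuffle. The sketch below is offered as an assumption about the kind of algorithm intended, not as the text's own; the function name is hypothetical. At each step a position is chosen uniformly among the values still available, and the chosen slot is then overwritten with the last available value.

```python
import random


def sample_without_replacement(m, n, rng=random):
    """Draw an ordered sample of size n, without replacement, from {1, ..., m}."""
    if not 0 <= n <= m:
        raise ValueError("sample size must be between 0 and the population size")
    pool = list(range(1, m + 1))
    sample = []
    for i in range(n):
        remaining = m - i                # number of values still available
        j = rng.randrange(remaining)     # uniform index among the available values
        sample.append(pool[j])
        pool[j] = pool[remaining - 1]    # fill the gap with the last available value
    return sample


# Example: deal a 5-card hand from a deck labeled 1 through 52.
print(sample_without_replacement(52, 5))
```

Each draw is uniform on the values not yet chosen, which is exactly the conditional uniformity property of sampling without replacement described earlier in the section.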