As usual, our starting point is a random experiment with an underlying sample space and a probability measure $\mathbb{P}$. In the basic statistical model, we have an observable random variable $\boldsymbol{X}$ taking values in a set $S$. In general, $\boldsymbol{X}$ can have quite a complicated structure. For example, if the experiment is to sample $n$ objects from a population and record various measurements of interest, then
$$\boldsymbol{X} = (X_1, X_2, \ldots, X_n)$$
where $X_i$ is the vector of measurements for the $i$th object. The most important special case occurs when $X_1, X_2, \ldots, X_n$ are independent and identically distributed. In this case, we have a random sample of size $n$ from the common distribution.
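Suppose also that the distribution of $\boldsymbol{X}$ depends on a parameter $\theta$ taking values in a parameter space $\Theta$. The parameter may also be vector valued, in which case $\Theta \subseteq \mathbb{R}^k$ for some $k \in \mathbb{N}_+$ and the parameter has the form
$$\theta = (\theta_1, \theta_2, \ldots, \theta_k)$$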
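Recall that in Bayesian analysis, the unknown parameter $\theta$ is treated as a random variable. Specifically, suppose that the conditional probability density function of the data vector $\boldsymbol{X}$ given $\theta$ is denoted $f(\boldsymbol{x} \mid \theta)$. Moreover, the parameter $\theta$ is given a prior distribution with probability density function $h$. (The prior distribution is chosen to reflect our knowledge, if any, of the parameter.) The joint probability density function of the data vector and the parameter is
$$(\boldsymbol{x}, \theta) \mapsto h(\theta) f(\boldsymbol{x} \mid \theta), \quad \boldsymbol{x} \in S, \; \theta \in \Theta$$
Next, the (unconditional) probability density function of $\boldsymbol{X}$ is the function $g$ given by
$$g(\boldsymbol{x}) = \sum_{\theta \in \Theta} h(\theta) f(\boldsymbol{x} \mid \theta), \quad \boldsymbol{x} \in S$$
if the parameter has a discrete distribution, or by
$$g(\boldsymbol{x}) = \int_\Theta h(\theta) f(\boldsymbol{x} \mid \theta) \, d\theta, \quad \boldsymbol{x} \in S$$
if the parameter has a continuous distribution. Finally, by Bayes' theorem, the posterior probability density function of $\theta$ given $\boldsymbol{x} \in S$ is
$$h(\theta \mid \boldsymbol{x}) = \frac{h(\theta) f(\boldsymbol{x} \mid \theta)}{g(\boldsymbol{x})}, \quad \theta \in \Theta$$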
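The passage from prior to posterior can be illustrated numerically when the parameter has a discrete distribution. The following is a minimal sketch, not part of the original text: the function names `posterior` and `binomial_pmf`, the three candidate parameter values, and the data (7 successes in 10 trials) are all assumptions chosen for illustration.

```python
from math import comb

def posterior(prior, likelihood, x):
    """Return the posterior pmf {theta: h(theta | x)} for a discrete prior.

    prior      -- dict mapping each parameter value theta to h(theta)
    likelihood -- function (x, theta) -> f(x | theta)
    """
    # Joint density h(theta) * f(x | theta), evaluated at the observed x.
    joint = {theta: h * likelihood(x, theta) for theta, h in prior.items()}
    # Unconditional density g(x): sum the joint density over theta.
    g = sum(joint.values())
    # Bayes' theorem: divide the joint density by g(x).
    return {theta: j / g for theta, j in joint.items()}

# Hypothetical example: the success probability is one of three values,
# each with prior probability 1/3; we observe y = 7 successes in n = 10 trials.
prior = {0.3: 1 / 3, 0.5: 1 / 3, 0.7: 1 / 3}

def binomial_pmf(y, p, n=10):
    """Binomial probability of y successes in n trials with success probability p."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)

post = posterior(prior, binomial_pmf, 7)
for theta, prob in sorted(post.items()):
    print(f"h({theta} | y=7) = {prob:.4f}")
```

The posterior probabilities sum to 1 by construction, and the parameter value closest to the observed proportion of successes receives the largest posterior weight.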
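Now let $A(\boldsymbol{X})$ be a confidence set (that is, a subset of the parameter space that depends on the data variable $\boldsymbol{X}$ but on no unknown parameters). One possible definition of a $1 - \alpha$ level Bayesian confidence set requires that
$$\mathbb{P}[\theta \in A(\boldsymbol{x}) \mid \boldsymbol{X} = \boldsymbol{x}] = 1 - \alpha$$
In this definition, only $\theta$ is random and thus the probability above is computed using the posterior probability density function $\theta \mapsto h(\theta \mid \boldsymbol{x})$. Another possible definition requires that
$$\mathbb{P}[\theta \in A(\boldsymbol{X})] = 1 - \alpha$$
In this definition, $\boldsymbol{X}$ and $\theta$ are both random, and so the probability above would be computed using the joint probability density function $(\boldsymbol{x}, \theta) \mapsto h(\theta) f(\boldsymbol{x} \mid \theta)$. Whatever the philosophical arguments may be, the first definition is certainly the easier one from a computational viewpoint, and hence is the one most commonly used.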
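The two definitions are related: if the first holds for every possible data value, then conditioning on the data and averaging gives the second. A short sketch for continuous data follows, where $g$ denotes the unconditional density of the data, $S$ the set of data values, and $1 - \alpha$ the confidence level (in the discrete case a sum replaces the integral).

```latex
\begin{align*}
\mathbb{P}[\theta \in A(\boldsymbol{X})]
  &= \int_S \mathbb{P}[\theta \in A(\boldsymbol{x}) \mid \boldsymbol{X} = \boldsymbol{x}] \, g(\boldsymbol{x}) \, d\boldsymbol{x} \\
  &= \int_S (1 - \alpha) \, g(\boldsymbol{x}) \, d\boldsymbol{x} \\
  &= 1 - \alpha
\end{align*}
```

The converse need not hold: a set can have the right probability on average over the data without having it for each individual data value.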
Let us compare the classical and Bayesian approaches. In the classical approach, the parameter $\theta$ is deterministic, but unknown. Before the data are collected, the confidence set $A(\boldsymbol{X})$ (which is random) will contain the parameter with probability $1 - \alpha$. After the data are collected, the computed confidence set $A(\boldsymbol{x})$ either contains the parameter or does not, and we will usually never know which. By contrast, with a Bayesian confidence set, the random parameter $\theta$ falls in the computed, deterministic confidence set $A(\boldsymbol{x})$ with probability $1 - \alpha$.
Suppose that $\boldsymbol{X} = (X_1, X_2, \ldots, X_n)$ is a random sample from the Bernoulli distribution with success parameter $p$. Thus, $X_i = 1$ if trial $i$ resulted in success, and $X_i = 0$ if trial $i$ resulted in failure. Moreover, suppose that $p$ has a prior beta distribution with left parameter $a > 0$ and right parameter $b > 0$. Denote the number of successes by
$$Y = \sum_{i=1}^n X_i$$
Recall that for a given value of $p$, $Y$ has the binomial distribution with parameters $n$ and $p$.
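Show that given $Y = y$ successes in $n$ trials, the posterior distribution of $p$ is beta with left parameter $a + y$ and right parameter $b + (n - y)$, where $a$ and $b$ are the left and right parameters of the prior. A $1 - \alpha$ level Bayesian confidence interval for $p$ is then $[L(y), U(y)]$, where $L(y)$ and $U(y)$ are the quantiles of order $\alpha / 2$ and $1 - \alpha / 2$ of this posterior distribution.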
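A sketch of the conjugacy argument follows. The symbols are assumptions made for the derivation: $a$ and $b$ are the left and right parameters of the beta prior, $n$ is the number of trials, and $y$ is the observed number of successes. Factors that do not depend on $p$ are absorbed into the proportionality sign.

```latex
\begin{align*}
h(p \mid y) &\propto h(p) \, \mathbb{P}(Y = y \mid p) \\
  &\propto p^{a - 1} (1 - p)^{b - 1} \cdot p^{y} (1 - p)^{n - y} \\
  &= p^{(a + y) - 1} (1 - p)^{(b + n - y) - 1}, \quad 0 < p < 1
\end{align*}
```

Up to a normalizing constant, the last expression is the beta density with left parameter $a + y$ and right parameter $b + (n - y)$.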
Specifically, suppose that we have a coin with an unknown probability $p$ of heads and that we give $p$ the uniform prior. We then toss the coin 10 times, observing 7 heads. Compute the 90% Bayesian confidence interval for $p$.
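One way to carry out this computation is sketched below, under two assumptions that are not stated in the text: the interval is taken to be equal-tailed (quantiles of order 0.05 and 0.95 of the posterior), and the posterior is beta(8, 4), which follows from the uniform prior being beta(1, 1) together with the standard beta-binomial update. The helper names `beta_cdf` and `beta_quantile` are hypothetical; they use only the standard library and work for positive integer parameters.

```python
from math import comb

def beta_cdf(x, a, b):
    """CDF of the beta(a, b) distribution for positive integers a and b.

    Uses the identity P(Beta(a, b) <= x) = P(Binomial(a + b - 1, x) >= a).
    """
    m = a + b - 1
    return sum(comb(m, j) * x**j * (1 - x) ** (m - j) for j in range(a, m + 1))

def beta_quantile(q, a, b, tol=1e-12):
    """Quantile of order q, found by bisection on the increasing CDF."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Uniform prior = beta(1, 1); data: y = 7 heads in n = 10 tosses.
a, b, n, y = 1, 1, 10, 7
post_a, post_b = a + y, b + (n - y)  # posterior parameters: 8 and 4

alpha = 0.10
lower = beta_quantile(alpha / 2, post_a, post_b)
upper = beta_quantile(1 - alpha / 2, post_a, post_b)
print(f"90% equal-tailed interval for p: [{lower:.4f}, {upper:.4f}]")
```

A library routine for beta quantiles would do the same job; the bisection is used here only to keep the sketch self-contained.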
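Suppose that $\boldsymbol{X} = (X_1, X_2, \ldots, X_n)$ is a random sample of size $n$ from the Poisson distribution with parameter $\lambda$. Moreover, suppose that $\lambda$ has a prior gamma distribution with shape parameter $k$ and scale parameter $b$. Denote the sum of the sample values by
$$Y = \sum_{i=1}^n X_i$$
Show that given $Y = y$, the posterior distribution of $\lambda$ is gamma with shape parameter $k + y$ and scale parameter $b / (n b + 1)$.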
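For a numerical illustration of the Poisson case, the sketch below uses the standard gamma-Poisson update: with prior shape $k$ and prior scale $b$, a sample of size $n$ with total $y$ gives a gamma posterior with shape $k + y$ and scale $b / (n b + 1)$. All numbers here ($k = 2$, $b = 1$, $n = 5$, $y = 12$) are hypothetical, the interval is assumed to be equal-tailed, and the helper names are invented for this example. The CDF formula is valid only for a positive integer shape.

```python
from math import exp

def gamma_cdf(x, shape, scale):
    """CDF of the gamma distribution for a positive integer shape (Erlang case).

    P(T <= x) = 1 - sum_{j < shape} exp(-x/scale) * (x/scale)**j / j!
    """
    t = x / scale
    term, total = exp(-t), 0.0
    for j in range(shape):
        total += term          # add exp(-t) * t**j / j!
        term *= t / (j + 1)    # advance to the next term of the sum
    return 1.0 - total

def gamma_quantile(q, shape, scale, tol=1e-12):
    """Quantile of order q by bisection, after bracketing the root."""
    lo, hi = 0.0, scale * shape
    while gamma_cdf(hi, shape, scale) < q:  # grow the bracket until it covers q
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gamma_cdf(mid, shape, scale) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical numbers (not from the text).
k, b, n, y = 2, 1.0, 5, 12
post_shape = k + y              # 14
post_scale = b / (n * b + 1)    # 1/6

alpha = 0.10
lower = gamma_quantile(alpha / 2, post_shape, post_scale)
upper = gamma_quantile(1 - alpha / 2, post_shape, post_scale)
print(f"posterior: gamma(shape={post_shape}, scale={post_scale:.4f})")
print(f"90% equal-tailed interval for lambda: [{lower:.4f}, {upper:.4f}]")
```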