
1. Introduction

The Basic Statistical Model

As usual, our starting point is a random experiment with an underlying sample space and a probability measure $\mathbb{P}$. In the basic statistical model, we have an observable random variable $X$ taking values in a set $S$. In general, $X$ can have quite a complicated structure. For example, if the experiment is to sample $n$ objects from a population and record various measurements of interest, then

$$X = (X_1, X_2, \ldots, X_n)$$

where $X_i$ is the vector of measurements for the $i$th object. The most important special case occurs when $X_1, X_2, \ldots, X_n$ are independent and identically distributed. In this case, we have a random sample of size $n$ from the common distribution.

Suppose also that the distribution of $X$ depends on a parameter $\theta$ taking values in a parameter space $\Theta$. The parameter $\theta$ may also be vector-valued, in which case $\Theta \subseteq \mathbb{R}^k$ for some $k$ and $\theta = (\theta_1, \theta_2, \ldots, \theta_k)$.

Confidence Sets

A confidence set is a subset $C(X)$ of the parameter space $\Theta$ that depends only on the data variable $X$, and no unknown parameters. Thus, in a sense, a confidence set is a set-valued statistic. A confidence set is an estimate of $\theta$ in the sense that we hope that $\theta \in C(X)$ with high probability. In particular, the confidence level is the smallest probability that $\theta \in C(X)$:

$$\min\{\mathbb{P}_\theta[\theta \in C(X)] : \theta \in \Theta\}$$

Usually, we try to construct a confidence set for $\theta$ with a prescribed confidence level $1 - \alpha$ where $0 < \alpha < 1$. Typical confidence levels are 0.9, 0.95, and 0.99. Sometimes the best we can do is to construct a confidence set whose confidence level is at least $1 - \alpha$; this is called a conservative $1 - \alpha$ confidence set for $\theta$.

Figure: A set estimate that successfully captured the parameter vector

Note that when we run the experiment and observe the data $x$, the computed confidence set is $C(x)$. The true value of the parameter $\theta$ is either in this set, or is not, and we will usually never know. However, by the law of large numbers, if we were to repeat the confidence experiment over and over, the proportion of sets that contain $\theta$ would converge to $\mathbb{P}_\theta[\theta \in C(X)] = 1 - \alpha$. This is the precise meaning of the term confidence.
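The long-run interpretation can be checked by simulation. The following is only a sketch under assumptions that are not part of the text at this point: the data are a normal sample with known standard deviation, and the interval is the usual equal-tailed interval for the mean. The function name and the parameter values are hypothetical.

```python
import random
from statistics import NormalDist

def coverage_proportion(mu, sigma, n, alpha, runs, seed=0):
    """Fraction of simulated intervals that contain the true mean mu.

    Assumption (not from the text): a normal sample with known sigma.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)   # standard normal quantile of order 1 - alpha/2
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(runs):
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if sample_mean - half_width < mu < sample_mean + half_width:
            hits += 1
    return hits / runs

# The proportion should be close to 1 - alpha = 0.95 when runs is large.
print(coverage_proportion(mu=2.0, sigma=3.0, n=10, alpha=0.05, runs=20000))
```

Any single computed interval either contains the parameter or does not; only the proportion over many repetitions is governed by the confidence level.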

Next, note that the quality of a confidence set, as an estimator of $\theta$, is based on two factors: the confidence level and the size of the set. A good estimate has small size (and hence gives tight bounds on $\theta$) and large confidence. However, for a given $X$, there is usually a tradeoff between confidence level and size: increasing the confidence level comes only at the expense of increasing the size of the set, and decreasing the size of the set comes only at the expense of decreasing the confidence level. How we measure the size of the confidence set depends on the dimension of the parameter space and the nature of the confidence set. Moreover, the size of the set is usually random, although in some special cases it may be deterministic.

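Exercise 1. Suppose that $C_i(X)$ is a $1 - \alpha_i$ level confidence set for $\theta$ for $i \in \{1, 2, \ldots, k\}$. Show that if $\alpha = \alpha_1 + \alpha_2 + \cdots + \alpha_k < 1$ then $C_1(X) \cap C_2(X) \cap \cdots \cap C_k(X)$ is a conservative $1 - \alpha$ level confidence set for $\theta$. Hint: Use Bonferroni's inequality.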
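As a sketch of how the hint can be used (one possible route, not necessarily the intended solution), apply the union bound to the event that at least one of the sets misses $\theta$. Each term on the right is at most $\alpha_i$ because the confidence level of $C_i(X)$ is $1 - \alpha_i$.

```latex
\begin{align*}
\mathbb{P}_\theta\left[\theta \notin \bigcap_{i=1}^k C_i(X)\right]
  &= \mathbb{P}_\theta\left[\bigcup_{i=1}^k \{\theta \notin C_i(X)\}\right] \\
  &\le \sum_{i=1}^k \mathbb{P}_\theta[\theta \notin C_i(X)]
   \le \sum_{i=1}^k \alpha_i = \alpha
\end{align*}
```

Taking complements gives a coverage probability of at least $1 - \alpha$ for every $\theta \in \Theta$.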

In many cases, we are interested in estimating a real parameter $\lambda = \lambda(\theta)$ taking values in an interval parameter space $(a, b)$. In this context our confidence set frequently has the form

$$C(X) = \{\theta \in \Theta : L(X) < \lambda(\theta) < U(X)\}$$

where $L(X)$ and $U(X)$ are statistics. In this case $(L(X), U(X))$ is called a confidence interval for $\lambda$. If $L(X)$ and $U(X)$ are both random, then the confidence interval is often said to be two-sided. In the special case that $U(X) = b$ and $L(X)$ is random, $L(X)$ is called a lower confidence bound for $\lambda$ and the interval $(L(X), b)$ is called an upper confidence interval for $\lambda$. In the special case that $L(X) = a$ and $U(X)$ is random, $U(X)$ is called an upper confidence bound for $\lambda$ and the interval $(a, U(X))$ is called a lower confidence interval for $\lambda$.

Exercise 2. Suppose that $L(X)$ is a $1 - \alpha$ level confidence lower bound for $\lambda$ and that $U(X)$ is a $1 - \beta$ level confidence upper bound for $\lambda$. Show that if $\alpha + \beta < 1$ then $(L(X), U(X))$ is a conservative $1 - (\alpha + \beta)$ level confidence interval for $\lambda$. Hint: Use Exercise 1.

Pivot Variables

You might think that it should be very difficult to construct confidence sets for a parameter $\theta$. However, in many important special cases, confidence sets can be constructed easily from certain random variables known as pivot variables.

Suppose that $V$ is a function from $S \times \Theta$ into a set $T$. The random variable $V(X, \theta)$ is a pivot variable for $\theta$ if its distribution does not depend on $\theta$. Specifically, $\mathbb{P}_\theta[V(X, \theta) \in B]$ is constant in $\theta \in \Theta$ for each $B \subseteq T$. If we know the distribution of the pivot variable, then for a given $\alpha$, we can try to find $B \subseteq T$ (that does not depend on $\theta$) such that

$$\mathbb{P}_\theta[V(X, \theta) \in B] = 1 - \alpha$$

It then follows that a $1 - \alpha$ confidence set for the parameter is given by

$$C(X) = \{\theta \in \Theta : V(X, \theta) \in B\}$$

Figure: A confidence set constructed from a pivot variable

Suppose now that our pivot variable $V(X, \theta)$ is real-valued, which, for simplicity, we will assume has a continuous distribution. For $p \in (0, 1)$, let $v(p)$ denote the quantile of order $p$ for the pivot variable $V(X, \theta)$. By the very meaning of pivot variable, $v(p)$ does not depend on $\theta$.

Exercise 3. Show that for any $p \in (0, 1)$, a $1 - \alpha$ level confidence set for $\theta$ is

$$\{\theta \in \Theta : v(\alpha - p\alpha) < V(X, \theta) < v(1 - p\alpha)\}$$

The confidence set in Exercise 3 corresponds to $(1 - p)\alpha$ in the left tail and $p\alpha$ in the right tail, in terms of the distribution of the pivot variable $V(X, \theta)$. The special case $p = \frac{1}{2}$ is the equal-tailed case, the most common case.
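The tail bookkeeping can be written out in one line. This is a sketch that relies on the continuity assumption above, so that the probability of falling at or below the quantile of order $q$ is exactly $q$:

```latex
\begin{align*}
\mathbb{P}_\theta[v(\alpha - p\alpha) < V(X, \theta) < v(1 - p\alpha)]
  &= (1 - p\alpha) - (\alpha - p\alpha) = 1 - \alpha
\end{align*}
```

The mass excluded on the left is $\alpha - p\alpha = (1 - p)\alpha$ and the mass excluded on the right is $1 - (1 - p\alpha) = p\alpha$.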

Figure: Distribution of the pivot variable

Exercise 4. Show that the confidence set in Exercise 3 is decreasing in $\alpha$ and hence increasing in $1 - \alpha$ (in the sense of the subset relation) for fixed $p$.

Specializing further, suppose that $\theta = (\theta_1, \theta_2, \ldots, \theta_n)$ is a vector of real parameters, and that we are interested in estimating one of the coordinates $\theta_i$ of $\theta$; the other coordinates are sometimes referred to as nuisance parameters in this context. It is often the case that the real-valued pivot variable $V(x, \theta)$ is a strictly decreasing function of $\theta_i$ for each $x \in S$ and for all values of the other coordinates of $\theta$. In this setting, we can obtain a confidence set by inverting the pivot variable with respect to $\theta_i$.

Exercise 5. In the setting above, show that the $1 - \alpha$ confidence set for $\theta$ in Exercise 3 can be written in the following form, where $\theta_{-i}$ is the parameter vector $\theta$ with $\theta_i$ deleted, and $W(x, \theta_{-i}, \cdot)$ is the inverse of the function $\theta_i \mapsto V(x, \theta)$:

$$\{\theta \in \Theta : W(X, \theta_{-i}, v(1 - p\alpha)) < \theta_i < W(X, \theta_{-i}, v(\alpha - p\alpha))\}$$

In words, we apply the inverse transformation to obtain bounds on $\theta_i$ that depend on the data variable $X$, the other coordinates (nuisance parameters) of $\theta$, and the quantiles of the pivot variable. If the other coordinates are known, then these bounds become statistics, and we have constructed a confidence interval for $\theta_i$.
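When the inverse is not available in closed form, the inversion can be done numerically. The sketch below assumes a single real parameter with no nuisance parameters and a pivot that is continuous and strictly decreasing in that parameter. The helper names, the data, and the two pivot quantiles are all hypothetical values chosen for illustration.

```python
def invert_decreasing(pivot, x, target, lo, hi, tol=1e-10):
    """Solve pivot(x, theta) = target for theta in [lo, hi] by bisection,
    assuming theta -> pivot(x, theta) is continuous and strictly decreasing."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if pivot(x, mid) > target:
            lo = mid          # pivot still too large, so theta must increase
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2

def pivot_interval(pivot, x, v_low, v_high, lo, hi):
    """Bounds on theta equivalent to v_low < pivot(x, theta) < v_high."""
    lower = invert_decreasing(pivot, x, v_high, lo, hi)  # large pivot value <-> small theta
    upper = invert_decreasing(pivot, x, v_low, lo, hi)
    return lower, upper

# Hypothetical pivot: 2 * sum(x) / theta, strictly decreasing in theta > 0.
def scale_pivot(x, theta):
    return 2 * sum(x) / theta

data = [1.2, 0.4, 2.9, 0.7]       # made-up observations
v_low, v_high = 2.18, 17.53       # assumed pivot quantiles, for illustration only
print(pivot_interval(scale_pivot, data, v_low, v_high, lo=1e-6, hi=1e6))
```

Because the pivot is decreasing, the upper quantile produces the lower bound on the parameter and the lower quantile produces the upper bound, which is the order reversal in Exercise 5.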

For the confidence set in Exercise 3, we would naturally like to choose $p$ that minimizes the size of the set in some sense. However, this is often a difficult problem. The equal-tailed interval, corresponding to $p = \frac{1}{2}$, is the most commonly used case, and is sometimes (but not always) an optimal choice.

Pivot variables are far from unique; the challenge is to find a pivot quantity whose distribution is known and which gives tight bounds on the parameter.

Exercise 6. Suppose that $V$ is a pivot variable for $\theta$. If $g$ is a function defined on the range of $V$ and $g$ involves no unknown parameters, show that $U = g(V)$ is also a pivot variable for $\theta$.

Location-Scale Families

In the case of location-scale families of distributions, we can easily find pivot variables. Suppose that $Z$ is a real-valued random variable with a continuous distribution that has probability density function $g$, and no unknown parameters. Let $X = \mu + \sigma Z$ where $\mu \in \mathbb{R}$ and $\sigma > 0$ are parameters. Recall that the probability density function of $X$ is given by

$$f_{\mu,\sigma}(x) = \frac{1}{\sigma} g\left(\frac{x - \mu}{\sigma}\right), \quad x \in \mathbb{R}$$

and the corresponding family of distributions is called the location-scale family associated with the distribution of $Z$; $\mu$ is the location parameter and $\sigma$ is the scale parameter. Generally, we are assuming that these parameters are unknown.

Now suppose that $\boldsymbol{X} = (X_1, X_2, \ldots, X_n)$ is a random sample of size $n$ from the distribution of $X$; this is our observable outcome vector. For each $i$, let

$$Z_i = \frac{X_i - \mu}{\sigma}$$

Exercise 7. Show that $\boldsymbol{Z} = (Z_1, Z_2, \ldots, Z_n)$ is a random sample of size $n$ from the distribution of $Z$.

In particular, note that $\boldsymbol{Z}$ is a pivot variable for $(\mu, \sigma)$, since $\boldsymbol{Z}$ is a function of $\boldsymbol{X}$, $\mu$, and $\sigma$, but the distribution of $\boldsymbol{Z}$ does not depend on $\mu$ and $\sigma$. Hence, any function of $\boldsymbol{Z}$ will also be a pivot variable for $(\mu, \sigma)$ (if the function does not involve the parameters). Of course, some of these pivot variables will be much more useful than others in estimating $\mu$ and $\sigma$. In the following exercises, we will explore two common and important pivot variables.

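Exercise 8. Let $M(\boldsymbol{X})$ and $M(\boldsymbol{Z})$ denote the sample means of $\boldsymbol{X}$ and $\boldsymbol{Z}$, respectively. Show that $M(\boldsymbol{Z})$ is a pivot variable for $(\mu, \sigma)$ since

$$M(\boldsymbol{Z}) = \frac{M(\boldsymbol{X}) - \mu}{\sigma}$$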

Exercise 9. Let $m$ denote the quantile function of the pivot variable $M(\boldsymbol{Z})$. Show that for any $p \in (0, 1)$, a $1 - \alpha$ confidence set for $(\mu, \sigma)$ is

$$Z_{\alpha,p}(\boldsymbol{X}) = \{(\mu, \sigma) : M(\boldsymbol{X}) - m(1 - p\alpha)\sigma < \mu < M(\boldsymbol{X}) - m(\alpha - p\alpha)\sigma\}$$

Exercise 10. Show that the confidence set in Exercise 9 is a cone in the $(\mu, \sigma)$ parameter space, with vertex at $(M(\boldsymbol{X}), 0)$ and boundary lines of slopes $-1 / m(1 - p\alpha)$ and $-1 / m(\alpha - p\alpha)$, as shown in the graph below. (Note, however, that both slopes might be negative or both positive.)

Figure: Confidence set

The fact that the confidence set is unbounded is clearly not good, but is perhaps not surprising; we are estimating two real parameters with a single real-valued pivot variable. However, if $\sigma$ is known, the confidence set defines a confidence interval for $\mu$. Geometrically, the confidence interval simply corresponds to the horizontal cross section at $\sigma$.
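When the scale parameter is known but the quantile function of the pivot has no convenient closed form, the quantiles can be approximated by simulation. The following sketch makes several assumptions beyond the text: the standard variable is taken to be standard normal purely for illustration, the quantiles are crude empirical ones, and the function names and data are hypothetical.

```python
import random

def simulate_mean_quantiles(draw_z, n, orders, runs=20000, seed=0):
    """Monte Carlo estimates of quantiles of the mean of n independent draws of Z."""
    rng = random.Random(seed)
    means = sorted(sum(draw_z(rng) for _ in range(n)) / n for _ in range(runs))
    # Empirical quantile: order statistic at index floor(order * runs), clamped.
    return [means[min(runs - 1, int(order * runs))] for order in orders]

def location_interval(xs, sigma, alpha, p, draw_z):
    """Approximate interval for the location parameter when sigma is known:
    mean(xs) - m(1 - p*alpha)*sigma < mu < mean(xs) - m(alpha - p*alpha)*sigma."""
    n = len(xs)
    m_low, m_high = simulate_mean_quantiles(
        draw_z, n, [alpha - p * alpha, 1 - p * alpha])
    mean_x = sum(xs) / n
    return mean_x - m_high * sigma, mean_x - m_low * sigma

def draw_standard_normal(rng):
    """Assumed distribution of Z (illustration only)."""
    return rng.gauss(0.0, 1.0)

xs = [4.1, 5.3, 2.8, 6.0]   # made-up observations
print(location_interval(xs, sigma=2.0, alpha=0.05, p=0.5, draw_z=draw_standard_normal))
```

Only the sampler for the standard variable changes from one location-scale family to another; the interval formula itself stays the same.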

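Exercise 11. In the confidence set in Exercise 9, let $p = 1$ and $p = 0$, respectively (interpreting the quantiles of order 0 and 1 as limits), to show that $1 - \alpha$ confidence sets for $(\mu, \sigma)$ are

  a. $Z_{\alpha,1}(\boldsymbol{X}) = \{(\mu, \sigma) : M(\boldsymbol{X}) - m(1 - \alpha)\sigma < \mu\}$
  b. $Z_{\alpha,0}(\boldsymbol{X}) = \{(\mu, \sigma) : \mu < M(\boldsymbol{X}) - m(\alpha)\sigma\}$

If $\sigma$ is known, then Exercise 11(a) gives a $1 - \alpha$ confidence lower bound for $\mu$ and Exercise 11(b) gives a $1 - \alpha$ confidence upper bound for $\mu$.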

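Exercise 12. Let $S(\boldsymbol{X})$ and $S(\boldsymbol{Z})$ denote the sample standard deviations of $\boldsymbol{X}$ and $\boldsymbol{Z}$, respectively. Show that $S(\boldsymbol{Z})$ is a pivot variable for $(\mu, \sigma)$ and a pivot variable for $\sigma$ since

$$S(\boldsymbol{Z}) = \frac{S(\boldsymbol{X})}{\sigma}$$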
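Both pivot identities, for the sample mean and for the sample standard deviation, are easy to confirm numerically. In this sketch the parameter values are arbitrary and the standard variable is drawn from an exponential distribution only as an example; any fixed distribution would do.

```python
import random
from statistics import mean, stdev

rng = random.Random(1)
mu, sigma, n = 3.0, 2.5, 8                       # arbitrary illustrative values
z = [rng.expovariate(1.0) for _ in range(n)]     # assumed distribution of Z
x = [mu + sigma * zi for zi in z]                # location-scale transformation

# Sample mean identity: the two printed numbers agree up to rounding error.
print(mean(z), (mean(x) - mu) / sigma)
# Sample standard deviation identity: the location parameter drops out.
print(stdev(z), stdev(x) / sigma)
```

Note that `statistics.stdev` uses the divisor $n - 1$; the identity holds equally with the divisor $n$.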

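Exercise 13. Let $s$ denote the quantile function of $S(\boldsymbol{Z})$. Use the pivot variable to show that for any $\alpha \in (0, 1)$ and any $p \in (0, 1)$, a $1 - \alpha$ confidence set for $(\mu, \sigma)$ is

$$V_{\alpha,p}(\boldsymbol{X}) = \left\{(\mu, \sigma) : \frac{S(\boldsymbol{X})}{s(1 - p\alpha)} < \sigma < \frac{S(\boldsymbol{X})}{s(\alpha - p\alpha)}\right\}$$

Note that the confidence set gives no information about $\mu$ since the random variable in Exercise 13 is a pivot variable for $\sigma$ alone. The confidence set can also be viewed as a bounded confidence interval for $\sigma$.

Figure: Confidence set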

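Exercise 14. In the confidence set in Exercise 13, let $p = 1$ and $p = 0$, respectively (interpreting the quantiles of order 0 and 1 as limits), to show that $1 - \alpha$ confidence sets for $(\mu, \sigma)$ are

  a. $V_{\alpha,1}(\boldsymbol{X}) = \left\{(\mu, \sigma) : \frac{S(\boldsymbol{X})}{s(1 - \alpha)} < \sigma\right\}$
  b. $V_{\alpha,0}(\boldsymbol{X}) = \left\{(\mu, \sigma) : 0 < \sigma < \frac{S(\boldsymbol{X})}{s(\alpha)}\right\}$

The set in part (a) gives a $1 - \alpha$ confidence lower bound for $\sigma$ and the set in part (b) gives a $1 - \alpha$ confidence upper bound for $\sigma$.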

We can intersect the confidence sets corresponding to the two pivot variables to produce conservative, bounded confidence sets.

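Exercise 15. Suppose that $(\alpha, \beta, p, q) \in (0, 1)^4$ with $\alpha + \beta < 1$. Use Exercise 1 to show that $Z_{\alpha,p}(\boldsymbol{X}) \cap V_{\beta,q}(\boldsymbol{X})$ is a conservative $1 - (\alpha + \beta)$ confidence set for $(\mu, \sigma)$.

Figure: Confidence set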

The most important location-scale family is the family of normal distributions. The problem of estimation in the normal model is considered in the next section. In the remainder of this section, we will explore another important scale family.

The Exponential Distribution

Suppose $\boldsymbol{X} = (X_1, X_2, \ldots, X_n)$ is a random sample of size $n$ from the exponential distribution with scale parameter $\sigma > 0$. Let

$$Y = \sum_{i=1}^n X_i$$

Exercise 16. Show that $\frac{2}{\sigma} Y$ has the chi-square distribution with $2n$ degrees of freedom, and hence is a pivot variable for $\sigma$.

Note that the variable in Exercise 16 is a multiple of the variable in Exercise 8 (with $\mu = 0$). Now let $g_k$ denote the probability density function and $G_k$ the distribution function for the chi-square distribution with $k$ degrees of freedom. In addition, for $p \in (0, 1)$, let $\chi^2_k(p)$ denote the quantile of order $p$ for the distribution. That is, $\chi^2_k(p) = G_k^{-1}(p)$. For selected values of $k$ and $p$, $\chi^2_k(p)$ can be obtained from the table of the chi-square distribution, from the quantile applet, or from most statistical software packages.
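For an even number of degrees of freedom, which is the only case needed for the exponential model, the chi-square distribution function has a finite-sum closed form, so the quantiles can be computed with the standard library alone. The following is a minimal sketch with hypothetical function names; it inverts the distribution function by bisection and is not tuned for extreme tail probabilities.

```python
import math

def chi2_cdf_even(x, k):
    """Chi-square distribution function for even k, using
    1 - exp(-x/2) * sum_{j < k/2} (x/2)^j / j!."""
    if k <= 0 or k % 2:
        raise ValueError("this sketch only handles positive even k")
    if x <= 0:
        return 0.0
    half, term, total = x / 2, 1.0, 1.0
    for j in range(1, k // 2):
        term *= half / j              # term is now (x/2)^j / j!
        total += term
    return 1.0 - math.exp(-half) * total

def chi2_quantile_even(p, k, tol=1e-12):
    """Quantile of order p in (0, 1), found by bisection on the distribution function."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, k) < p:   # expand until the quantile is bracketed
        hi *= 2
    while hi - lo > tol * max(1.0, hi):
        mid = (lo + hi) / 2
        if chi2_cdf_even(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# With 2 degrees of freedom the median is 2 * ln(2) in closed form.
print(chi2_quantile_even(0.5, 2), 2 * math.log(2))
```

In practice a statistical package would normally be used; the closed form is shown only because it keeps the example self-contained.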

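Exercise 17. Show or recall that

  a. $\chi^2_k(p) \to 0$ as $p \downarrow 0$
  b. $\chi^2_k(p) \to \infty$ as $p \uparrow 1$
  c. $\frac{d}{dp} \chi^2_k(p) = \frac{1}{g_k[\chi^2_k(p)]}$ (Hint: use the inverse function theorem of calculus.)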

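Exercise 18. Show that for any $\alpha \in (0, 1)$ and any $p \in (0, 1)$, a $1 - \alpha$ confidence interval for $\sigma$ is

$$\left(\frac{2 Y}{\chi^2_{2n}(1 - p\alpha)}, \frac{2 Y}{\chi^2_{2n}(\alpha - p\alpha)}\right)$$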
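The interval can be computed directly from a sample. The sketch below is self-contained and uses hypothetical function names; the chi-square quantile is obtained by bisection on the closed-form distribution function for even degrees of freedom, and the data are made up.

```python
import math

def chi2_cdf_even(x, k):
    """Chi-square distribution function for even k (closed-form finite sum)."""
    if x <= 0:
        return 0.0
    half, term, total = x / 2, 1.0, 1.0
    for j in range(1, k // 2):
        term *= half / j
        total += term
    return 1.0 - math.exp(-half) * total

def chi2_quantile_even(p, k):
    """Quantile of order p in (0, 1) by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, k) < p:
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_cdf_even(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exponential_scale_interval(xs, alpha, p=0.5):
    """Interval for the exponential scale parameter, with p*alpha in the
    right tail of the pivot and (1 - p)*alpha in the left tail."""
    n, y = len(xs), sum(xs)
    lower = 2 * y / chi2_quantile_even(1 - p * alpha, 2 * n)
    upper = 2 * y / chi2_quantile_even(alpha - p * alpha, 2 * n)
    return lower, upper

sample = [0.8, 2.1, 0.3, 1.7, 1.1]   # made-up observations
print(exponential_scale_interval(sample, alpha=0.05))
```

The default `p=0.5` gives the equal-tailed interval; other values of `p` in (0, 1) shift probability between the two tails without changing the confidence level.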

Exercise 19. Show that

  a. $\frac{2 Y}{\chi^2_{2n}(1 - \alpha)}$ is a $1 - \alpha$ confidence lower bound for $\sigma$.
  b. $\frac{2 Y}{\chi^2_{2n}(\alpha)}$ is a $1 - \alpha$ confidence upper bound for $\sigma$.

Of the two-sided confidence intervals in Exercise 18, we would naturally prefer the one with the smallest length, because this interval gives the most information about the parameter $\sigma$. However, minimizing the length as a function of $p$ is computationally difficult. The two-sided confidence interval that is typically used is the equal-tailed interval obtained by letting $p = \frac{1}{2}$:

$$\left(\frac{2 Y}{\chi^2_{2n}(1 - \alpha / 2)}, \frac{2 Y}{\chi^2_{2n}(\alpha / 2)}\right)$$

Exercise 20. Try to find the $p$ that minimizes the length of the interval in Exercise 18.
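A closed-form answer is hard to come by, but a numerical search is straightforward. The sketch below is one possible approach and not a definitive method: it evaluates the interval length, divided by $2 Y$, on a grid of values of $p$ and returns the grid point with the smallest length. The function names and grid size are hypothetical choices.

```python
import math

def chi2_cdf_even(x, k):
    """Chi-square distribution function for even k (closed-form finite sum)."""
    if x <= 0:
        return 0.0
    half, term, total = x / 2, 1.0, 1.0
    for j in range(1, k // 2):
        term *= half / j
        total += term
    return 1.0 - math.exp(-half) * total

def chi2_quantile_even(p, k):
    """Quantile of order p in (0, 1) by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, k) < p:
        hi *= 2
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_cdf_even(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def length_factor(p, alpha, n):
    """Interval length divided by 2Y, as a function of p."""
    k = 2 * n
    return (1 / chi2_quantile_even(alpha - p * alpha, k)
            - 1 / chi2_quantile_even(1 - p * alpha, k))

def best_p(alpha, n, grid=999):
    """Grid point in (0, 1) that minimizes the length factor."""
    candidates = [i / (grid + 1) for i in range(1, grid + 1)]
    return min(candidates, key=lambda p: length_factor(p, alpha, n))

p_star = best_p(alpha=0.05, n=5)
print(p_star, length_factor(p_star, 0.05, 5), length_factor(0.5, 0.05, 5))
```

Because the length is a multiple of $Y$, the minimizing $p$ depends only on $\alpha$ and $n$, not on the data, so it can be computed before the experiment is run.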