As usual, our starting point is a random experiment with an underlying sample space and a probability measure $\mathbb{P}$. In the basic statistical model, we have an observable random variable $\boldsymbol{X}$ taking values in a set $S$. Recall that in general, this variable can have quite a complicated structure. For example, if the experiment is to sample $n$ objects from a population and record various measurements of interest, then the data vector has the form
$$\boldsymbol{X} = (X_1, X_2, \ldots, X_n)$$
where $X_i$ is the vector of measurements for the $i$th object. The most important special case is when $X_1, X_2, \ldots, X_n$ are independent and identically distributed (IID). In this case $\boldsymbol{X}$ is a random sample of size $n$ from the distribution of an underlying measurement variable $X$.
Recall also that a statistic is an observable function of the outcome variable of the random experiment:
$$U = u(\boldsymbol{X})$$
Thus, a statistic is simply a random variable derived from the observation variable $\boldsymbol{X}$, with the assumption that $U$ is also observable. As the notation indicates, $U$ is typically also vector-valued.
In the general sense, a parameter $\theta$ is a function of the distribution of $\boldsymbol{X}$, taking values in a parameter space $\Theta$. Typically, the distribution of $\boldsymbol{X}$ will have $k$ real parameters of interest, so that $\theta$ has the form
$$\theta = (\theta_1, \theta_2, \ldots, \theta_k)$$
and thus $\Theta \subseteq \mathbb{R}^k$. In many cases, one or more of the parameters are unknown and must be estimated from the data variable $\boldsymbol{X}$. This is one of the most important and basic of all statistical problems, and is the subject of this chapter.
Suppose now that we have an unknown real parameter $\theta$ taking values in a parameter space $\Theta \subseteq \mathbb{R}$. A real-valued statistic $U = u(\boldsymbol{X})$ that is used to estimate $\theta$ is called, appropriately enough, an estimator of $\theta$. Thus, the estimator is a random variable and hence has a distribution, a mean, a variance, and so on. When we actually run the experiment and observe the data $\boldsymbol{x}$, the observed value $u(\boldsymbol{x})$ (a single number) is the estimate of the parameter $\theta$.
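The (random) error is the difference between the estimator and the parameter: $U - \theta$. The expected value of the error is known as the bias:
$$\operatorname{bias}(U) = \mathbb{E}(U - \theta)$$
Exercise 1. Use basic properties of expected value to show that
$$\operatorname{bias}(U) = \mathbb{E}(U) - \theta$$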
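Thus, the estimator $U$ is said to be unbiased if the bias is 0 for all $\theta \in \Theta$, equivalently if the expected value of the estimator is the parameter being estimated: $\mathbb{E}(U) = \theta$ for all $\theta \in \Theta$. The quality of the estimator is usually measured by computing the mean square error:
$$\operatorname{MSE}(U) = \mathbb{E}\left[(U - \theta)^2\right]$$
Exercise 2. Use basic properties of expected value and variance to show that
$$\operatorname{MSE}(U) = \operatorname{var}(U) + \operatorname{bias}^2(U)$$
In particular, if the estimator is unbiased, then the mean square error of $U$ is simply the variance of $U$.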
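The decomposition $\operatorname{MSE}(U) = \operatorname{var}(U) + \operatorname{bias}^2(U)$ can be checked numerically. The following is a minimal sketch in Python (standard library only). It assumes a hypothetical, deliberately biased estimator $U = \frac{1}{n+1} \sum_{i=1}^n X_i$ of the mean $\theta$ of a normal distribution. The estimator, the parameter values, and the helper name `empirical_bias_var_mse` are illustrative choices, not part of the text.

```python
import random
import statistics

def empirical_bias_var_mse(estimates, theta):
    """Return (bias, variance, mse) of simulated estimates around the true value theta."""
    mean_est = statistics.fmean(estimates)
    bias = mean_est - theta
    variance = statistics.pvariance(estimates)      # divides by the number of runs
    mse = statistics.fmean([(u - theta) ** 2 for u in estimates])
    return bias, variance, mse

rng = random.Random(0)       # fixed seed so the run is reproducible
theta = 2.0                  # true mean (assumed for the illustration)
n, runs = 10, 5000           # sample size and number of simulated samples

# Hypothetical biased estimator: sum of the sample divided by n + 1 instead of n.
estimates = [sum(rng.gauss(theta, 1.0) for _ in range(n)) / (n + 1)
             for _ in range(runs)]

bias, variance, mse = empirical_bias_var_mse(estimates, theta)
print("empirical bias:", bias, " theoretical bias:", -theta / (n + 1))
print("empirical MSE:", mse, " variance + bias^2:", variance + bias ** 2)
```

With the variance computed using divisor `runs`, the empirical versions satisfy the same identity as the theoretical quantities, so the last two printed numbers agree up to rounding error.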
Ideally, we would like to have unbiased estimators with small mean square error. However, this is not always possible, and Exercise 2 shows the delicate relationship between bias and mean square error. In the next section we will see an example with two estimators of a parameter that are multiples of each other; one is unbiased, but the other has smaller mean square error. However, if we have two unbiased estimators of $\theta$, denoted $U$ and $V$, we naturally prefer the one with the smaller variance (mean square error). The relative efficiency of $U$ to $V$ is simply the ratio of the variances:
$$\operatorname{eff}(U, V) = \frac{\operatorname{var}(V)}{\operatorname{var}(U)}$$
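Often we have a general formula that defines an estimator of $\theta$ for any sample size $n$. Technically, this gives a sequence of real-valued estimators of $\theta$:
$$U_n = u_n(X_1, X_2, \ldots, X_n), \quad n \in \{1, 2, \ldots\}$$
In this case, we can discuss the asymptotic properties of the estimators as $n \to \infty$. Most of the definitions are natural generalizations of the ones above. First, the sequence of estimators $U_n$ is said to be asymptotically unbiased if
$$\operatorname{bias}(U_n) \to 0 \text{ as } n \to \infty \text{ for any } \theta \in \Theta$$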
Exercise 3. Show that $U_n$ is asymptotically unbiased if and only if $\mathbb{E}(U_n) \to \theta$ as $n \to \infty$ for any $\theta \in \Theta$.
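As an illustration of asymptotic unbiasedness, consider the estimator $V_n = \frac{1}{n} \sum_{i=1}^n (X_i - M_n)^2$ of the variance $\sigma^2$, where $M_n$ is the sample mean. It is a standard fact that $\mathbb{E}(V_n) = \frac{n-1}{n} \sigma^2$, so the bias is $-\sigma^2 / n$, which is nonzero for every $n$ but tends to 0. The sketch below estimates that bias by simulation; the normal sampling distribution, the sample sizes, and the function names are assumptions made for the example.

```python
import random

rng = random.Random(1)   # fixed seed so the run is reproducible
sigma2 = 1.0             # true variance (assumed for the illustration)
runs = 5000              # simulated samples per sample size

def plug_in_variance(sample):
    """V_n: average squared deviation from the sample mean (divisor n, not n - 1)."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)

def empirical_bias(n):
    """Average of V_n over many simulated standard normal samples, minus sigma2."""
    total = 0.0
    for _ in range(runs):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += plug_in_variance(sample)
    return total / runs - sigma2

results = {n: empirical_bias(n) for n in (2, 8, 32)}
for n, b in results.items():
    print("n =", n, " empirical bias:", round(b, 3), " theoretical bias:", -sigma2 / n)
```

The empirical bias should track $-\sigma^2 / n$ and shrink toward 0 as $n$ grows.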
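Suppose now that $U_n$ and $V_n$ are two sequences of estimators that are asymptotically unbiased for $\theta$. The asymptotic relative efficiency of $U_n$ to $V_n$ is the following limit, if it exists:
$$\lim_{n \to \infty} \frac{\operatorname{var}(V_n)}{\operatorname{var}(U_n)}$$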
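Naturally, we expect our estimators to improve, in some sense, as the sample size $n$ increases. Specifically, the sequence of estimators $U_n$ is said to be consistent for $\theta$ if $U_n \to \theta$ as $n \to \infty$ in probability:
$$\mathbb{P}\left(|U_n - \theta| > \epsilon\right) \to 0 \text{ as } n \to \infty \text{ for every } \epsilon > 0 \text{ and every } \theta \in \Theta$$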
Exercise 4. Suppose that $\operatorname{MSE}(U_n) \to 0$ as $n \to \infty$ for any $\theta \in \Theta$. Show that $U_n$ is consistent for $\theta$. Hint: Use Markov's inequality.
The condition in Exercise 4 is known as mean-square consistency. Thus, mean-square consistency implies simple consistency. This is simply a statistical version of the theorem that states that mean-square convergence implies convergence in probability.
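The link between mean-square consistency and consistency rests on the bound $\mathbb{P}(|U_n - \theta| > \epsilon) \le \operatorname{MSE}(U_n) / \epsilon^2$, which is Markov's inequality applied to $(U_n - \theta)^2$. The sketch below compares the two sides by simulation, assuming $U_n$ is the sample mean of $n$ uniform variables on $(0, 1)$, so that $\theta = 1/2$ and $\operatorname{MSE}(U_n) = 1/(12 n)$. The distribution, the value of $\epsilon$, and the sample sizes are illustrative choices.

```python
import random

rng = random.Random(2)   # fixed seed so the run is reproducible
theta = 0.5              # mean of the uniform distribution on (0, 1)
eps = 0.1                # tolerance in the definition of consistency
runs = 2000              # simulated samples per sample size

def simulate(n):
    """Return (empirical P(|M_n - theta| > eps), the bound MSE(M_n) / eps^2)."""
    exceed = 0
    for _ in range(runs):
        m = sum(rng.random() for _ in range(n)) / n
        if abs(m - theta) > eps:
            exceed += 1
    mse = (1.0 / 12.0) / n   # variance of uniform(0, 1) is 1/12 and M_n is unbiased
    return exceed / runs, mse / eps ** 2

table = {n: simulate(n) for n in (10, 40, 160)}
for n, (freq, bound) in table.items():
    print("n =", n, " empirical probability:", freq, " bound:", round(bound, 4))
```

Both columns decrease as $n$ grows. The bound is typically far from tight, which is why it suffices for the proof but says little about the actual rate of convergence.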
In the next several subsections, we will review several basic estimation problems that were studied in the chapter on Random Samples.
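Suppose that $\boldsymbol{X} = (X_1, X_2, \ldots, X_n)$ is a random sample of size $n$ from the distribution of a real-valued random variable $X$ that has mean $\mu$ and standard deviation $\sigma$. A natural estimator of the distribution mean $\mu$ is the sample mean, defined by
$$M = \frac{1}{n} \sum_{i=1}^n X_i$$
Exercise 5. Show or recall that
$$\mathbb{E}(M) = \mu, \qquad \operatorname{var}(M) = \frac{\sigma^2}{n}$$
so that $M$ is an unbiased and consistent estimator of $\mu$.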
Exercise 6. In the sample mean experiment, set the sampling distribution to gamma. Increase the sample size with the scroll bar and note graphically and numerically the unbiasedness and consistency of the sample mean. Run the experiment 1000 times, updating every 10 runs.
Exercise 7. Run the normal estimation experiment 1000 times, updating every 10 runs, for several values of the parameters. In each case, compare the empirical bias and mean square error of $M$ with the theoretical values.
The consistency of the sample mean as an estimator of the distribution mean is simply the weak law of large numbers. Moreover, there are a number of important special cases of the results in Exercise 5. See the section on Sample Mean for the details.
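For readers without access to the simulation applets, the behavior of the sample mean $M$ can be explored with a short script. The sketch below draws repeated samples from a gamma distribution and compares the empirical bias and variance of $M$ with the theoretical values $0$ and $\sigma^2 / n$. The shape and scale parameters, sample sizes, and number of runs are assumptions chosen for the illustration.

```python
import random

rng = random.Random(3)        # fixed seed so the run is reproducible
shape, scale = 2.0, 1.5       # hypothetical gamma parameters
mu = shape * scale            # distribution mean
sigma2 = shape * scale ** 2   # distribution variance
runs = 4000                   # simulated samples per sample size

summary = {}
for n in (5, 20, 80):
    # One sample mean per simulated sample of size n.
    means = [sum(rng.gammavariate(shape, scale) for _ in range(n)) / n
             for _ in range(runs)]
    avg = sum(means) / runs
    var = sum((m - avg) ** 2 for m in means) / runs
    summary[n] = (avg - mu, var)
    print("n =", n, " bias:", round(avg - mu, 4),
          " variance:", round(var, 4), " sigma^2/n:", round(sigma2 / n, 4))
```

The empirical bias stays near 0 for every $n$, while the empirical variance shrinks roughly in proportion to $1/n$, which is the numerical counterpart of unbiasedness and consistency.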
Exercise 8. In the matching experiment, the random variable is the number of matches. Run the simulation 1000 times, updating every 10 runs, and note the apparent convergence of the sample mean to the distribution mean.
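As in the last subsection, suppose that $\boldsymbol{X} = (X_1, X_2, \ldots, X_n)$ is a random sample of size $n$ from the distribution of a real-valued random variable $X$ that has mean $\mu$ and standard deviation $\sigma$. We will also assume that the fourth central moment $\sigma_4 = \mathbb{E}\left[(X - \mu)^4\right]$ is finite.
If $\mu$ is known (usually an artificial assumption), then a natural estimator of $\sigma^2$ is a special version of the sample variance, defined by
$$W^2 = \frac{1}{n} \sum_{i=1}^n (X_i - \mu)^2$$
Exercise 9. Show or recall that
$$\mathbb{E}(W^2) = \sigma^2, \qquad \operatorname{var}(W^2) = \frac{1}{n}\left(\sigma_4 - \sigma^4\right)$$
so that $W^2$ is an unbiased and consistent estimator of $\sigma^2$.
If $\mu$ is unknown (the more reasonable assumption), then a natural estimator of the distribution variance is the standard version of the sample variance, defined by
$$S^2 = \frac{1}{n - 1} \sum_{i=1}^n (X_i - M)^2$$
where $M$ is the sample mean.
Exercise 10. Show or recall that
$$\mathbb{E}(S^2) = \sigma^2, \qquad \operatorname{var}(S^2) = \frac{1}{n}\left(\sigma_4 - \frac{n - 3}{n - 1} \sigma^4\right)$$
so that $S^2$ is an unbiased and consistent estimator of $\sigma^2$.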
Exercise 11. Run the exponential experiment 1000 times with an update frequency of 10. Note the apparent convergence of the sample standard deviation to the distribution standard deviation.
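Exercise 12. Show that $\operatorname{var}(W^2) < \operatorname{var}(S^2)$ for $n \ge 2$, and that the asymptotic relative efficiency of $W^2$ to $S^2$ is 1.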
Exercise 13. Run the normal estimation experiment 1000 times, updating every 10 runs, for several values of the parameters. In each case, compare the empirical bias and mean square error of $S^2$ and of $W^2$ to their theoretical values. Which estimator seems to work better?
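The comparison between the two variance estimators can also be sketched in code. The example below assumes a normal sampling distribution, for which the fourth central moment is $3 \sigma^4$, so the theoretical variances reduce to $2 \sigma^4 / n$ for the special version (known mean) and $2 \sigma^4 / (n - 1)$ for the standard version. The parameter values and function names are hypothetical.

```python
import random

rng = random.Random(4)   # fixed seed so the run is reproducible
mu, sigma = 1.0, 2.0     # hypothetical normal parameters
n, runs = 5, 10000       # sample size and number of simulated samples

def special_variance(sample, mu):
    """W^2: average squared deviation from the known distribution mean mu."""
    return sum((x - mu) ** 2 for x in sample) / len(sample)

def standard_variance(sample):
    """S^2: squared deviations from the sample mean, divided by n - 1."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / (len(sample) - 1)

def mean_and_variance(values):
    """Empirical mean and variance of a list of simulated estimates."""
    m = sum(values) / len(values)
    return m, sum((v - m) ** 2 for v in values) / len(values)

w2_values, s2_values = [], []
for _ in range(runs):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    w2_values.append(special_variance(sample, mu))
    s2_values.append(standard_variance(sample))

mean_w2, var_w2 = mean_and_variance(w2_values)
mean_s2, var_s2 = mean_and_variance(s2_values)
print("W^2: mean", mean_w2, " variance", var_w2, " theory", 2 * sigma ** 4 / n)
print("S^2: mean", mean_s2, " variance", var_s2, " theory", 2 * sigma ** 4 / (n - 1))
```

Both empirical means should be close to $\sigma^2$, while the estimator that uses the known mean shows the smaller variance. The gap narrows as $n$ increases.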
For an example of the ideas in the last two subsections, suppose that $X$ has the Poisson distribution with unknown parameter $\lambda > 0$. Then $\mathbb{E}(X) = \operatorname{var}(X) = \lambda$, so that we could use either the sample mean $M$ or the sample variance $S^2$ as an estimator of $\lambda$. Both are unbiased, so which is better? Naturally, we use mean square error as our criterion.
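Exercise 14. Show that
$$\operatorname{var}(M) = \frac{\lambda}{n}, \qquad \operatorname{var}(S^2) = \frac{\lambda}{n}\left(1 + \frac{2 n \lambda}{n - 1}\right)$$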
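Exercise 15. Show that $\operatorname{var}(M) < \operatorname{var}(S^2)$, so that the sample mean $M$ is the better estimator of $\lambda$, and that the asymptotic relative efficiency of $M$ to $S^2$ is $1 + 2 \lambda$.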
Exercise 16. Run the Poisson experiment 100 times, updating every run, for several values of the parameter. In each case, compute the estimators $M$ and $S^2$. Which estimator seems to work better?
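The Poisson comparison can be sketched in code as well. The example below simulates Poisson samples with a hypothetical parameter $\lambda = 3$ and sample size $n = 10$, then compares the empirical variances of the sample mean and the sample variance with the values $\lambda / n$ and $\frac{\lambda}{n}\left(1 + \frac{2 n \lambda}{n - 1}\right)$ that follow from the general variance formulas. Python's `random` module has no Poisson generator, so the helper `poisson_variate` (an illustrative name) uses Knuth's multiplication method, which is adequate for small $\lambda$.

```python
import math
import random

rng = random.Random(5)          # fixed seed so the run is reproducible
lam, n, runs = 3.0, 10, 4000    # hypothetical parameter, sample size, replications

def poisson_variate(rng, lam):
    """Draw one Poisson(lam) value by Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k

def empirical_variance(values):
    """Variance of a list of simulated estimates (divisor: number of values)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

m_values, s2_values = [], []
for _ in range(runs):
    sample = [poisson_variate(rng, lam) for _ in range(n)]
    m = sum(sample) / n
    m_values.append(m)                                            # sample mean
    s2_values.append(sum((x - m) ** 2 for x in sample) / (n - 1)) # sample variance

var_m = empirical_variance(m_values)
var_s2 = empirical_variance(s2_values)
theory_m = lam / n
theory_s2 = (lam / n) * (1 + 2 * n * lam / (n - 1))
print("var(M):", var_m, " theory:", theory_m)
print("var(S^2):", var_s2, " theory:", theory_s2)
```

Since both estimators are unbiased, the variance is the mean square error, and the output should show the sample mean as the markedly better estimator of $\lambda$.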
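Suppose that $\left((X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)\right)$ is a random sample of size $n$ from the distribution of $(X, Y)$, where $X$ is a real-valued random variable with mean $\mu$ and standard deviation $\sigma$, and where $Y$ is a real-valued random variable with mean $\nu$ and standard deviation $\tau$. Let $\delta$ denote the covariance of $(X, Y)$. As usual, we will let $\boldsymbol{X} = (X_1, X_2, \ldots, X_n)$ and $\boldsymbol{Y} = (Y_1, Y_2, \ldots, Y_n)$; these are random samples of size $n$ from the distributions of $X$ and $Y$, respectively.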
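If $\mu$ and $\nu$ are known (usually an artificial assumption), then a natural estimator of the distribution covariance $\delta$ is a special version of the sample covariance, defined by
$$W(\boldsymbol{X}, \boldsymbol{Y}) = \frac{1}{n} \sum_{i=1}^n (X_i - \mu)(Y_i - \nu)$$
Exercise 17. Show or recall that $\mathbb{E}\left[W(\boldsymbol{X}, \boldsymbol{Y})\right] = \delta$, so that $W(\boldsymbol{X}, \boldsymbol{Y})$ is an unbiased estimator of $\delta$, and that $W(\boldsymbol{X}, \boldsymbol{Y})$ is a consistent estimator of $\delta$.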
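If $\mu$ and $\nu$ are unknown (usually the more reasonable assumption), then a natural estimator of the distribution covariance $\delta$ is the usual version of the sample covariance, defined by
$$S(\boldsymbol{X}, \boldsymbol{Y}) = \frac{1}{n - 1} \sum_{i=1}^n \left(X_i - M(\boldsymbol{X})\right)\left(Y_i - M(\boldsymbol{Y})\right)$$
where $M(\boldsymbol{X})$ and $M(\boldsymbol{Y})$ are the sample means of $\boldsymbol{X}$ and $\boldsymbol{Y}$.
Exercise 18. Show or recall that $\mathbb{E}\left[S(\boldsymbol{X}, \boldsymbol{Y})\right] = \delta$, so that $S(\boldsymbol{X}, \boldsymbol{Y})$ is an unbiased estimator of $\delta$, and that $S(\boldsymbol{X}, \boldsymbol{Y})$ is a consistent estimator of $\delta$.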
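The unbiasedness of the two covariance estimators can be checked by simulation. The sketch below assumes a hypothetical bivariate model in which $X = \mu + Z_1$ and $Y = \nu + Z_1 + Z_2$ with $Z_1, Z_2$ independent standard normal variables, so that the true covariance is 1. The model, parameter values, and function names are illustrative only.

```python
import random

rng = random.Random(6)   # fixed seed so the run is reproducible
mu, nu = 1.0, -2.0       # known means of X and Y in the hypothetical model
delta = 1.0              # true covariance: cov(Z1, Z1 + Z2) = var(Z1) = 1
n, runs = 10, 5000       # sample size and number of simulated samples

def special_covariance(xs, ys, mu, nu):
    """Average product of deviations from the known means (divisor n)."""
    return sum((x - mu) * (y - nu) for x, y in zip(xs, ys)) / len(xs)

def standard_covariance(xs, ys):
    """Sum of products of deviations from the sample means, divided by n - 1."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

w_total, s_total = 0.0, 0.0
for _ in range(runs):
    z1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    xs = [mu + a for a in z1]
    ys = [nu + a + b for a, b in zip(z1, z2)]
    w_total += special_covariance(xs, ys, mu, nu)
    s_total += standard_covariance(xs, ys)

mean_w = w_total / runs
mean_s = s_total / runs
print("average special covariance:", mean_w)
print("average standard covariance:", mean_s)
print("true covariance:", delta)
```

Both averages should be close to the true covariance. Replacing the divisor $n - 1$ by $n$ in `standard_covariance` would introduce a bias of order $1/n$.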
The estimators of the mean, variance, and covariance that we have considered in this section have been natural in a sense. However, for other parameters, it is not clear how to even find a reasonable estimator in the first place. In the next several sections, we will consider the problem of constructing estimators. Then we return to the study of the mathematical properties of estimators, and consider the question of when we can know that an estimator is the best possible, given the data.