Standardising the sample mean

The module Exponential and normal distributions shows how any Normal distribution can be standardised, in the following way, to give a standard Normal distribution:

If \(Y \stackrel{\mathrm{d}}{=} \mathrm{N}(\mu,\sigma^2)\) and \(Z = \dfrac{Y-\mu}{\sigma}\), then \(Z \stackrel{\mathrm{d}}{=} \mathrm{N}(0,1)\).

The standard Normal distribution has mean 0 and variance 1. A random variable with this distribution is usually denoted by \(Z\). That is, \(Z \stackrel{\mathrm{d}}{=} \mathrm{N}(0,1)\).

Consider a standardisation of \(\bar{X}\). We subtract off the mean of \(\bar{X}\), which is \(\mu\), and divide through by the standard deviation of \(\bar{X}\), which is \(\dfrac{\sigma}{\sqrt{n}}\), to obtain a standardised version of the sample mean:

\[ \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}. \]
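
As a quick check, using only the facts quoted above that \(\bar{X}\) has mean \(\mu\) and standard deviation \(\dfrac{\sigma}{\sqrt{n}}\), the standardised quantity has mean 0 and variance 1:

\[ \mathrm{E}\!\left(\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}\right) = \dfrac{\mathrm{E}(\bar{X}) - \mu}{\sigma/\sqrt{n}} = 0, \qquad \mathrm{var}\!\left(\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}\right) = \dfrac{\mathrm{var}(\bar{X})}{\sigma^2/n} = 1. \]

This fixes the mean and the variance of the standardised sample mean, whatever the parent distribution, but it says nothing yet about the shape of its distribution.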

Now we ask: What is the distribution of this quantity?

Sampling from a Normal distribution

We first consider the case of a random sample from a Normal population, say the population of study scores \(\mathrm{N}(30,7^2)\).

The standardisation of \(\bar{X}\) for this example is illustrated in figure 23, which shows nine distributions.

Of course, all nine distributions in figure 23 are Normal distributions. As we saw in a previous section Sampling from symmetric distributions, if the parent distribution from which we are sampling is Normal, then the distribution of the sample mean is itself Normal, for any \(n\).

Figure 23: Standardisation of the distribution of \(\bar{X}\) for samples from a Normal distribution, for various values of \(n\).

In summary: For a random sample of size \(n\) from a Normal distribution,

\[ \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \stackrel{\mathrm{d}}{=} \mathrm{N}(0,1). \]

Under the specific conditions of sampling from a Normal distribution (and only then), this result holds for any value of \(n\).
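
The result can be checked roughly by simulation. The following is a minimal sketch, not the method used to draw figure 23: it uses the study-score population \(\mathrm{N}(30,7^2)\) from the text, while the sample sizes, the number of simulated samples and the seed are illustrative assumptions. For each \(n\) it reports the mean and standard deviation of the simulated standardised sample means, which should be close to 0 and 1.

```python
import math
import random
import statistics

MU, SIGMA = 30.0, 7.0      # study-score population N(30, 7^2) from the text
N_SAMPLES = 20_000         # number of simulated samples (an assumption)


def standardised_means(n, rng):
    """Simulate N_SAMPLES values of (X_bar - MU) / (SIGMA / sqrt(n))."""
    se = SIGMA / math.sqrt(n)  # standard deviation of the sample mean
    values = []
    for _ in range(N_SAMPLES):
        x_bar = statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(n))
        values.append((x_bar - MU) / se)
    return values


rng = random.Random(1)     # fixed seed so the run is repeatable
# Sample sizes below are illustrative, not those used in figure 23.
results = {n: standardised_means(n, rng) for n in (1, 4, 25)}

for n, z in results.items():
    # Print n, then the simulated mean and standard deviation of Z.
    print(n, round(statistics.fmean(z), 3), round(statistics.stdev(z), 3))
```

Even for \(n = 1\) the simulated values behave like draws from \(\mathrm{N}(0,1)\), in line with the statement that the result holds for any \(n\) when the parent distribution is Normal.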

Sampling from the uniform distribution

Now consider the distribution of the sample mean for random samples from the uniform distribution \(\mathrm{U}(0,1)\). We illustrate this in figure 24 with simulations of 100 000 samples.

Figure 24: Standardisation of the distribution of \(\bar{X}\) for samples from a uniform distribution, for various values of \(n\).

This shows via simulation the application of the central limit theorem to the uniform distribution: for a random sample of size \(n\) from the uniform distribution, if \(n\) is large, then

\[ \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \stackrel{\mathrm{d}}{\approx} \mathrm{N}(0,1). \]
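
A simulation along these lines can be sketched as follows. This is an illustrative sketch rather than the code behind figure 24: it uses 20 000 samples instead of 100 000 to keep it quick, and the sample sizes, the seed and the function name `coverage` are assumptions. It uses the facts that \(\mathrm{U}(0,1)\) has mean \(\tfrac{1}{2}\) and variance \(\tfrac{1}{12}\), and estimates the proportion of standardised sample means within \(\pm 1.96\), which is about 0.95 for a standard Normal distribution.

```python
import math
import random
import statistics

MU = 0.5                     # mean of U(0, 1)
SIGMA = math.sqrt(1 / 12)    # standard deviation of U(0, 1)


def coverage(n, n_samples=20_000, seed=1):
    """Proportion of standardised sample means falling within +/- 1.96."""
    rng = random.Random(seed)
    se = SIGMA / math.sqrt(n)  # standard deviation of the sample mean
    hits = 0
    for _ in range(n_samples):
        x_bar = statistics.fmean(rng.random() for _ in range(n))
        if abs((x_bar - MU) / se) <= 1.96:
            hits += 1
    return hits / n_samples


# Sample sizes are illustrative, not those used in figure 24.
for n in (1, 2, 5, 30):
    print(n, coverage(n))
```

For \(n = 1\) the proportion is exactly 1, because a standardised \(\mathrm{U}(0,1)\) value can never exceed \(\sqrt{3} \approx 1.73\) in absolute value; this is a reminder that the approximation is a large-\(n\) statement.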

Sampling from the exponential distribution

Next we consider standardisation of the distribution of sample means for samples from the exponential distribution with mean 7; see figure 25. This figure is based on the true distribution of the sample mean, as in this case it can be derived explicitly. (So we do not need to rely on histograms of sample means from many random samples to get an approximate idea of the distributions involved.)

Figure 25: Standardisation of the distribution of \(\bar{X}\) for samples from the exponential distribution \(\exp(\tfrac{1}{7})\), for various values of \(n\).

We have already seen the distribution of the sample mean \(\bar{X}\) based on random samples of size \(n=10\) from \(\exp(\tfrac{1}{7})\), in the section Sampling from asymmetric distributions (see figure 18). For the case \(n = 10\), the value of \(n\) is small and, although the distribution of \(\bar{X}\) is much more symmetric than the distribution of \(X\) itself, some skewness is still apparent.

Now, in figure 25, we look at considerably larger sample sizes.

For these larger values of \(n\), can you still detect some skewness visually? Are these distributions symmetric? There is some slight skewness apparent… but you have to look hard! The distribution is approximately Normal, and the approximation is quite good for these large values of \(n\).

Keep in mind how good this approximation is for these values of \(n\), given the substantial skewness of the parent exponential distribution.

This shows the application of the central limit theorem to the exponential distribution: for a random sample of size \(n\) from the exponential distribution, if \(n\) is large, then

\[ \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \stackrel{\mathrm{d}}{\approx} \mathrm{N}(0,1). \]
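
The exact distribution mentioned above can also be computed directly. The sketch below relies on a standard result not derived in this module: the mean of \(n\) independent exponential random variables with mean 7 has a gamma distribution with shape \(n\) and scale \(\tfrac{7}{n}\), whose skewness is \(\tfrac{2}{\sqrt{n}}\). The function names, the grid of \(z\) values and the sample sizes are illustrative assumptions and are not taken from figure 25. For each \(n\) the code prints the skewness and the largest gap, over the grid, between the exact density of the standardised sample mean and the standard Normal density.

```python
import math
from statistics import NormalDist


def standardised_mean_pdf(z, n, mean=7.0):
    """Exact density at z of (X_bar - mu) / (sigma / sqrt(n)).

    For the exponential distribution mu = sigma = mean, and X_bar follows
    a gamma distribution with shape n and scale mean / n.
    """
    x = mean * (1.0 + z / math.sqrt(n))  # value of X_bar corresponding to z
    if x <= 0:
        return 0.0                       # X_bar cannot be negative
    scale = mean / n
    # Gamma log-density, computed in log space to avoid overflow.
    log_pdf = ((n - 1) * math.log(x) - x / scale
               - math.lgamma(n) - n * math.log(scale))
    # Change of variables from x to z: dx/dz = mean / sqrt(n).
    return math.exp(log_pdf) * mean / math.sqrt(n)


def max_gap_to_normal(n):
    """Largest absolute density gap to N(0, 1) on a grid from -4 to 4."""
    std_normal = NormalDist()
    grid = [i / 10 for i in range(-40, 41)]
    return max(abs(standardised_mean_pdf(z, n) - std_normal.pdf(z))
               for z in grid)


# Sample sizes are illustrative, not those used in figure 25.
for n in (10, 50, 200, 1000):
    print(n, round(2 / math.sqrt(n), 3), round(max_gap_to_normal(n), 4))
```

Both printed columns shrink as \(n\) grows, which matches the visual impression that the skewness becomes hard to see for large \(n\).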

We have shown examples for the uniform and exponential distributions, but the central limit theorem is very general: it applies to random samples from any distribution with a finite mean \(\mu\) and finite variance \(\sigma^2\).

The Normal approximation described here is used later, when we obtain an approximate confidence interval for the unknown population mean \(\mu\), based on a random sample. Before getting to the practicalities, however, we consider some very important general ideas about confidence intervals.
