The sample mean \(\bar{X}\) as a point estimate of \(\mu\)

Even without using any ideas from probability or distribution theory, it seems compelling that the sample mean should tell us something about the population mean. If we have a random sample from the population, the sample should be representative of the population. So we should be able to use the sample mean as an estimate of the population mean.

We first review the definition of a random sample; this material is also covered in the module Random sampling.

Consider a random variable \(X\). The population mean \(\mu\) is the expected value of \(X\), that is, \(\mu = \mathrm{E}(X)\). In general, the distribution of \(X\) and the population mean \(\mu\) are unknown.

A random sample 'on \(X\)' of size \(n\) is defined to be \(n\) random variables \(X_1, X_2, \dots, X_n\) that are mutually independent and have the same distribution as \(X\).

We may think of the distribution of \(X\) as the underlying or 'parent' distribution, producing \(n\) 'offspring' that make up the random sample. We use the phrase 'parent distribution' throughout this module to refer to the underlying distribution from which the random samples come.

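Two features of a random sample defined in this way are important: each \(X_i\) has the same distribution as \(X\), and the distribution of any one \(X_i\) is not affected by the values taken by the others.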

We define the sample mean \(\bar{X}\) of the random sample \(X_1, X_2, \dots, X_n\) as

\[ \bar{X} = \dfrac{\sum_{i=1}^{n}{X_i}}{n}. \]

Once we obtain an actual random sample \(x_1, x_2, \dots, x_n\) from the random variable \(X\), we have an actual observation

\[ \bar{x} = \dfrac{\sum_{i=1}^{n}{x_i}}{n} \]

of the sample mean. We call the observed value \(\bar{x}\) a point estimate of the population mean \(\mu\).
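As a minimal sketch of this calculation, the following Python snippet computes the observed sample mean \(\bar{x}\) from a small set of values. The data and the names `observations` and `sample_mean` are invented for illustration and do not come from this module.

```python
# Hypothetical observed sample x_1, ..., x_n (values invented for illustration).
observations = [4.1, 5.3, 3.8, 6.0, 4.8]


def sample_mean(xs):
    """Return the observed sample mean: the sum of the x_i divided by n."""
    if not xs:
        raise ValueError("the sample must contain at least one observation")
    return sum(xs) / len(xs)


# x_bar is a single number: the point estimate of the population mean mu.
x_bar = sample_mean(observations)
print(f"n = {len(observations)}, point estimate x-bar = {x_bar:.2f}")
```

The result is one fixed number computed from one particular sample; a different sample would generally give a different value.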

This discussion reminds us that the sample mean \(\bar{X}\) is itself a random variable: its value varies from one sample to the next.
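The variation from sample to sample can be seen in a short simulation. The sketch below assumes, purely for illustration, a Normal parent distribution with mean 50 and standard deviation 10; the module itself does not specify a parent distribution, and the seed, sample size and function names are likewise hypothetical.

```python
import random


def draw_sample(rng, n):
    """Draw n independent values from the assumed parent distribution.

    Assumption for illustration only: the parent distribution is
    Normal with mean 50 and standard deviation 10.
    """
    return [rng.gauss(50.0, 10.0) for _ in range(n)]


def sample_mean(xs):
    """Return the sum of the observations divided by their number."""
    return sum(xs) / len(xs)


rng = random.Random(2024)  # fixed seed so that the run is reproducible
n = 10                     # size of each random sample

# Take five separate random samples and record the observed mean of each.
means = [sample_mean(draw_sample(rng, n)) for _ in range(5)]

for k, m in enumerate(means, start=1):
    print(f"sample {k}: x-bar = {m:.2f}")
```

Each printed value is a different observation \(\bar{x}\) of the same random variable \(\bar{X}\), even though every sample comes from the same parent distribution.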

Because we must distinguish the random variable \(\bar{X}\) from its observed value \(\bar{x}\), we refer to the random variable as the estimator \(\bar{X}\) and to the observed value as the estimate \(\bar{x}\); note the use of upper and lower case. Since both may be called the 'sample mean', we need to be careful about which of the two is meant in a given context.

This is exactly parallel to the situation in the module Inference for proportions, in which the 'sample proportion' may refer to the estimator \(\hat{P}\), which is a random variable, or to an observed value of this random variable, the estimate \(\hat{p}\).

In summary: The sample mean \(\bar{X}\) is a random variable.

We now explore this important fact in some detail.
