Assumed knowledge

Motivation

In earlier years, students have seen different ways in which a 'sample' of data might arise or be used: surveying a sample of people, taking measurements on a sample of other objects (animals, trees, schools, companies and so on), conducting an experiment on a sample involving the random allocation of its members to different groups, and making observations on different groups or samples. This is covered by the series of modules Data investigation and interpretation (Years F–10).

The purpose of learning about random samples here is to understand how they justify relying on sample data to provide quantitative information about the population from which the sample was taken. This matters because, very often, the questions we ask are general in nature:

In these examples, it is impractical (impossible!) to find an exact answer, that is, to find the exact value of the quantity of interest in the population (such as the proportion of women preferring candidate J as Prime Minister). Instead, we can obtain an estimate based on a sample.

There are many reasons for using samples. Most often, the cost in time and effort prohibits gathering information from the entire population of interest. It can also be easier to ensure that the information collected is of high quality when working with a sample. Importantly, with high-quality data collection methods and appropriate ways of selecting a sample, we can obtain accurate information about a population. In some cases, we want to infer a property of a population based on a sample from that population: for example, the proportion of Australians who prefer candidate J. In other cases, we may wish to compare the properties of two populations: for example, the proportions of Australian men and women preferring candidate J.
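The idea of estimating a population proportion from a sample can be illustrated with a small simulation. The following Python sketch is only an illustration under stated assumptions: the population of 100,000 voters, the figure of 42% preferring candidate J, the sample size of 1,000 and the function name sample_proportion are all hypothetical and do not come from real data.

```python
import random


def sample_proportion(population, sample_size, rng):
    """Estimate the proportion of 1s in `population` from a simple random sample."""
    # rng.sample draws without replacement, so no member is chosen twice.
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size


# Hypothetical population (an assumption for illustration only):
# 100,000 voters, of whom 42,000 prefer candidate J (coded 1) and the rest do not (coded 0).
N = 100_000
n_prefer = 42_000
population = [1] * n_prefer + [0] * (N - n_prefer)
true_proportion = n_prefer / N

rng = random.Random(2024)  # fixed seed so that the run is repeatable
estimate = sample_proportion(population, 1_000, rng)

print(f"Population proportion: {true_proportion:.3f}")
print(f"Estimate from a random sample of 1000: {estimate:.3f}")
```

In a real survey the population proportion is unknown, and only the estimate is available. Here the population is constructed artificially so that the two values can be compared; running the sketch with different seeds shows the estimate varying from sample to sample.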

Populations are rarely static; it may not be possible to capture an entire population because it extends into the future. In asking about vitamin D levels in Australian newborns, it is likely that we want to draw a general conclusion that applies to newborns born today as well as those born tomorrow and in the future. Often we envisage that the conclusion drawn will be relevant to all humans, including humans in the future. This involves making an assumption about the stability of the world and its patterns.
