Errors in hypothesis testing
The structure and logic of testing a null hypothesis need to be kept in mind when interpreting the \(P\)-value. Here we discuss some common errors in interpretation, and a potential error in the inference itself.
An error in inference
Consider interpreting the scenario discussed in "Interpreting the \(P\)-value" (see Figure 2) as indicating that the true mean weight is not 95 g. Figure 2 reminds us of an important point: observing a sample mean of 95.6 g, or one even further from 95 g, is relatively unlikely if the true mean is 95 g, but it is not impossible. Unusual sample means will arise from time to time. So interpreting extreme results (relative to the null hypothesis) as indicating that the true mean weight is not 95 g runs a (small) risk of being wrong. This is an error in the inference made. This might seem a little worrying, and of course we cannot know whether we have made this error, because the true population mean is unknown to us. We simply know that this error is a possibility if we choose to believe the true mean weight is not 95 g. However, in making inferences about population parameters, we should not rely on the \(P\)-value alone. Inference is best based on considering both the \(P\)-value and the confidence interval.
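The point that unusual sample means arise from time to time can be illustrated with a small simulation. The sketch below assumes the null hypothesis is true (mean 95 g) and that individual can weights are normally distributed with a standard deviation of 1.2 g. That standard deviation is a hypothetical value chosen for illustration, as are the seed and the number of simulated samples; none of them come from the text.

```python
import random
from statistics import fmean

TRUE_MEAN = 95.0     # null hypothesis value in grams (from the text)
N_CANS = 25          # cans per sample (from the text)
CAN_SD = 1.2         # ASSUMPTION: hypothetical SD of individual can weights, in grams
N_SAMPLES = 20_000   # ASSUMPTION: arbitrary number of simulated samples

rng = random.Random(2024)  # fixed seed so the run is repeatable


def sample_mean():
    """Mean weight of one random sample of cans when the true mean is 95 g."""
    return fmean(rng.gauss(TRUE_MEAN, CAN_SD) for _ in range(N_CANS))


# Count the samples whose mean lies at least 0.6 g from 95 g, in either direction.
extreme = sum(abs(sample_mean() - TRUE_MEAN) >= 0.6 for _ in range(N_SAMPLES))
false_alarm_rate = extreme / N_SAMPLES

print(f"Proportion of samples at least 0.6 g from 95 g: {false_alarm_rate:.4f}")
```

Under these assumptions, only a small proportion of samples (on the order of one in a hundred) give a mean this far from 95 g, even though the true mean really is 95 g. Concluding "the true mean is not 95 g" from any one of those samples would be an error in inference.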
Here, for example, is how we might interpret the hypothesis test and confidence interval for the mean weight of the cans of tuna, for the data considered in Figure 2. A mean weight of 95.6 g was observed for a random sample of 25 cans of tuna. The \(P\)-value for a test of the hypothesis that the population mean weight is 95 g was 0.012. This is the probability of observing a sample mean at a distance of 0.6 g or more from a population mean of \(\mu = 95\) g, if that is the true value of \(\mu\). The observation is surprising if the true mean is 95 g. A 95% confidence interval suggests that plausible values of the true mean weight are between 95.1 g and 96.1 g, a range that does not include 95 g.
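As a rough check that the reported \(P\)-value and confidence interval tell a consistent story, the sketch below works backwards from the \(P\)-value. The standard error is not stated in this section, so it is recovered here under the assumption that the sample mean follows a normal distribution; a calculation based on the \(t\) distribution would give slightly different intermediate values. It uses `NormalDist` from Python's standard `statistics` module.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal distribution

mu0 = 95.0          # hypothesised population mean in grams (from the text)
xbar = 95.6         # observed sample mean in grams (from the text)
p_reported = 0.012  # two-sided P-value (from the text)

# ASSUMPTION: normal sampling distribution for the sample mean.
# Recover the test statistic and the standard error implied by the P-value.
z = std_normal.inv_cdf(1 - p_reported / 2)
se = (xbar - mu0) / z

# Two-sided P-value recomputed from the implied standard error
# (a consistency check on the arithmetic, not new evidence).
p_value = 2 * (1 - std_normal.cdf(abs(xbar - mu0) / se))

# 95% confidence interval built from the same standard error.
z95 = std_normal.inv_cdf(0.975)
ci_low = xbar - z95 * se
ci_high = xbar + z95 * se

print(f"implied standard error: {se:.3f} g")
print(f"two-sided P-value: {p_value:.3f}")
print(f"95% CI: {ci_low:.1f} g to {ci_high:.1f} g")
```

To one decimal place, the interval obtained this way matches the 95.1 g to 96.1 g quoted above, and it excludes 95 g, in line with a \(P\)-value below 0.05.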