Assignment II for Biostats Course VHM 801 at AVC - Fall semester 2021
The assignment is worth 10% of the final course mark. Please be aware that by handing
in the home assignment you implicitly acknowledge that you have read and accepted
the instructions for home assignments described
on the VHM 801 homepage.
A laboratory routinely carries out measurements of the fibre content (in
percent) in soy-bean cakes (animal feed).
Two laboratory technicians, one experienced and one relatively inexperienced in this
particular type of analysis, have analysed (different) samples of a
particular batch submitted for testing.
Their measurements are given in the listing below (also available in Minitab format and as a comma-separated file, for
import into Stata and other statistical software). The technicians are
labelled 1 and 2, and it is not revealed which of them has more
experience.
Laboratory technician 1:
12.28 12.53 12.25 12.37 12.48 12.58 12.43 12.30 12.46 12.43
Laboratory technician 2:
12.25 12.45 12.31 12.41 12.30 12.20 12.25 12.64 12.26 12.73 12.42 12.17 12.09 12.23
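
For readers working outside Minitab or Stata, the listing above can also be typed in directly. The following is a minimal sketch in Python (an assumption; the assignment does not prescribe a language) using only the standard library. The names `tech1`, `tech2` and `summarise` are illustrative and not part of the course material.

```python
from statistics import mean, stdev

# Fibre content (%) exactly as listed in the assignment.
tech1 = [12.28, 12.53, 12.25, 12.37, 12.48, 12.58, 12.43, 12.30, 12.46, 12.43]
tech2 = [12.25, 12.45, 12.31, 12.41, 12.30, 12.20, 12.25, 12.64, 12.26, 12.73,
         12.42, 12.17, 12.09, 12.23]


def summarise(values):
    """Return (n, sample mean, sample standard deviation) for one technician."""
    return len(values), mean(values), stdev(values)


for label, values in (("Technician 1", tech1), ("Technician 2", tech2)):
    n, m, s = summarise(values)
    print(f"{label}: n = {n}, mean = {m:.3f}, sd = {s:.3f}")
```

These summaries are only a starting point; which intervals and tests to build from them, and under which assumptions, is left to the questions below.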
The assignment has five questions to be answered. Make sure to include
in your answers information about your model assumptions whenever that is
relevant.
-
For each of the technicians, compute a 99% confidence interval for the
fibre content in the batch, and interpret your interval(s)
carefully. Include in your interpretation whether you consider the
interval(s) as approximate or exact.
-
The (mean) fibre content for the batch was stated by the producer to be 12.36%.
Based on the results for each technician, give a statistical assessment
of whether each technician's measurements seem to be in agreement with the producer's value.
State your conclusion(s) carefully.
-
There was concern that the less experienced technician could have a bias in the
measurements, that is, a higher or lower measured fibre content (beyond chance variation)
than a more experienced technician.
Carry out a statistical test to determine whether
there is evidence of such a bias in the data.
-
Further investigation of the two technicians' measurements was based on
the expected range (or interval) for a single measurement. For this purpose, we
will assume the producer's stated mean content (of 12.36%) to be correct and, based on past experience,
that laboratory measurements of this type are approximately
normally distributed with a laboratory error (standard deviation) of 0.10%.
Use these assumptions to compute a 95% range (interval) for any single
measurement, valid for both technicians. That is, with the assumptions
made about the distribution of measurements, there should be a 95%
probability that any (new) measurement falls inside the interval.
Next, determine for each technician the number of their actual measurements falling
outside the expected interval. Finally, use these counts to test (separately for each
technician) whether the measurements agree with the model/assumptions, in the sense that the proportion
of measurements outside the interval agrees (beyond chance variation) with what would
be expected from the model.
(Hint: First determine
the distribution of the number of measurements that fall outside the interval, if the
stated model is correct. Then set up a statistical test (with suitable hypotheses) in that
distribution, and compute a P-value using the general principles of statistical tests.)
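
The mechanics described in this question and its hint can be sketched as follows. This is a rough illustration in Python (standard library only), not a model answer: the values of `mu`, `sigma` and `readings` are made up and deliberately differ from the assignment's, and all function names are hypothetical. It assumes that, under the stated model, the number of measurements outside the range is binomial with n trials and probability 1 - coverage. The P-value convention used (summing the probabilities of all outcomes no more likely than the observed one) is one common choice among several.

```python
from math import comb
from statistics import NormalDist


def central_range(mu, sigma, coverage=0.95):
    """Interval that a single normal measurement falls inside with probability `coverage`."""
    tail = (1 - coverage) / 2
    dist = NormalDist(mu, sigma)
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


def count_outside(values, low, high):
    """Number of values falling outside the interval [low, high]."""
    return sum(1 for v in values if v < low or v > high)


def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_p_value(k_obs, n, p):
    """Sum the probabilities of all outcomes no more likely than the observed count."""
    p_obs = binom_pmf(k_obs, n, p)
    probs = (binom_pmf(k, n, p) for k in range(n + 1))
    # Small relative tolerance guards against floating-point ties.
    return sum(q for q in probs if q <= p_obs * (1 + 1e-9))


# Hypothetical illustration values, NOT the assignment data.
low, high = central_range(mu=50.0, sigma=2.0, coverage=0.95)
readings = [46.5, 50.1, 49.2, 54.3, 51.0, 48.8]
k = count_outside(readings, low, high)
p_value = binom_p_value(k, n=len(readings), p=0.05)
print(f"range = ({low:.2f}, {high:.2f}), outside = {k}, P = {p_value:.3f}")
```

With small samples only a few counts are possible, so the attainable P-values are coarse; this is worth keeping in mind when stating the conclusion.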
-
Summarise the results of your investigations in the questions above into a
conclusion about the performance of each technician. In this part, also
include any other features of the distributions of the measurements by the
technicians that you find relevant. You may also try to guess which of the two
technicians is the less experienced one (note: you are not required to guess
correctly to receive a full mark for this question).
Henrik Stryhn
(hstryhn@upei.ca) 2021-10-17