This assignment is assessed. Your work must be submitted by 11:59pm on April 24th 2023.
All the exercises in this assignment use a dataset on dairy production in New Zealand, covering the 2014/15 to 2019/20 seasons. The following variables are recorded.
| Variable | Description |
|---|---|
| Region | Region of New Zealand. |
| Island | Which island the region is in (North, South). |
| Year | Year the season started. 2014–2019. |
| Month | Month number during the season. 1 (June) – 12 (May). |
| Milk | Milk per cow per day (litres). |
| Milkfat | Milk fat per cow per day (kg). |
| Protein | Protein per cow per day (kg). |
| SCC | Somatic cell count (thousands of cells/ml). |
Start by downloading the project file: right-click the link below and choose Save File As…
https://www.massey.ac.nz/~jcmarsha/227212/project.Rmd
Then load the file into RStudio.
Before you start, make sure you can Knit this document to produce an HTML file from it.
In the first code block there is some code that produces a plot of milk per cow per day for each Month. Improve the axis labels on this plot and describe any observed trends. The `MonthLabel` column may be useful.
Produce a plot of milk per cow per day by Year. Is there evidence that milk yields are increasing?
In the plot of milk by month, you should have noticed a number of outliers in May. Which region are these from? Ensure you clearly describe how you found your answer.
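One way to track down which rows sit behind unusual points is to filter to the month in question and sort by the plotted variable. The sketch below uses a tiny made-up data frame called `toy` (a hypothetical name, with placeholder region labels and invented values), so it only illustrates the approach and says nothing about the real data.

```r
# Made-up data: placeholder regions "A", "B", "C" and invented milk values.
toy <- data.frame(
  Region = c("A", "A", "B", "B", "C", "C"),
  Month  = c(11, 12, 11, 12, 11, 12),
  Milk   = c(12, 10, 13, 11, 12, 2)
)

# Month 12 corresponds to May in this dataset's season numbering.
may <- subset(toy, Month == 12)

# Sort so the most extreme values appear at the top or bottom of the output.
may_sorted <- may[order(may$Milk), ]
may_sorted
```

Whichever approach you use, describe the steps in your answer so a reader can follow how you reached it.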
Produce a plot showing milk per cow per day by Region. Ideally we'd also show which region is in which Island. Options include colouring by island, or using `facet_wrap` to split by island. The `scales` parameter of `facet_wrap` might be useful should you choose to do a faceted plot.
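To illustrate what the `scales` parameter does, here is a minimal sketch on made-up data. The data frame name `toy` and its values are assumptions for illustration only; the column names follow the variable table above. With `scales = "free_y"` each panel keeps only the regions that belong to its island, rather than listing every region in both panels.

```r
library(ggplot2)

# Made-up values; only the column names match the real dataset.
set.seed(1)
toy <- data.frame(
  Region = rep(c("Northland", "Waikato", "Canterbury", "Otago"), each = 20),
  Island = rep(c("North", "South"), each = 40),
  Milk   = rnorm(80, mean = 15, sd = 3)
)

# Region on the y axis, one panel per island.
# scales = "free_y" drops regions that have no data in a given panel.
p <- ggplot(toy, aes(x = Milk, y = Region)) +
  geom_boxplot() +
  facet_wrap(vars(Island), ncol = 1, scales = "free_y") +
  labs(x = "Milk per cow per day (litres)", y = NULL)
p
```

Try the same plot without `scales = "free_y"` to see the difference.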
Create a plot to show the distribution of somatic cell counts for North and South islands. Describe the distribution and comment on whether you think there is evidence for differences between the islands.
Try repeating the above using a log transformation of the somatic cell count. Describe what you see. Which graph do you prefer?
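There are two common ways to apply a log transformation in a ggplot2 plot, sketched below on made-up, right-skewed data (the data frame `toy` and its values are assumptions, not the real dataset). The first keeps the variable in its original units and changes only the axis; the second transforms the variable itself. The density plot is just one possible choice of geometry.

```r
library(ggplot2)

# Made-up, right-skewed values standing in for somatic cell counts.
set.seed(2)
toy <- data.frame(
  Island = rep(c("North", "South"), each = 100),
  SCC    = rlnorm(200, meanlog = log(180), sdlog = 0.4)
)

# Option 1: original units, log10-scaled axis.
p_axis <- ggplot(toy, aes(x = SCC, fill = Island)) +
  geom_density(alpha = 0.5) +
  scale_x_log10() +
  labs(x = "Somatic cell count (thousands of cells/ml, log scale)")

# Option 2: transform the variable first, then plot as usual.
toy$logSCC <- log(toy$SCC)
p_var <- ggplot(toy, aes(x = logSCC, fill = Island)) +
  geom_density(alpha = 0.5) +
  labs(x = "log(SCC)")
```

Option 1 keeps the tick labels in the original units, which can make the graph easier to read.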
Produce 95% confidence intervals to summarise the average somatic cell count in North and South Islands. Is there evidence of a difference between the islands?
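As a reminder of how a t-based confidence interval for a mean is built, the sketch below computes one per group on made-up data. The helper `mean_ci` and the data frame `toy` are hypothetical names introduced for illustration. Other approaches, such as calling `t.test` on each group, give the same interval.

```r
# Made-up values; only the column names match the real dataset.
set.seed(3)
toy <- data.frame(
  Island = rep(c("North", "South"), each = 50),
  SCC    = c(rlnorm(50, log(190), 0.3), rlnorm(50, log(170), 0.3))
)

# t-based confidence interval for the mean of a numeric vector.
mean_ci <- function(x, level = 0.95) {
  n    <- length(x)
  se   <- sd(x) / sqrt(n)                          # standard error of the mean
  half <- qt(1 - (1 - level) / 2, df = n - 1) * se # half-width of the interval
  c(mean = mean(x), lower = mean(x) - half, upper = mean(x) + half)
}

# One row per island, with columns mean, lower, upper.
ci <- t(sapply(split(toy$SCC, toy$Island), mean_ci))
ci
```

Comparing the two intervals informally is a useful first look, but it is not the same thing as a formal test of the difference.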
Produce plots to compare the distribution of somatic cell counts between North and South Islands and between months. Summarise what you see.
Produce a plot to show the relationship between milk fat and milk volume. Does the relationship differ between the North and South Island?
Perform a statistical test to determine if the average milk volume per cow per day differs between North and South Island. Clearly state the hypothesis being tested, any P-values and your conclusion, and quantify the magnitude of any difference.
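One possible test for comparing two group means is a two-sample t-test. The sketch below runs one on made-up data to show where the pieces you need to report can be found. The data frame `toy` and its values are assumptions. You should decide for yourself which test suits the real data.

```r
# Made-up values; only the column names match the real dataset.
set.seed(4)
toy <- data.frame(
  Island = rep(c("North", "South"), each = 60),
  Milk   = c(rnorm(60, mean = 14, sd = 3), rnorm(60, mean = 16, sd = 3))
)

# Null hypothesis: mean Milk is the same in both islands.
# t.test() uses the Welch (unequal variance) version by default.
res <- t.test(Milk ~ Island, data = toy)

res$p.value   # P-value for the test
res$estimate  # sample mean in each group
res$conf.int  # 95% CI for the difference in means (North minus South)
```

The confidence interval for the difference is one way to quantify the magnitude of any difference, alongside the P-value.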