We’ll once again be using the student roll data from:
https://www.educationcounts.govt.nz/statistics/school-rolls
Start by downloading labA06.Rmd and loading it into RStudio:
https://www.massey.ac.nz/~jcmarsha/161122/labs/labA06.Rmd
We’ll be using the dplyr package in the
tidyverse to manipulate the data.
Today we’ll be looking at the group_by and
summarise functions, which really show off the power of
dplyr.
The summarise function allows you to compute a summary
(a single number) from all rows.
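
As a minimal sketch of summarise on its own, the example below uses a small made-up data frame in place of the real roll data. The name roll and the columns School and Students are assumptions for illustration; only EthnicGroup is mentioned in this lab.

```r
library(dplyr)

# Toy stand-in for the roll data (hypothetical values and column names)
roll <- data.frame(
  School      = c("A", "A", "B", "B"),
  EthnicGroup = c("Asian", "Pacific", "Asian", "Pacific"),
  Students    = c(20, 10, 40, 30)
)

# Collapse all rows into a single summary row
total <- roll %>%
  summarise(Total = sum(Students))

print(total)
```

The result is a data frame with one row and one column, Total, holding the sum over every row.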
The group_by function allows you to perform operations
per group. For example, using group_by(EthnicGroup) followed by
summarise means the summaries are computed for each
ethnic group, rather than across the whole dataset.
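
The grouped version might look something like the sketch below. Again the data frame is a made-up stand-in, and the Students column is an assumed name rather than one taken from the real data.

```r
library(dplyr)

# Toy stand-in for the roll data (hypothetical values and column names)
roll <- data.frame(
  School      = c("A", "A", "B", "B"),
  EthnicGroup = c("Asian", "Pacific", "Asian", "Pacific"),
  Students    = c(20, 10, 40, 30)
)

# One summary row per ethnic group instead of one row overall
by_group <- roll %>%
  group_by(EthnicGroup) %>%
  summarise(Total = sum(Students))

print(by_group)
```

Here the output has one row per value of EthnicGroup, so the same sum is computed separately within each group.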
We’ll also look at how to produce ‘small multiple’ plots with
ggplot2. These allow you to split a dataset into groups and
plot a small chart for each group, sharing the same axes and setup so
that you can compare how things change from group to group.
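
One way to get small multiples in ggplot2 is facet_wrap, sketched below on made-up data. The Year and Students columns are assumptions for illustration, and the exercises in the lab file may use a different faceting function or layout.

```r
library(ggplot2)

# Toy stand-in for the roll data (hypothetical values and column names)
roll <- data.frame(
  Year        = rep(2020:2022, times = 2),
  EthnicGroup = rep(c("Asian", "Pacific"), each = 3),
  Students    = c(20, 25, 30, 10, 12, 15)
)

# One panel per ethnic group; by default the panels share the same axes
p <- ggplot(roll, aes(x = Year, y = Students)) +
  geom_line() +
  facet_wrap(vars(EthnicGroup))

print(p)
```

Because the panels share axes by default, differences between groups can be read directly by comparing across panels.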
Read through the labA06.Rmd file and work on the
exercises within.