---
title: "161.777 Practical Data Mining"
subtitle: "Project 2025"
author: "YOUR NAME GOES HERE"
output: html_document
---

```{r setup, echo=TRUE, warning=FALSE, message=FALSE}
# Add any other packages you need to load here.
library(tidyverse)
library(arules)

# Read in the data
ice.train <- read_csv("https://www.massey.ac.nz/~jcmarsha/161777/assessment/data/ice-train.csv")
ice.test <- read_csv("https://www.massey.ac.nz/~jcmarsha/161777/assessment/data/ice-test.csv")
campy.train <- read_csv("https://www.massey.ac.nz/~jcmarsha/161777/assessment/data/campy-train.csv") |>
  mutate(Source = factor(Source))
campy.test <- read_csv("https://www.massey.ac.nz/~jcmarsha/161777/assessment/data/campy-test.csv")
wv_survey <- read_csv("https://www.massey.ac.nz/~jcmarsha/161777/assessment/data/wv_survey2014.csv")
flags <- read.transactions("https://www.massey.ac.nz/~jcmarsha/161777/assessment/data/flags.txt", cols=1)
```

## Exercise 1: Predicting ice thickness

```{r}

```

### 1.1 Ice Thickness Methodology

WRITE UP YOUR METHODOLOGY HERE

## Exercise 2: Predicting the source of *campylobacter*

```{r}

```

### 2.1 Campylobacter Methodology

WRITE UP YOUR METHODOLOGY HERE

## Exercise 3: Clustering the New Zealand World Values Survey

### 3.1 Initial exploration of the dataset

```{r}

```

### 3.2 Normalised dataset

```{r}

```

### 3.3 Dendrogram

```{r}

```

### 3.4 k-means

```{r}

```

### 3.5 Visualisation of hierarchical clustering

```{r}

```

### 3.6 Visualisation of k-means clusters

```{r}

```

## Exercise 4

### 4.1 Frequency of colour use

```{r}

```

### 4.2 Number of rules

```{r}

```

### 4.3 Highest support rules

```{r}

```

### 4.4 Lift greater than 2

```{r}

```

### 4.5 Choiseul independence flag

```{r}

```