Week 3

Data wrangling

This week focuses on bringing data into R and preparing it for analysis: importing data from common sources, diagnosing data structure, transforming variables, tidying messy datasets, and joining multiple tables into an analysis-ready form that supports visualization and modeling.

Jan 27: Data import and transformation

This session will introduce the foundations of data import and transformation in R using the tidyverse. You will learn how to bring data into R from common formats such as CSV files and spreadsheets, understand how R represents data internally, and diagnose common import issues related to variable types and structure. Drawing on principles from R for Data Science, the session will focus on the concept of tidy data and the core transformation verbs used to filter, select, mutate, summarize, and reshape datasets. By the end of the session, you will be able to import raw data, transform it into a usable analytic form, and prepare datasets for visualization and modeling.
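As a rough preview of the import-then-transform workflow described above, here is a minimal sketch using readr and dplyr. The dataset, its column names (`site`, `year`, `count`), and the `n/a` missing-value marker are invented for illustration and are not course data. The sketch assumes readr 2.0 or later (for literal data via `I()`) and R 4.1 or later (for the native pipe `|>`).

```r
library(readr)
library(dplyr)

# Hypothetical inline CSV standing in for a raw file on disk
raw_csv <- I("site,year,count
A,2023,12
A,2024,n/a
B,2023,7
B,2024,9
")

# Declaring "n/a" as a missing value lets `count` parse as a number
# instead of falling back to character
surveys <- read_csv(raw_csv, na = c("", "NA", "n/a"), show_col_types = FALSE)

# Inspect column types to diagnose import problems
glimpse(surveys)

# Core transformation verbs: filter rows, select columns,
# add a derived column, then summarize by group
site_summary <- surveys |>
  filter(!is.na(count)) |>
  select(site, year, count) |>
  mutate(is_recent = year >= 2024) |>
  group_by(site) |>
  summarize(mean_count = mean(count), n_years = n())

site_summary
```

Without the `na` argument, the `n/a` entry would likely cause `count` to be read as text, which is the kind of variable-type issue `glimpse()` helps surface.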

Prepare

Jan 29: Tidy and relational data

This session will focus on tidying and combining data to support analysis in R using the tidyverse. You will learn how to diagnose untidy data structures, reshape datasets into tidy form, and understand why tidy data underpins effective visualization, modeling, and reproducible workflows. Drawing on R for Data Science and Hadley Wickham’s foundational paper on tidy data, the session will introduce key tools for pivoting, separating, and uniting variables, as well as joining multiple datasets together in principled ways. By the end of the session, you will be able to restructure messy data and combine related tables into a single, analysis-ready dataset.
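To make the tidy-then-join idea above concrete, here is a small sketch using tidyr and dplyr. Both tables (`cases_wide`, `populations`) and all of their values are hypothetical, made up only to show the shape of the operations. It assumes R 4.1 or later for the native pipe.

```r
library(tidyr)
library(dplyr)

# Hypothetical untidy table: the year variable is spread across column names
cases_wide <- tibble(
  country = c("X", "Y"),
  `2023` = c(10, 20),
  `2024` = c(15, 25)
)

# Pivot so each row is one country-year observation
cases_long <- cases_wide |>
  pivot_longer(
    cols = c(`2023`, `2024`),
    names_to = "year",
    values_to = "cases"
  ) |>
  mutate(year = as.integer(year))

# Hypothetical second table that shares the `country` key
populations <- tibble(
  country = c("X", "Y"),
  population = c(1000, 4000)
)

# Join on the shared key, then compute a derived variable
combined <- cases_long |>
  left_join(populations, by = "country") |>
  mutate(rate = cases / population)

combined
```

Naming the key explicitly in `by` avoids relying on dplyr's guess about which columns match, which tends to matter more as tables grow.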

Prepare

⬇️: Tidying data cheatsheet
📖: R4DS Chapter 5
📓: Wickham, H. (2014). Tidy data. Journal of Statistical Software, 59(10), 1–23.
📖: R4DS Chapter 19

TidyTuesday

Drops on Monday, due the following Sunday