This repository provides materials for a session of the I2DS Tools for Data Science workshop, held at the Hertie School, Berlin, in November 2022. The student-run workshop is part of the course Introduction to Data Science, taught by Simon Munzert at the Hertie School in Fall 2022.
This session introduces you to cleaning and exploring data with the janitor package in R. Data cleaning and exploration are core steps in the data science workflow. janitor offers simple but powerful functions for cleaning and examining data, designed with user-friendliness in mind.
The goals of this session are to (1) introduce you to janitor's functions for cleaning data, (2) introduce you to its functions for exploring data, and (3) provide you with practice material and some further resources.
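As a rough illustration of the two kinds of functions mentioned above, the sketch below cleans a small data frame and then explores it. The data frame `raw` and its column names are hypothetical and made up for this example; they are not taken from the session's practice material. The sketch assumes the janitor package is installed.

```r
library(janitor)

# Hypothetical toy data: messy column names, one fully empty row,
# one fully empty column, and one duplicated record
raw <- data.frame(
  "First Name"  = c("Ana", "Ben", NA, "Ana"),
  "AGE (years)" = c(31, 27, NA, 31),
  "Empty Col"   = c(NA, NA, NA, NA),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Cleaning: standardise column names to snake_case,
# then drop rows and columns that contain only NA
clean <- clean_names(raw)
clean <- remove_empty(clean, which = c("rows", "cols"))
names(clean)

# Exploring: frequency table of one variable
tabyl(clean, first_name)

# Exploring: rows that are duplicated across all columns
dupes <- get_dupes(clean)
dupes
```

`clean_names()` and `remove_empty()` cover the cleaning side, while `tabyl()` and `get_dupes()` cover the exploring side. When `get_dupes()` is called without column names, it checks all columns and prints a message saying so.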
- Abigail Pena Alejos
- Nikolina Klatt
Original sources:
Further resources:
- exploringdata.org - How to Clean Data: {janitor} Package
- towardsdatascience.com - Cleaning and Exploring Data with the “janitor” Package
- jenrichmond.rbind.io - Cleaning penguins with the janitor package
The material in this repository is made available under the MIT license.
Abigail Pena Alejos prepared the practice material.
Nikolina Klatt prepared the presentation and the video.