This repo was created in support of the IDS workshop 2022. The authors will present data wrangling at scale with data.table

intro-to-data-science-22-workshop/05-data.table-Smolica-Oueslati

Working with data.table

Summary

This repository provides materials for a session that is part of the I2DS Tools for Data Science workshop run at the Hertie School, Berlin in November 2022. The student-run workshop is part of the course Introduction to Data Science taught by Simon Munzert at the Hertie School, Berlin, in Fall 2022.

Session contents

This session will help you dive into data wrangling with data.table, an R package that provides an enhanced version of R's data.frame. Essentially, data.table is a Swiss Army knife for the entire suite of data wrangling tasks. It is also strongly performance-oriented, which makes it fast and memory-efficient; on large datasets in particular, it is typically among the fastest tools available for comparable tasks. In the session, we will introduce the mechanics of data.table along a logical sequence of data wrangling tasks. While a new syntax can seem intimidating at first, it is well worth picking up some data.table basics if you plan to work with big data in R.
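To give a flavour of the syntax, here is a minimal sketch of the general `DT[i, j, by]` form: filter rows in `i`, compute in `j`, and group with `by`. The `flights` table and its columns (`carrier`, `origin`, `delay`) are made-up toy data for illustration only and are not taken from the session materials.

```r
library(data.table)

# Hypothetical toy data; names and values are invented for this sketch.
flights <- data.table(
  carrier = c("AA", "AA", "UA", "UA", "DL"),
  origin  = c("JFK", "LGA", "JFK", "JFK", "LGA"),
  delay   = c(10, -5, 30, 0, 15)
)

# i: keep only the rows where origin is "JFK"
jfk <- flights[origin == "JFK"]

# j + by: mean delay per carrier; .() is data.table's alias for list()
mean_delay <- flights[, .(mean_delay = mean(delay)), by = carrier]

# := adds a column by reference, i.e. without copying the table
flights[, late := delay > 0]

print(mean_delay)
```

Because `:=` modifies `flights` in place, no reassignment is needed. Avoiding copies in this way is one reason data.table tends to be memory-efficient on large tables.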

Main learning objectives

This session has five learning objectives: (1) grasp the use cases and strengths of data.table; (2) understand data.table's general semantics; (3) learn to use data.table across different data wrangling tasks; (4) apply what you have learned through practical exercises; and (5) know how to continue learning independently.

Instructors

Further resources

License

The material in this repository is made available under the MIT license.

Statement of contributions

Gresa Smolica prepared the script and practice materials.

Amin Oueslati prepared the presentation and the script.
