Please publish your presentation materials in a repo that you create in our workshop organization at https://github.com/intro-to-data-science-22-workshop. Repo names follow a convention: they should indicate the workshop number (see below), your name (one group member only), and a very brief description, all lowercase and separated by dashes, e.g. `01-topic-lastname1-lastname2/`. For more information about your task, check out the document `workshop-guidelines.pdf` in this repository.
Please try to make your presentations using R Markdown. You can use any one of the multiple slide deck options. (For what it's worth, I use the xaringan package with a modified metropolis theme for my lecture slides.) Or you can output as a GitHub document or HTML document. If you choose the latter, please include `keep_md: true` in your YAML header, so that the output is readable directly on GitHub.
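
As a minimal sketch, the YAML header of an R Markdown file with HTML output might look like the following. The title is a placeholder, not something prescribed here; the only part taken from the instructions above is `keep_md: true`, set under `html_document`.

```yaml
---
title: "Workshop presentation"  # placeholder title, replace with your own
output:
  html_document:
    keep_md: true  # keep the intermediate .md file so GitHub can render it
---
```

With this option, knitting should leave a `.md` file next to the `.html` output. Commit both to the repo.
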
There are multiple ways to record your computer screen and voice for the presentation videos. You can record MS PowerPoint with audio and video, record a presentation on Zoom, record a presentation via MS Teams, or use the open-source software OBS Studio. Just make sure the format is mp4 and the entire thing is no longer than 15 minutes! If you need to convert between video formats, I recommend the open-source video transcoder HandBrake. For minimal cutting, you might want to use the lightweight LosslessCut.
Topics will be randomly allocated to groups of 2 students. Both of you should contribute to both the presentation and the practice session, but you can divide main responsibilities.
Workshop | Focus | Topic | Resources |
---|---|---|---|
01 | Data wrangling | Creating web APIs with plumber | a, b |
02 | Data wrangling | Cleaning dirty data with janitor | a, b |
03 | Data wrangling | Categorical variables with forcats | a, b |
04 | Data wrangling | Dates and times with lubridate | a, b |
05 | Data wrangling | Wrangling data at scale with data.table | a, b |
06 | Data wrangling | String manipulation with stringr | a, b |
07 | Visualization | Network graphs with ggraph and tidygraph | a, b |
08 | Visualization | Interactive maps with leaflet | a, b |
09 | Visualization | Interactive graphics with plotly | a, b |
10 | Data analysis | Text analysis with quanteda | a, b |
11 | Data analysis | Tidying text data with tidytext | a, b |
12 | Data analysis | Coordinate reference systems with sf | a, b |
13 | Data analysis | Geocoding with sf | a, b |
14 | Data analysis | Temporal data with tsibble and fable | a, b |
15 | Programming | Measuring and improving performance | a, b |
16 | Programming | Parallel programming with future | a, b |
17 | Workflow | Ensuring reproducibility with renv | a, b |
18 | Workflow | Establishing pipelines with targets | a, b |
19 | Workflow | Creating R packages | a, b |
20 | Publication | Publishing with Quarto | a, b |
21 | Publication | Publishing websites with GitHub Pages | a, b |