- Notes:
- Tentative calendar (weekly topics), subject to change depending on the pace of the course.
- Notes (📁) cover material discussed in class.
- Reading (📖) covers material that expands on lecture topics, as well as coding examples that you should practice on your own.
- Misc (📰) is supporting material that is worth a look.
- 📇 Dates: Jan 17-19
- 📎 Topics: Introduction, course in a nutshell, and policies/logistics. Please spend some time outside class reviewing the course policies, the Piazza etiquette rules, and the FAQs.
- 📁 Notes:
- About the Course (slides)
- Introduction: Big Picture (slides)
- 📖 Reading:
- 🔬 Lab: No lab
- 📰 Misc:
- 🔈 To Do:
- 📇 Dates: Jan 22-26
- 📎 Topics: First things first: we begin with some basic survival skills for R, followed by an overall review of the RStudio workspace. Then we move on to discuss basic data types and their implementation in R, which is built around vectors. We also cover fundamental concepts like atomicity, vectorization, recycling, and subsetting.
- 📁 Notes:
- First contact with R (tutorial)
- Intro to Rmd files (tutorial)
- Data Types and Vectors (slides)
- 📖 Reading:
- www.markdowntutorial.com
- Markdown tutorial (by CommonMark)
- 🔬 Lab:
- 📰 Misc:
- Introduction to R Markdown (by RStudio)
- 💡 Cheat sheet:
- 🎯 WARM-UP 1:
- Markdown practice (due Feb-02)
- 📇 Dates: Jan 29-Feb 02
- 📎 Topics: Review of more data structures like arrays and lists. Discussion of the traditional base graphics approach that is based on R vectors.
- 📁 Notes:
- Arrays and Factors (slides)
- Lists (slides)
- Base Graphics I (slides)
- Base Graphics II (slides)
- 📖 Reading:
- Intro to vectors (tutorial)
- 🔬 Lab:
- 📰 Misc:
- chapter 20: Vectors (R for Data Science by Grolemund and Wickham)
- 💡 Cheat sheet:
- 🎯 WARM-UP 2:
- Vectors and Factors (due Feb-09)
- 📇 Dates: Feb 05-09
- 📎 Topics: Data Analysis Projects (DAPs) are made up of files and directories. Therefore, we need to review some fundamental concepts such as the file system, the command line, and the basics of version control systems.
- 📁 Notes:
- Filesystem Basics (slides)
- Shell Basics (slides)
- Working with files (slides)
- Git Basics (slides)
- 📖 Reading:
- The Unix Shell lessons 1-3 (by Software Carpentry)
- Linux Tutorial lessons 1-5 (by Ryan Chadwick)
- 🔬 Lab:
- 📰 Misc:
- Read sections 4 to 9 in Part I Installation (Happy Git and GitHub for the useR by Jenny Bryan et al.)
- 💡 Cheat sheet:
- 📇 Dates: Feb 12-16
- 📎 Topics: Tables are the most common form in which data is stored, handled, and manipulated. Consequently, we need to talk about the typical storage formats of tabular data, and the relationship between tables and R data frames. In addition, we cover Principal Component Analysis (PCA), an unsupervised learning technique for summarizing the systematic structure of a table consisting of quantitative variables.
- 📁 Notes:
- Data Tables (slides)
- Importing Tables in R (slides)
- Principal Component Analysis 1 (slides)
- Principal Component Analysis 2 (slides)
- 📖 Reading:
- Basic manipulation of Data Frames (slides)
- Organizing data in spreadsheets (by Karl Broman)
- 🔬 Lab:
- 📰 Misc:
- Data Import (R for Data Science by Grolemund and Wickham)
- 💡 Cheat sheet:
- 🎯 HW 1: due Feb-23
- 📇 Dates: Feb 19-23 (Holiday Feb-19)
- 📎 Topics: We continue reviewing manipulation of data frames with an introduction to the data plying framework provided by the package "dplyr". Likewise, we begin reviewing the visualization paradigm of "ggplot2", which is based on data frames.
- 📁 Notes:
- "dplyr" tutorial slides (by Hadley Wickham)
- Grammar of Graphics framework (slides)
- 📖 Reading:
- "ggplot2" lecture (by Karthik Ram)
- 🔬 Lab:
- 📰 Misc:
- tibbles vignette
- Introduction to dplyr (by Hadley Wickham)
- 💡 Cheat sheet:
- 📇 Dates: Feb 26-Mar 02
- 📎 Topics: We continue reviewing more aspects of "dplyr" and the famous pipe operator.
- 📁 Notes:
- Pipes with "dplyr" (tutorial)
- Shell input/output redirection (tutorial)
- Shell filters (tutorial)
- 📖 Reading:
- Pipes (R for Data Science by Grolemund and Wickham)
- 🔬 Lab:
- 📰 Misc:
- Tidy Data (by Hadley Wickham)
- 💡 Cheat sheet:
- 🎯 HW 2: due Mar-09
- 📇 Dates: Mar 05-09
- 📎 Topics: You don’t need to be an expert programmer to be a data scientist, but learning more about programming allows you to automate common tasks, and solve new problems with greater ease. We'll discuss how to write basic functions, the notion of R expressions, and an introduction to conditionals.
- 📁 Notes:
- Creating functions (tutorial)
- Introduction to functions (tutorial)
- Introduction to R expressions and conditionals (tutorial)
- 🔬 Lab:
- 📰 Misc:
- chapter 19: Functions (R for Data Science by Grolemund and Wickham)
- 🎓 MIDTERM 1: Friday Mar-09
- 📇 Dates: Mar 12-16
- 📎 Topics: In addition to writing functions to reduce duplication in your code, you also need to learn about iteration, which helps you when you need to do the same operation several times. Namely, we review control flow structures such as for loops, while loops, repeat loops, and the apply family of functions.
- 📁 Notes:
- Introduction to loops (tutorial)
- More about functions (tutorial)
- Functions (Advanced R by H. Wickham)
- 🔬 Lab:
- 📰 Misc:
- chapter 21: Iteration (R for Data Science by Grolemund and Wickham)
- 🎯 HW 3: due Mar-23
- 📇 Dates: Mar 19-23
- 📎 Topics: At its heart, computing involves working with numbers. However, a considerable amount of information and data is in the form of text. Therefore, you also need to learn about character strings, and how to perform basic manipulation of strings. In parallel, we'll keep working on writing functions, with a particular focus on testing them.
- 📁 Notes:
- Environments (Advanced R by H. Wickham)
- Intro to testing functions (tutorial)
- Character strings in R (r4strings by Sanchez)
- Basic string manipulations (r4strings by Sanchez)
- 📖 Reading:
- testthat: Get started with testing (by Wickham)
- 🔬 Lab:
- 📰 Misc:
- chapter 14: Strings (R for Data Science by Grolemund and Wickham)
- 💡 Cheat sheet:
- 📇 Dates: Mar 26-30
- 🔋 (Re)charge your batteries!
- 📇 Dates: Apr 02-06
- 📎 Topics: To unleash the power of string manipulation, we need to take things to the next level and learn about regular expressions. Regular expressions are a tool that lets us describe a set of text strings by means of a "pattern". We'll describe the basic concepts of regex and the common operations to match text patterns.
- 📁 Notes:
- Regexpal tester tool.
- Introduction to regular expressions
- 📖 Reading:
- Handling Strings in R (by Sanchez)
- 🔬 Lab:
- 💡 Cheat sheet:
- 🎯 HW 4: due Apr-13
- 📇 Dates: Apr 09-13
- 📎 Topics: Random numbers have many applications in science and computer programming, especially when there are significant uncertainties in a phenomenon of interest. In this part of the course we'll look at some basic problems involving working with random numbers and creating simulations.
To better visualize the results of some simulations, we will briefly discuss Shiny apps. These apps are a nice companion to R, making it quick and simple to deliver interactive analysis and graphics in any web browser. We'll review how to create simple Shiny apps to display data summaries, queries, and interactive displays.
- 📁 Notes:
- Introduction to random numbers
- Coin toss shiny app
- shiny tutorial (by Grolemund)
- 📖 Reading:
- 🔬 Lab:
- 📰 Misc:
- 💡 Cheat sheet:
- 📇 Dates: Apr 16-20
- 📎 Topics: Packages are the fundamental units of reproducible R code. They include reusable functions, the documentation that describes how to use them, and sample data. In this part we'll start describing how to turn your code into an R package.
- 📁 Notes:
- Programming S3 Classes
- Pack YouR Code (by Sanchez)
- 📖 Reading:
- Package Structure (R packages by Wickham)
- See package components: http://r-pkgs.had.co.nz/ (R packages by Wickham)
- 🔬 Lab:
- 💡 Cheat sheet:
- 🎯 HW 5: due Apr-27
- 📇 Dates: Apr 23-27
- 📎 Topics: Creating an R package can seem overwhelming at first, so we'll keep working on the creation of a relatively basic package. This will give you the opportunity to apply most of the concepts seen in the course.
- 📁 Notes:
- Pack YouR Code (by Sanchez)
- 📖 Reading:
- See package components: http://r-pkgs.had.co.nz (R packages by Wickham)
- 🔬 Lab:
- TBA
- 💡 Cheat sheet:
- 📇 Dates: Apr 30-May 04
- 📎 Topics: Prepare for final examination
- 📁 Notes:
- No lecture. The instructor will hold office hours (in 309 Evans)
- 🎓 FINAL: Mon May 7, 8-11am, Dwinelle 145 and 155
- See announcement about the final test on bCourses