- Crafting regular expressions
- Scraping newspaper headlines
https://www.theguardian.com/international
The total number of points you can earn is given in the assignment description. Based on the points you earn, your grade will be rescaled to a 0-100 point scale.
Grades will be based on the following criteria:
- Timely submission. For each day a homework assignment or the final data science project is turned in late, the grade will be reduced by 10% (e.g., a submission two days after the deadline results in a 20% grade deduction).
- The submission of a properly knitted solution. We will grade your assignment on the basis of the knitted HTML (not PDF!). If you fail to provide the HTML, we will fall back to the Rmd. In that case, any output that is not visible in the script (e.g., printed R output that is only rendered in the knitted document) will not be regarded as a submitted solution. We will not knit the Rmd on our devices.
- The accuracy of your solutions. You should clearly indicate what you submit as your solution for every single task. Code that produces a solution is not sufficient. The result itself (e.g., a computed mean or a plot) has to be shown too. If the solution is immediately obvious from the output, showing the output is sufficient (e.g., when the task asks for the calculation of a mean, printing the mean is fine; you don't have to add a verbal statement such as "The average weight for dogs is 5 kg.").
- The adherence to a clean and efficient coding style. Write the code with other users/readers in mind. Adhere to the guidelines provided in the tidyverse sessions. That being said, multiple ways lead to the right solution, and you will not be penalized for using base R or deviating from the coding style guidelines as long as you consistently apply a style that serves the purpose of making your code accessible and efficient.
- Proper layout of your output. Figures need meaningful descriptions. Axes need meaningful labels. Colors should serve a purpose. Tables need meaningful headers. Round wisely: nobody needs more than three decimal places, and often fewer are sufficient. Every task deserves its own space. Your comments and explanations should be clearly distinguishable from other content of your script.
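
As an illustration of the layout criterion above, here is a minimal sketch of a labeled figure and a rounded table. The data frame `dogs`, its breeds, and its values are made up for this example and are not part of the assignment; the sketch assumes the `ggplot2` and `knitr` packages are installed.

```r
library(ggplot2)

# Hypothetical data for illustration only (not assignment data)
dogs <- data.frame(
  breed = c("Beagle", "Pug", "Corgi"),
  mean_weight_kg = c(10.23456, 7.34567, 11.98765)
)

# Figure with a descriptive title and meaningful axis labels
p <- ggplot(dogs, aes(x = breed, y = mean_weight_kg)) +
  geom_col() +
  labs(
    title = "Average weight by breed (illustrative data)",
    x = "Breed",
    y = "Average weight (kg)"
  )
p

# Table with meaningful headers; values are rounded for display only,
# the underlying data keeps its full precision
knitr::kable(
  dogs,
  col.names = c("Breed", "Average weight (kg)"),
  digits = 1
)
```

Rounding inside `kable()` rather than in the data itself keeps later computations exact while the printed output stays readable.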