
Daily Lessons

Outline of the topics/resources for each day of class.

Note: assignments will generally be posted in Canvas. Occasionally we'll use GitHub Classroom for code-related work.

Week 1

Day 1 - Course Intro and Demystifying Dot Notation

  • Course Intro Presentation
    • Defining "advanced"
  • Demystifying Dot Notation
    • High-level discussion of dot notation, Python classes, and OOP. This is the last installment in Python basics, and it builds on the foundation laid during fall and winter quarters in the core PADJ courses. Complete the Demystifying tutorial before Day 2, when there will be an in-class quiz on the material.
    • Complete the Elections OOP assignment by Friday. This assignment is on GitHub Classroom. See Canvas for detailed instructions.
  • Housekeeping (course resources, format, AI policy, etc.)
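The "dot notation" theme above boils down to attribute and method access on Python objects. A minimal sketch (the `Candidate` class and its fields are invented for illustration, not from the course materials):

```python
# Dot notation: reaching into an object for its attributes and methods.
class Candidate:
    def __init__(self, name, votes):
        self.name = name    # instance attribute, accessed via dot notation
        self.votes = votes

    def summary(self):      # method, also accessed via dot notation
        return f"{self.name}: {self.votes} votes"


c = Candidate("Ada Lovelace", 1234)
print(c.name)       # attribute access -> "Ada Lovelace"
print(c.summary())  # method call -> "Ada Lovelace: 1234 votes"
```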

Day 2 - OOP Wrap-up; In-class Quiz; Intro to APIs

Week 2

Day 3 - Web Scraping

Please clone the Stanford Data Journalism Notebooks repo.

You can clone with VS Code or from the command line:

```bash
git clone git@github.com:stanfordjournalism/data-journalism-notebooks.git
```

Once the repo is opened locally in VS Code, navigate to content/web_scraping/README.ipynb and work through all notebooks, in order. There will be a quiz on the material next class.

NOTE: If you're new to browser developer tools, level up using the materials for Chrome or Firefox (just pick one) in content/web_scraping/resources.ipynb.

Day 4 - Quiz and CLEAN Scrapers

  • Guest lecture: Katey Rusch, from the UC Berkeley Investigative Reporting Program, on the Community Law Enforcement Accountability Network (CLEAN)
  • Overview of clean-scraper code architecture and motivations
    • If you're still shaky on classes and functions, please review Data Journalism Notebook lessons on classes and OOP
  • Begin working on clean-scraper; open-source scraper contributions are the weekend assignment
    • Dissect the website and craft a scraping strategy. Add your proposed strategy to the GitHub issue for the site
    • Once your scraping strategy is approved, begin implementing the code on a fork of the clean-scraper repo, per the Contributor Guidelines
  • Homework:

Week 3

Day 5 - CLEAN Scraping

Guided tour of the clean-scraper code repository, including:

  • Code architecture:
    cli -> runner -> San Diego PD scraper -> cache.download
    
  • Code conventions:
    • scrape_meta stores file artifacts in cache and produces a JSON metadata file
    • scrape reads JSON metadata and downloads files (to cache)
  • Scraping at scale, with a paper trail (i.e., why the complexity?)
  • Contributor Guidelines
    • Claim an agency by filing a GitHub Issue
    • Dissect your website and add proposed scraping plan to GH Issue
  • Start writing your scraper
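The two-phase `scrape_meta`/`scrape` convention described above can be sketched roughly as follows. This is a simplified stand-in, not clean-scraper's actual code: the `Cache` class is a tiny local substitute for the project's cache, and the download step is stubbed so the example runs offline.

```python
# Sketch of the two-phase convention: scrape_meta records what to fetch
# (as JSON metadata in the cache); scrape reads that metadata and downloads.
import json
from pathlib import Path


class Cache:
    """Tiny stand-in for clean-scraper's cache: stores artifacts on disk."""
    def __init__(self, root="cache"):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def write(self, name, content):
        path = self.root / name
        path.write_text(content)
        return path


def scrape_meta(cache):
    """Phase 1: inventory the site's assets and emit a JSON metadata file."""
    # A real scraper would parse cached HTML pages for document links here.
    assets = [{"url": "https://example.com/report.pdf", "name": "report.pdf"}]
    return cache.write("site_meta.json", json.dumps(assets))


def scrape(cache, meta_path):
    """Phase 2: read the JSON metadata and download each asset to the cache."""
    assets = json.loads(Path(meta_path).read_text())
    downloaded = []
    for asset in assets:
        # Stubbed download; a real scraper would fetch asset["url"] here.
        downloaded.append(cache.write(asset["name"], f"placeholder for {asset['url']}"))
    return downloaded


cache = Cache()
meta = scrape_meta(cache)
files = scrape(cache, meta)
print([f.name for f in files])
```

Splitting the inventory step from the download step is what leaves the "paper trail": the JSON metadata documents exactly what was collected and from where, and either phase can be re-run independently.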

Day 6 - CLEAN Scraping continued

  • Finalize scraping plans (make sure to file an Issue for your scraper on GitHub)
  • Work on scrapers

Week 4

Day 7 - Data Viz Intro and clean-scraper check-ins

Day 8 - Hands-on Data Viz for Watchdog Reporting investigation

  • Guest visit by Cheryl Phillips to discuss a project in her Watchdog Reporting class.
  • Finalizing open-source contributions to clean-scraper
  • Hands-on work (pot luck):
    • Wrap up Viz Curriculum if you haven't done so
    • Work on data viz for your own projects/interests
    • Begin the data viz assignment for Watchdog Reporting (see Canvas)
    • Wrap up clean-scraper contribution

Week 5

Day 9 - Data Dashboard Design & Streamlit

Day 10 - Sports Charity Analysis Hands-on

Hands-on work for sports charities analysis.

Week 6

Day 11 - Sports Analysis continued

Hands-on work for sports charities analysis, and closing the loop on clean-scraper.

Day 12 - Sports Charity Dashboards