    Repositories list

    • Code and data for the research team scraping charter websites using scrapy, requests, Selenium, and wget with Python, shell, and Docker. This is the foundation of analyses into charter schools' linguistic strategies and social implications.
      Python
      Updated Feb 8, 2023
    • Code for scraping obituaries from Legacy.com. Three steps: scrape URLs and paragraphs, then extract age, sex, and race.
      Jupyter Notebook
      GNU General Public License v3.0
      Updated Dec 9, 2022
    • Code for a universal web-crawling UI.
      JavaScript
      Updated Jul 31, 2022
    • An introduction to web-crawling/scraping for beginners with some Python know-how. Created for IC2S2 Summer 2022 by Jaren Haber, PhD
      Jupyter Notebook
      GNU General Public License v3.0
      Updated Jul 20, 2022
    • Arrays of school-level spending across student poverty/disadvantage for Edunomics Lab. Covers DC and possibly other states/districts.
      Jupyter Notebook
      MIT License
      Updated Apr 4, 2022
    • Code and data for the research team that does text analysis: word counts, word embeddings, topic models, HTML parsing, unsupervised clustering, etc.
      Jupyter Notebook
      Updated Oct 28, 2021
    • Code for managing large data sets in Python, usually with Pandas. These scripts mostly merge, filter, inspect, and count records. Developed for a charter school database of 10K+ units based on web-crawling and federal data sources (CCD, ACS, etc.).
      Jupyter Notebook
      Updated Apr 20, 2021
    • Replication code for the research article "Sorting Schools: A Computational Analysis of Charter School Identities and Stratification" by Jaren Haber, UC Berkeley. The paper investigates the relationships between charter school and school district poverty and race, on one hand, and school ideology and academic performance, on the other.
      Jupyter Notebook
      MIT License
      Updated Apr 6, 2021
    • Code that examines geographic patterns in charter school proliferation, size, performance, and especially ideology within race- and class-structured school districts and Census tracts. Key packages include matplotlib, folium, and geoplotlib.
      HTML
      MIT License
      Updated Apr 16, 2019
    • This Scrapy project uses Redis and Kafka to create a distributed, on-demand scraping cluster.
      Python
      MIT License
      Updated Dec 3, 2018