A scraper for one of Sweden's premier legal to scrape PDF URLs and metadata from this website https://www.domstol.se/hogsta-domstolen/avgoranden/.
- Python 3.11
- ChromeDriver for selenium
- Required Python packages (requirements.txt)
- Create a .env file with the structure of the .env.template
- Run make venv to install python packages
- Run python run_dev.py to run the script
- The scraper navigates to the specified URL, applies filters, and extracts PDF URLs from the page source.
- It then fetches metadata for each PDF and saves it to metadata.json and metadata.csv.
- Latest 10 PDFs are downloaded to the specified directory.