This repository contains scripts that collect, process, and upload climate change stories to the stories microservice. The scripts depend on two kinds of CSV files: the files in the "/split_place_name_id_csvs" folder, which determine the places that will be searched, and "strategy_sector_solution.csv", which contains the 205 climate change solutions that drive the scraper.
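The exact columns of the driver CSV aren't documented here; assuming it has strategy, sector, and solution columns (inferred from the filename, with an invented sample row), loading the solution list might look like this sketch:

```python
import csv
import io

# Stand-in for strategy_sector_solution.csv; the row content is illustrative only.
sample = "strategy,sector,solution\nLand Sinks,Agriculture,Tree Plantations\n"

# Pull out the solution names that would drive the scraper's searches.
solutions = [row["solution"] for row in csv.DictReader(io.StringIO(sample))]
print(solutions)
```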
This script depends on the following (install before running):
Python 3
External Libraries: google, webpreview, bson, pymongo
pip install google
pip install webpreview
pip install bson
pip install pymongo
python climate_tree_scraper.py your_csv.csv
The input csv must have the header place,id, as in the files supplied in the "/split_place_name_id_csvs" folder.
Output files will be named placeid_storynumber.json in the output folder, which is created automatically. Expect about 5 seconds of runtime per story.
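The input format and output naming can be sketched as follows (the place,id header comes from this README; the place IDs in the sample are made up for illustration):

```python
import csv
import io

# Two sample rows in the place,id format used by the split CSVs (IDs invented).
sample = "place,id\nNew York,5128581\nLos Angeles,5368361\n"
rows = list(csv.DictReader(io.StringIO(sample)))

def output_name(place_id, story_number):
    """Build the placeid_storynumber.json output filename."""
    return f"{place_id}_{story_number}.json"

for row in rows:
    print(row["place"], "->", output_name(row["id"], 0))
```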
python filter_and_combine_stories.py
This filters out bad stories and combines duplicates, placing the results in the /filtered_stories folder, which is created if it doesn't exist. Once it finishes, run upload_stories.py; see the note below about the database connection.
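The filtering logic lives inside filter_and_combine_stories.py; one common way to combine duplicates (a sketch under assumed field names "url" and "place_id", not necessarily the script's actual approach) is to key stories by URL and merge the place IDs they were scraped under:

```python
def combine_duplicates(stories):
    """Merge stories sharing a URL, collecting every place ID they appeared under.

    `stories` is a list of dicts with at least "url" and "place_id" keys
    (hypothetical field names used for illustration).
    """
    combined = {}
    for story in stories:
        key = story["url"]
        if key in combined:
            # Duplicate: keep one copy, remember the extra place it was found for.
            combined[key]["place_ids"].append(story["place_id"])
        else:
            merged = dict(story)
            merged["place_ids"] = [merged.pop("place_id")]
            combined[key] = merged
    return list(combined.values())
```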
python upload_stories.py
This posts each story in the stories.json file produced by filter_and_combine_stories.py to the stories microservice.
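As a sketch, the upload loop might look like the following, assuming the microservice accepts JSON over HTTP POST (the endpoint URL is a placeholder; since the README lists pymongo, the real script may instead insert directly into MongoDB):

```python
import json
import urllib.request

def prepare_payload(story):
    """Serialize one story dict as UTF-8 JSON for the request body."""
    return json.dumps(story).encode("utf-8")

def upload_all(stories, post=None):
    """POST each story; `post` is injectable so the loop can be tested offline."""
    if post is None:
        def post(data):
            req = urllib.request.Request(
                "https://example.com/stories",  # placeholder endpoint, not the real service
                data=data,
                headers={"Content-Type": "application/json"},
            )
            return urllib.request.urlopen(req)
    for story in stories:
        post(prepare_payload(story))
```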
Each file contains 50 rows and is named place_name_id_n.csv, where n indicates the file number; lower numbers correspond to more populous places.
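Given that naming scheme, the split files can be processed most-populous-first by sorting on the numeric suffix rather than lexicographically (so file 10 does not sort before file 2); a small sketch:

```python
from pathlib import Path

def split_files_in_order(folder):
    """Return place_name_id_*.csv paths ordered by file number, lowest first."""
    files = Path(folder).glob("place_name_id_*.csv")
    # The stem ends in the file number, e.g. "place_name_id_10" -> 10.
    return sorted(files, key=lambda p: int(p.stem.rsplit("_", 1)[1]))
```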
The database URL has been removed for security reasons. Update it at the top of upload_stories.py to connect to your database before uploading stories, or the upload will fail.
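The connection setup at the top of upload_stories.py presumably resembles this fragment (the variable name and URL shape are illustrative, not copied from the script):

```python
from pymongo import MongoClient

DB_URL = "mongodb://USER:PASSWORD@HOST:27017/"  # replace with your database URL
client = MongoClient(DB_URL)  # MongoClient defers connecting until the first operation
```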