This repository contains scripts that collect, process, and upload climate change stories to the stories microservice. The scripts depend on two kinds of CSV files: the files in the "/split_place_name_id_csvs" folder, which determine the places that will be searched, and "strategy_sector_solution.csv", which contains the 205 climate change solutions that drive the scraper.
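The exact columns of the driver CSV aren't documented here; assuming it has strategy, sector, and solution columns (inferred from the filename, with an invented sample row), loading the solution list might look like this sketch:

```python
import csv
import io

# Stand-in for strategy_sector_solution.csv; the row content is illustrative only.
sample = "strategy,sector,solution\nLand Sinks,Agriculture,Tree Plantations\n"

# Pull out the solution names that would drive the scraper's searches.
solutions = [row["solution"] for row in csv.DictReader(io.StringIO(sample))]
print(solutions)
```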
This script depends on the following (install before running):
Python 3
External Libraries: google, webpreview, bson, pymongo
pip install google
pip install webpreview
pip install bson
pip install pymongo
python climate_tree_scraper.py your_csv.csv
The input csv must have the header place,id, as in the files supplied in the "/split_place_name_id_csvs" folder.
Output files will be named placeid_storynumber.json in the output folder, which is created automatically. Expect about 5 seconds of runtime per story.
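The input format and output naming can be sketched as follows (the place,id header comes from this README; the place IDs in the sample are made up for illustration):

```python
import csv
import io

# Two sample rows in the place,id format used by the split CSVs (IDs invented).
sample = "place,id\nNew York,5128581\nLos Angeles,5368361\n"
rows = list(csv.DictReader(io.StringIO(sample)))

def output_name(place_id, story_number):
    """Build the placeid_storynumber.json output filename."""
    return f"{place_id}_{story_number}.json"

for row in rows:
    print(row["place"], "->", output_name(row["id"], 0))
```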
python filter_and_combine_stories.py
This filters out bad stories and combines duplicates, placing the results in the /filtered_stories folder, which is created if it doesn't exist. Once it finishes, run upload_stories.py; see the note below about the database connection.
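The filtering logic lives inside filter_and_combine_stories.py; one common way to combine duplicates (a sketch under assumed field names "url" and "place_id", not necessarily the script's actual approach) is to key stories by URL and merge the place IDs they were scraped under:

```python
def combine_duplicates(stories):
    """Merge stories sharing a URL, collecting every place ID they appeared under.

    `stories` is a list of dicts with at least "url" and "place_id" keys
    (hypothetical field names used for illustration).
    """
    combined = {}
    for story in stories:
        key = story["url"]
        if key in combined:
            # Duplicate: keep one copy, remember the extra place it was found for.
            combined[key]["place_ids"].append(story["place_id"])
        else:
            merged = dict(story)
            merged["place_ids"] = [merged.pop("place_id")]
            combined[key] = merged
    return list(combined.values())
```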
python upload_stories.py
This posts each story in the stories.json file produced by filter_and_combine_stories.py to the stories microservice.
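As a sketch, the upload loop might look like the following, assuming the microservice accepts JSON over HTTP POST (the endpoint URL is a placeholder; since the README lists pymongo, the real script may instead insert directly into MongoDB):

```python
import json
import urllib.request

def prepare_payload(story):
    """Serialize one story dict as UTF-8 JSON for the request body."""
    return json.dumps(story).encode("utf-8")

def upload_all(stories, post=None):
    """POST each story; `post` is injectable so the loop can be tested offline."""
    if post is None:
        def post(data):
            req = urllib.request.Request(
                "https://example.com/stories",  # placeholder endpoint, not the real service
                data=data,
                headers={"Content-Type": "application/json"},
            )
            return urllib.request.urlopen(req)
    for story in stories:
        post(prepare_payload(story))
```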
Each file contains 50 rows and is named place_name_id_n.csv, where n indicates the file number; lower numbers correspond to more populous places.
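Given that naming scheme, the split files can be processed most-populous-first by sorting on the numeric suffix rather than lexicographically (so file 10 does not sort before file 2); a small sketch:

```python
from pathlib import Path

def split_files_in_order(folder):
    """Return place_name_id_*.csv paths ordered by file number, lowest first."""
    files = Path(folder).glob("place_name_id_*.csv")
    # The stem ends in the file number, e.g. "place_name_id_10" -> 10.
    return sorted(files, key=lambda p: int(p.stem.rsplit("_", 1)[1]))
```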
The database URL has been removed for security reasons. Update it at the top of upload_stories.py to connect to your database before uploading stories, or the upload will fail.
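The connection setup at the top of upload_stories.py presumably resembles this fragment (the variable name and URL shape are illustrative, not copied from the script):

```python
from pymongo import MongoClient

DB_URL = "mongodb://USER:PASSWORD@HOST:27017/"  # replace with your database URL
client = MongoClient(DB_URL)  # MongoClient defers connecting until the first operation
```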