HathiTrust |
|
|
|
|
|
sections.csv |
WIP |
Download URLs of the report sections on HathiTrust |
|
|
stations.csv |
WIP |
Contains info on where stations appear in the summary reports |
Created manually from RAMM station data (see RAMM/stations.csv) |
|
/<section_name>/images/*.jpg |
|
Images of scanned pages. This folder is empty and its contents are not committed; download the files from HathiTrust and put the image files in this folder for some of the workflow scripts to work. |
Downloaded from https://catalog.hathitrust.org/Record/001473257 |
|
/<section_name>/texts/*.txt |
|
OCRed text of reports, saved by page |
Downloaded from https://catalog.hathitrust.org/Record/001473257 |
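Because the image folders are not committed, it can help to check which sections still need their scans before running the workflow scripts. The following is a minimal sketch, not part of the repository: the function name `sections_missing_images` is hypothetical, and it assumes every subfolder of the HathiTrust directory is a `<section_name>` folder laid out as described above.

```python
from pathlib import Path


def sections_missing_images(hathitrust_dir):
    """Return the names of section folders that contain no images/*.jpg files.

    Assumption: each subfolder of hathitrust_dir is a <section_name> folder.
    """
    root = Path(hathitrust_dir)
    missing = []
    for section in sorted(p for p in root.iterdir() if p.is_dir()):
        images = section / "images"
        # A missing images/ folder is treated the same as an empty one.
        if not images.is_dir() or not any(images.glob("*.jpg")):
            missing.append(section.name)
    return missing
```

Calling `sections_missing_images("HathiTrust")` from the data root would list the sections whose page images still have to be downloaded.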
Oceans1876
|
|
|
|
|
data_source.json |
WIP |
A list of data sources used by Global Names and their properties |
|
|
index_species_errors.json |
WIP |
List of errors encountered while extracting species from the summary reports index (created by the get_all_species.py script in challenger-workflows) |
|
|
index_species_status.json |
WIP |
Status of each species in the WoRMS dataset (created by the get_species_status.py script in challenger-workflows) |
|
|
index_species_verified.json |
WIP |
List of species extracted from the summary reports index and verified by Global Names (created by the get_all_species.py script in challenger-workflows) |
|
|
index_species.json |
WIP |
List of species extracted from the summary reports index (created by the get_all_species.py script in challenger-workflows) |
|
|
species.json |
WIP |
Contains all the species found in the reports, extracted by gnames tools |
Output of workflows/update_stations.py; uses RAMM/stations.csv and HathiTrust/stations.csv |
|
stations.json |
WIP |
Contains station environmental and species info |
Output of workflows/update_stations.py; uses RAMM/stations.csv and HathiTrust/stations.csv |
|
uri_template.json |
WIP |
A mapping of data source IDs used by Global Names to their URLs |
|
Oceans1876_subset |
|
|
|
|
|
species.json |
Done |
Contains the species found in a subsample of stations (N_stations=15) drawn from Oceans1876/stations.json, used as test data |
Output of workflows/create_test_data.py; uses Oceans1876/stations.json |
|
stations.json |
Done |
Contains a subsample (N_stations=15) of stations with their environmental and species info, used as test data |
Output of workflows/create_test_data.py; uses Oceans1876/stations.json |
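The subsampling described above can be illustrated with a short sketch. This is not the code in workflows/create_test_data.py: the function name `subsample_stations`, the `seed` parameter, and the assumption that the stations are held as a list of records are all hypothetical, chosen only to show how a reproducible subset of 15 stations could be drawn.

```python
import random


def subsample_stations(stations, n_stations=15, seed=0):
    """Return up to n_stations randomly chosen records from a list of stations.

    Assumption: stations is a list (for example, parsed from a JSON array).
    A fixed seed keeps the test data reproducible between runs.
    """
    if len(stations) <= n_stations:
        # Nothing to sample: return a copy of everything.
        return list(stations)
    rng = random.Random(seed)
    return rng.sample(stations, n_stations)
```

A matching species subset would then keep only the species recorded at the selected stations.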
RAMM |
|
|
|
|
|
stations.csv |
Needs review |
Contains station environmental info |
Downloaded from https://www.hmschallenger.net/the-voyage/the-route/ |