inventory.data.gov
inventory.data.gov, a.k.a. Inventory, is used by federal agencies to manage metadata for their datasets. Inventory generates the agency's data.json, which must be hosted on the agency's website (e.g. agency.gov/data.json). Inventory is a CKAN instance and can host datasets in addition to metadata.
Instance | URL |
---|---|
Production | inventory.data.gov |
Staging | inventory-datagov.dev-ocsit.bsp.gsa.gov |
CI | inventory.ci.datagov.us |
Sub-components:
- ckan
- datapusher
Services:
- apache2
- rds
- redis
- s3
- solr
Logs:
- /var/log/inventory/ckan.custom.log
- /var/log/inventory/ckan.error.log
- /var/log/inventory/datapusher.custom.log
- /var/log/inventory/datapusher.error.log
ckanpyimport is used in onboarding new agencies to inventory.data.gov. This tool imports datasets from a data.json file.
The import script does not check for existing datasets and will happily create duplicates, so if the organization already contains any datasets, you should probably delete them all first.
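One way to check whether an organization already has datasets before importing is to ask the CKAN action API for a count. The sketch below is a minimal example, not part of the import tool: it uses CKAN's `package_search` action with `rows=0` so only the count comes back. The function names, the base URL, and the organization name `example-agency` are hypothetical. Anonymous requests may not see private datasets, so a count of zero from this sketch is not proof that the organization is empty.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def org_dataset_count_url(base_url, org_name):
    """Build a package_search URL that only asks for the match count."""
    # rows=0 returns no dataset bodies, only the total in result.count.
    query = urlencode({"fq": f"organization:{org_name}", "rows": 0})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{query}"


def parse_dataset_count(response_body):
    """Extract result.count from a package_search JSON response."""
    payload = json.loads(response_body)
    if not payload.get("success"):
        raise RuntimeError(f"CKAN API call failed: {payload.get('error')}")
    return payload["result"]["count"]


def count_org_datasets(base_url, org_name, fetch=None):
    """Return how many datasets CKAN reports for the organization.

    `fetch` is a hypothetical hook (url -> response text) so the function
    can be exercised without network access; by default it uses urlopen.
    """
    if fetch is None:
        def fetch(url):
            with urlopen(url, timeout=30) as resp:
                return resp.read().decode("utf-8")
    return parse_dataset_count(fetch(org_dataset_count_url(base_url, org_name)))
```

Calling `count_org_datasets("https://<inventory host>", "example-agency")` against staging first is a cheap sanity check; a non-zero result suggests the organization should be cleaned up before the import.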
Run this from the jumpbox using `nohup` or `tmux` so that disconnecting your session does not interrupt the script. The script can take a while depending on how many packages need to be imported (roughly 2 hours for 1,000 datasets). You should also test against staging before running against production.
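To get a rough idea of how long an import will run, the dataset count in the data.json file can be scaled by the rate quoted above. This is a minimal sketch under two assumptions: the file follows the Project Open Data v1.1 layout with a top-level `dataset` array, and import time grows linearly with dataset count. The helper names are hypothetical and the result is only a ballpark figure.

```python
import json

# Rough rate taken from the note above: about 2 hours per 1000 datasets.
HOURS_PER_1000_DATASETS = 2.0


def count_datasets(data_json_text):
    """Count entries in the top-level "dataset" array of a data.json document."""
    catalog = json.loads(data_json_text)
    return len(catalog.get("dataset", []))


def estimate_import_hours(dataset_count, hours_per_1000=HOURS_PER_1000_DATASETS):
    """Linear estimate of import time in hours (an assumption, not a measurement)."""
    return dataset_count / 1000 * hours_per_1000


if __name__ == "__main__":
    # Tiny inline sample standing in for an agency's data.json.
    sample = '{"dataset": [{"identifier": "a"}, {"identifier": "b"}]}'
    n = count_datasets(sample)
    print(f"{n} datasets, roughly {estimate_import_hours(n):.3f} hours")
```

If the estimate runs to several hours, that is a good reason to start the import inside `tmux` or under `nohup` rather than in a plain SSH session.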