Add MAAP biomass datasets to the API (high-level steps) #76
Each of these data products was uploaded to the MAAP bucket by users; however, those "users" may have been David Minor acting with implicit approval from the Biomass Harmonization group. We should check that it's OK to duplicate these files in the nasa-maap-data-store bucket and the API:
These 2 products are provisional, so we are not going to publish them for now:
I'm assuming we don't need any of the others listed in https://github.com/MAAP-Project/biomass-dashboard-datasets/tree/main/datasets, because no one has asked for them.
Next steps
For NCEO Africa 2017
For GEDI Gridded Biomass L4B: these files are published from the ORNL DAAC, https://cmr.earthdata.nasa.gov/search/concepts/C2244602422-ORNL_CLOUD.html (see the CMR lookup sketch after this list)
CCI BIOMASS
ICESat-2 Boreal 2020: this will be updated soon, so we won't publish yet
NASA JPL 2020: this is a provisional product; we will not publish
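Since the GEDI L4B record is already published by the ORNL DAAC, a quick sanity check is to pull its collection record straight from CMR before duplicating anything ourselves. A minimal sketch, assuming the public concept endpoint supports the `.umm_json` extension:

```python
# Sketch: fetch the GEDI L4B collection record from CMR to confirm what the
# ORNL DAAC already publishes. Assumes the concept endpoint accepts .umm_json.
import requests

CONCEPT = "C2244602422-ORNL_CLOUD"  # concept id from the link above
url = f"https://cmr.earthdata.nasa.gov/search/concepts/{CONCEPT}.umm_json"

resp = requests.get(url, timeout=30)
resp.raise_for_status()
record = resp.json()
print(record.get("ShortName"), record.get("Version"), record.get("EntryTitle"))
```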
For each dataset, we will follow these steps:
Identify the dataset and its processing needs
Datasets in https://earthdata.nasa.gov/maap-biomass/ need to be published to the VEDA API so that consumers, such as the trilateral dashboard, can access them. We will also need these datasets in the NASA dashboard once we are ready to publish the biomass story.
These datasets are in https://github.com/MAAP-Project/biomass-dashboard-datasets/tree/main/datasets. Presumably we want to publish all of them to the staging API, but we should cross-check against the biomass story being told for the trilateral dashboard and focus on the ones we are aware of.
Datasets were generally uploaded to a landing zone location, but we will have to go through each one to identify its location in the MAAP buckets. I believe most of them are under the prefix below (a listing sketch follows):
s3://maap-landing-zone-gccops/user-added/uploaded_objects/
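A minimal sketch for inventorying that landing zone prefix, assuming read access to the bucket through standard AWS credentials:

```python
# Inventory the landing zone so we can map each dataset to its uploaded files.
# The bucket and prefix come from the issue; access is assumed, not confirmed.
import boto3

BUCKET = "maap-landing-zone-gccops"
PREFIX = "user-added/uploaded_objects/"

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        print(obj["Key"], obj["Size"])
```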
A question for the group is whether we want to copy the files to our "VEDA" bucket. Who should be able to access these files?
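If the group decides to duplicate the files, the copy itself is a one-liner per object. A hypothetical sketch; the destination bucket and both object keys are placeholders, not confirmed VEDA locations:

```python
# Hypothetical sketch of duplicating a landing-zone object into a VEDA-managed
# bucket. Destination bucket and keys are placeholders for illustration only.
import boto3

s3 = boto3.client("s3")
s3.copy(
    CopySource={
        "Bucket": "maap-landing-zone-gccops",
        "Key": "user-added/uploaded_objects/example.tif",  # placeholder key
    },
    Bucket="veda-data-store-staging",  # placeholder destination bucket
    Key="maap-biomass/example.tif",  # placeholder destination key
)
```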
Design the metadata and publish to the Dev API
Review conventions for generating STAC collection and item metadata:
After reviewing the STAC documentation for collections and items, and the existing scripts for generating collection metadata (generally with SQL) and item metadata, generate or reuse scripts for your collection and a few items to publish to the testing API. There is documentation, along with examples, for generating a pipeline or otherwise documenting your dataset workflow in https://github.com/NASA-IMPACT/cloud-optimized-data-pipelines. We would like to maintain the scripts folks are using to publish datasets in that repo so we can easily re-run those datasets' ingest-and-publish workflows if necessary.
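As a starting point, here is a minimal sketch of hand-building a STAC collection and one item with pystac; the IDs, extents, license, and asset href are illustrative placeholders, not the real NCEO Africa metadata:

```python
# Minimal pystac sketch: one collection plus one COG item. All identifiers,
# extents, and hrefs below are placeholders for illustration only.
from datetime import datetime, timezone
import pystac

collection = pystac.Collection(
    id="nceo-africa-2017",  # placeholder collection id
    description="NCEO Africa Aboveground Woody Biomass 2017 (placeholder)",
    extent=pystac.Extent(
        spatial=pystac.SpatialExtent([[-18.0, -35.0, 52.0, 38.0]]),
        temporal=pystac.TemporalExtent(
            [[datetime(2017, 1, 1, tzinfo=timezone.utc),
              datetime(2017, 12, 31, tzinfo=timezone.utc)]]
        ),
    ),
    license="CC-BY-4.0",  # placeholder license
)

item = pystac.Item(
    id="nceo-africa-2017-cog",  # placeholder item id
    geometry={
        "type": "Polygon",
        "coordinates": [[[-18.0, -35.0], [52.0, -35.0], [52.0, 38.0],
                         [-18.0, 38.0], [-18.0, -35.0]]],
    },
    bbox=[-18.0, -35.0, 52.0, 38.0],
    datetime=datetime(2017, 1, 1, tzinfo=timezone.utc),
    properties={},
)
item.add_asset(
    "cog",
    pystac.Asset(
        href="s3://nasa-maap-data-store/nceo-africa-2017.tif",  # placeholder
        media_type=pystac.MediaType.COG,
        roles=["data"],
    ),
)
collection.add_item(item)
print(collection.to_dict()["id"], item.to_dict()["id"])
```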
If necessary, request access and credentials to the dev database, then ingest and publish to the Dev API. Submit a PR with the manual or CDK scripts used to run the workflow that publishes to the Dev API, and include links to the published datasets in the Dev API.
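Publishing the generated metadata to the Dev API would then look roughly like the following; the base URL, the /collections ingest path, and bearer-token auth are all assumptions rather than a documented VEDA contract:

```python
# Hypothetical sketch of POSTing collection metadata to a STAC ingest endpoint.
# Base URL, path, and auth scheme are assumptions, not confirmed API details.
import json
import requests

DEV_API = "https://dev.veda.example.com"  # placeholder base URL
TOKEN = "..."  # credentials requested out of band; placeholder

with open("collection.json") as f:  # metadata generated in the previous step
    collection = json.load(f)

resp = requests.post(
    f"{DEV_API}/collections",  # assumed ingest path
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=collection,
)
resp.raise_for_status()
print("published with status", resp.status_code)
```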
Publish to the Staging API
Once the PR is approved, we can merge and publish those datasets to the Staging API.