STAC Collection Creation Conventions (Dashboard Specific) #29
Comments
@abarciauskas-bgse @jvntf @slesaad Can you weigh in on these tickets for STAC metadata conventions in regards to the data ingests we are doing and point out anything that should be adjusted or added? We are definitely going to need to make adjustments for datetimes (start/end vs nominal datetime), anything else? |
This is really great @anayeaye your table and examples are 💯 Questions about naming of some of the fields:
Additional questions about values in
Is this a valid example of the conventions you are proposing:
{
  "id": "OMSO2PCA",
  "type": "Collection",
  "links": [],
  "title": "OMSO2PCA",
  "extent": {
    "spatial": {
      "bbox": [
        [-180, -90, 180, 90]
      ]
    },
    "temporal": {
      "interval": [
        ["2005-01-01T00:00:00Z", "2021-01-01T00:00:00Z"]
      ]
    }
  },
  "license": "MIT",
  "description": "OMI/Aura Sulfur Dioxide (SO2) Total Column L3 1 day Best Pixel in 0.25 degree x 0.25 degree V3",
  "stac_version": "1.0.0",
  "summaries": {
    "datetime": [
      "2005-01-01T00:00:00Z",
      "2021-01-01T00:00:00Z"
    ],
    "cog_default": {
      "avg": 287.90577560637,
      "max": 478.89999389648,
      "min": 51
    }
  },
  "properties": {
    "dashboard:is_periodic": true,
    "dashboard:time_density": "year"
  }
} |
@abarciauskas-bgse thanks for digging into this! To unblock UI development we did just settle on a few solutions that we could commit to deliver for the dashboard UI. I don't know that it is too late to make changes, but at this point it will impact the front end so we'd have to coordinate to not break anything.
We did discuss other keys but since this dashboard extension is purely for the front end, UI got final vote on preferences. --
Yeah, it is for rescaling parameters. This is an incremental solution that only supports simple products with single band COG assets. I think it needs to be somewhat specific but other asset keys might fit better (even just
--
We have a user-defined function for pgstac--the evolution of the function is issue #31 and I am working on adding that function to our deployment in [PR 34](#43). It is not a perfect solution but the goal is to make a simple function call that could be the terminal step in an ingest pipeline (maybe a fan-in to a single pgstac function call that will dynamically create the summary). We're also creating an update-all method that will update any collection that has the necessary dashboard metadata attributes, which we might want to schedule to run regularly. It would be preferable if we could identify trigger events to run when needed for a given collection. The latest iteration nixes the average because the way it is derived is not useful (min of mins is a valid metric; average of means is less so). --
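To illustrate the point about the average, here is a minimal Python sketch. It is not the pgstac function from the thread; the per-item statistics, the `count` field, and the function names are hypothetical. It shows that the minimum of per-item minimums (and maximum of maximums) is exact for the whole collection, while a plain average of per-item means is off whenever items contain different numbers of pixels.

```python
# Hypothetical per-item statistics; the values and the "count" field are
# made up for illustration and do not come from the thread.
item_stats = [
    {"count": 100, "min": 51.0, "max": 300.0, "mean": 200.0},
    {"count": 900, "min": 80.0, "max": 478.9, "mean": 300.0},
]


def min_max_summary(stats):
    """Min of mins and max of maxes are exact for the combined data."""
    return {
        "min": min(s["min"] for s in stats),
        "max": max(s["max"] for s in stats),
    }


def average_of_means(stats):
    """Unweighted average of per-item means (the metric that was dropped)."""
    return sum(s["mean"] for s in stats) / len(stats)


def weighted_mean(stats):
    """True mean of the combined data; it needs per-item pixel counts."""
    total = sum(s["count"] for s in stats)
    return sum(s["mean"] * s["count"] for s in stats) / total


print(min_max_summary(item_stats))
print(average_of_means(item_stats), weighted_mean(item_stats))
```

With these made-up numbers the unweighted average is 250 while the true mean is 290, which is why an average summary would only be meaningful if pixel counts were carried along.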
For now we have committed to maintaining this information in one place for the dashboard UI. But we intend to make the creating and updating of summaries hands-off. --
Yes, but we will have a function to automatically generate the summaries for all of our non-spectral datasets if there is an item_assets property on the collection. Totally open to discussing that, but for now the SQL routine looks at the item_assets property to decide whether or not to create a cog_default summary; if not, it will only create a datetime summary. And one nit: the license should be one of the predefined SPDX licenses because STAC browsers will link to the SPDX license on an id lookup. But this does not impact any of our features so it's the kind of thing we'll probably want to circle back on when we have easier ways to edit the metadata. |
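The decision described above can be sketched in Python as follows. The actual routine is SQL inside pgstac and is not shown in this thread, so the function name `summaries_to_create` is hypothetical, and treating an empty `item_assets` object the same as a missing one is an assumption.

```python
def summaries_to_create(collection: dict) -> list:
    """Return the names of the summaries that would be generated.

    Hypothetical helper: a datetime summary is always created, and a
    cog_default summary is added only when the collection has an
    item_assets property. An empty item_assets is treated as absent
    (an assumption, not confirmed by the thread).
    """
    names = ["datetime"]
    if collection.get("item_assets"):
        names.append("cog_default")
    return names


with_assets = {
    "id": "example",
    "item_assets": {"cog_default": {"type": "image/tiff"}},
}
without_assets = {"id": "example"}

print(summaries_to_create(with_assets))
print(summaries_to_create(without_assets))
```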
Thanks so much for all these detailed responses and apologies for my belated comments and that you may have had to repeat any information I should have been aware of. You obviously have thought through this solution comprehensively and developed some really cool functionality. I'm more than happy to implement the conventions as defined above and seek your review of all the metadata moving forward 🙇🏽♀️ |
I also updated https://j2wlly6xg8.execute-api.us-east-1.amazonaws.com/collections/OMSO2PCA and the example above with the "MIT" as the license, given that's what you used for the other datasets. |
Decision about id's and titles:
@anayeaye I'm inclined to keep title and ID the same but do you know if there is a good use case where they should be different? Like dataset landing pages where the title might be a more descriptive name? |
|
@anayeaye a small nit, should we add |
I think my takeaways with respect to
@anayeaye what do you think with that summary ⬆️ |
Question on summaries: if we're going to implement summaries in a terminal step, should we be adding them at all right now while we are creating the collections? I'm going to leave summaries out for now, assuming we can run the summaries function after ingest. |
One more thought about naming: s3 data directories should match the ids of the datasets. It would be nice to enforce this in the future but for now just stating it for the group. |
@jvntf - totally missed hour. Definitely adding that in the edit. |
Creative Commons Zero licensing recommendation. Still true: choose an SPDX license id or use Snippets/links from discussion with about choosing a license
|
Thanks for recording this here, @anayeaye. I take it from our discussion on Slack that the decision was made in favor of CC0-1.0. Where do we need to document this? |
@j08lue probably edit the first post in this thread and specify it there |
Done. |
These docs are now published at https://nasa-impact.github.io/veda-docs/contributing/dataset-ingestion/stac-collection-conventions.html Perhaps this issue can now be closed, and in the future we can maintain this information on the docs site? |
I'm going to close and lock this issue. |
Dashboard-specific notes that supplement the full stac-api collection specification. Note that there is no schema enforcement on the collection table content in pgstac—this provides flexibility but also requires caution when creating and modifying Collections.
Collection field, extension, and naming recommendations
dashboard:is_periodic: True/False. This boolean is used when summarizing the collection: if the collection is periodic, the temporal range of the items in the collection and the time density are all the front end needs to generate a time picker. If the items in the collection are not periodic, a complete list of the unique item datetimes is needed.
dashboard:time_density: year, month, day, hour, minute, or null. These time steps should be treated as an enum when the extension is formalized. For collections with a single time snapshot this value is null.
Naming examples: omi-trno2-dhrm and omi-trno2-dhrm-difference vs no2-monthly and no2-monthly-diff; bmhd-30m-monthly vs nightlights-hd-monthly
License: CC0-1.0
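Because pgstac does not enforce a schema on collection content, a small pre-ingest check can catch mistakes in the dashboard fields. The following is a minimal sketch, not part of any existing tooling: the function name `check_dashboard_fields` is hypothetical, it takes whichever mapping holds the dashboard keys (the example earlier in the thread places them under "properties"), and it treats a missing time density the same as null, which is an assumption.

```python
# Allowed time steps from the recommendations above; None stands for JSON null.
ALLOWED_TIME_DENSITY = {"year", "month", "day", "hour", "minute", None}


def check_dashboard_fields(fields: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []

    is_periodic = fields.get("dashboard:is_periodic")
    # Must be a real boolean, not a string such as "true".
    if not isinstance(is_periodic, bool):
        problems.append("dashboard:is_periodic must be true or false")

    # A missing key is treated like null here (an assumption).
    time_density = fields.get("dashboard:time_density")
    if time_density not in ALLOWED_TIME_DENSITY:
        problems.append(
            "dashboard:time_density must be year, month, day, hour, "
            "minute, or null"
        )

    return problems


print(check_dashboard_fields(
    {"dashboard:is_periodic": True, "dashboard:time_density": "year"}
))
print(check_dashboard_fields(
    {"dashboard:is_periodic": "true", "dashboard:time_density": "weekly"}
))
```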
item_assets example
summaries example for periodic collection
summaries example for non-periodic collection
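The original example bodies for these headings are not reproduced here. As a rough illustration of the difference described above, this Python sketch builds the datetime summary both ways: a [start, end] range for a periodic collection and the full list of unique item datetimes otherwise. The function name `datetime_summary` is hypothetical, and it assumes all datetimes are UTC strings in the same ISO 8601 format so that string sorting matches chronological order.

```python
def datetime_summary(item_datetimes: list, is_periodic: bool) -> list:
    """Build a datetime summary from item datetime strings.

    Assumes uniformly formatted UTC ISO 8601 strings, so lexicographic
    order equals chronological order.
    """
    unique_sorted = sorted(set(item_datetimes))
    if not unique_sorted:
        return []
    if is_periodic:
        # Periodic: the range is enough; the front end fills in the steps
        # using the time density.
        return [unique_sorted[0], unique_sorted[-1]]
    # Non-periodic: every unique datetime is needed.
    return unique_sorted


items = [
    "2005-01-01T00:00:00Z",
    "2012-06-15T00:00:00Z",
    "2012-06-15T00:00:00Z",
    "2021-01-01T00:00:00Z",
]

print(datetime_summary(items, is_periodic=True))
print(datetime_summary(items, is_periodic=False))
```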