This repository has been archived by the owner on Sep 26, 2023. It is now read-only.

Add new VEDA dataset: Global TWS non-stationarity index #224

Closed
5 of 9 tasks
dfelikson opened this issue Nov 9, 2022 · 20 comments · Fixed by #240

@dfelikson

dfelikson commented Nov 9, 2022

  • Identify the point of contact and ensure someone is providing them updates: @dfelikson and @j08lue

  • Identify data location s3://veda-data-store-staging/EIS/Global_TWS_data/DATWS_nonstationarity_index_v2.cog.tif

  • Number of items: 1

  • Verify data is valid COG

  • Gather STAC collection metadata

    • id: lis-tws-nonstationarity-index
    • title: Global TWS non-stationarity index
    • description: The global Terrestrial Water Storage (TWS) non-stationarity index integrates the trend, seasonal shifts, and variability change of TWS for the period of 2003 - 2020. TWS is derived by jointly assimilating the MODIS Leaf Area Index, the ESACCI surface soil moisture, and the GSFC GRACE mascon-based TWS anomalies into the Noah-MP land surface model within the NASA Land Information System (LIS) at 10 km spatial resolution forced by the combination of MERRA2 and IMERG meteorological fields. The smaller the non-stationarity index is, the more the water cycle is under a non-stationary process. Glaciers and Greenland are excluded from the analysis.
    • license: Creative Commons Zero (CC0-1.0)
    • temporal interval: 2003 - 2020
    • whether it is periodic on the dashboard (periodic = regular time series of layers without gaps): false
    • the dashboard time density: none
  • Review and follow https://github.com/NASA-IMPACT/cloud-optimized-data-pipelines/blob/main/OPERATING.md

  • Open PR for publishing those datasets to the Staging API:

  • Notify QA / move ticket to QA state

  • Once approved, merge and close.
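The metadata gathered in the checklist above could be assembled into a STAC Collection roughly as follows. This is a minimal sketch, not the record that was actually ingested: the bounding box, the exact start and end timestamps, and the `dashboard:*` field names are assumptions, and the description is cut to its first sentence.

```python
import json


def build_collection() -> dict:
    """Assemble a minimal STAC Collection from the metadata in the checklist."""
    return {
        "type": "Collection",
        "stac_version": "1.0.0",
        "id": "lis-tws-nonstationarity-index",
        "title": "Global TWS non-stationarity index",
        # First sentence only; the full description is in the checklist.
        "description": (
            "The global Terrestrial Water Storage (TWS) non-stationarity index "
            "integrates the trend, seasonal shifts, and variability change of "
            "TWS for the period of 2003 - 2020."
        ),
        "license": "CC0-1.0",
        "extent": {
            # Assumption: a global bounding box in longitude/latitude.
            "spatial": {"bbox": [[-180.0, -90.0, 180.0, 90.0]]},
            # Assumption: "2003 - 2020" is read as full calendar years.
            "temporal": {
                "interval": [["2003-01-01T00:00:00Z", "2020-12-31T23:59:59Z"]]
            },
        },
        "links": [],
        # Assumed names for the two dashboard-specific fields in the checklist.
        "dashboard:is_periodic": False,
        "dashboard:time_density": None,
    }


collection = build_collection()
print(json.dumps(collection, indent=2))
```

Keeping the record as a plain dictionary makes it easy to change the license or interval later without touching the data file.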

Resources about metadata

@dfelikson
Author

dfelikson commented Nov 9, 2022

The COG file was renamed to *.cog.tif.

@dfelikson
Author

I tried verifying the validity using rio cogeo validate but got a warning:

/srv/conda/envs/notebook/lib/python3.9/site-packages/rasterio/__init__.py:277: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned.
  dataset = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)
/home/jovyan/DATWS_nonstationarity_index.cog.tif is a valid cloud optimized GeoTIFF

@j08lue - is the missing geotransform going to be a problem? I'm guessing it will be but wanted to double check.

@anayeaye
Contributor

anayeaye commented Nov 9, 2022

@dfelikson good catch. It looks like this missing geotransform is going to be a problem for some visualization tools:
https://staging-raster.delta-backend.com/cog/statistics?url=s3://veda-data-store-staging/EIS/Global_TWS_data/DATWS_nonstationarity_index.cog.tif

{"detail":"The transformation is already \"north up\" or a transformation between pixel/line and georeferenced coordinates cannot be computed for /vsis3/veda-data-store-staging/EIS/Global_TWS_data/DATWS_nonstationarity_index.cog.tif. There is no affine transformation and no GCPs. Specify transformation option SRC_METHOD=NO_GEOTRANSFORM to bypass this check."}
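The check above can be scripted. The sketch below only builds the request URL and inspects a response body; it makes no network call. The function names are hypothetical, and the sample body in the usage lines is a shortened excerpt of the error quoted above.

```python
import json
from urllib.parse import urlencode

RASTER_API = "https://staging-raster.delta-backend.com"


def cog_endpoint_url(endpoint: str, s3_url: str) -> str:
    """Build a /cog/<endpoint> URL for a file, keeping ':' and '/' readable."""
    query = urlencode({"url": s3_url}, safe=":/")
    return f"{RASTER_API}/cog/{endpoint}?{query}"


def reports_missing_geotransform(response_body: str) -> bool:
    """Return True if a JSON error body complains about a missing transform."""
    detail = str(json.loads(response_body).get("detail", ""))
    return "no affine transformation" in detail


url = cog_endpoint_url(
    "statistics",
    "s3://veda-data-store-staging/EIS/Global_TWS_data/"
    "DATWS_nonstationarity_index.cog.tif",
)
print(url)

# Shortened excerpt of the error message quoted above.
sample_error = '{"detail": "There is no affine transformation and no GCPs."}'
print(reports_missing_geotransform(sample_error))
```

The same helper works for other `/cog/` endpoints by changing the first argument.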

@j08lue
Contributor

j08lue commented Nov 9, 2022

Definitely a problem, @dfelikson. The geotransform is needed to know where to place your data on the map and how it stretches across it.

The transform basically defines where in the projected coordinates the dataset starts (its upper-left corner or so) and how high and wide the pixels are. Without that, most tools will just place the dataset at the origin of the coordinate reference system (0,0) and use the identity matrix - pixels being 1 unit high and wide.

You could try to add the transform with rio edit_info --transform: https://rasterio.readthedocs.io/en/latest/api/rasterio.rio.edit_info.html

This page here explains what an affine transform is - basically which of the six numbers defines what property: https://www.perrygeo.com/python-affine-transforms.html

But regenerating the file and preserving the original transform is probably safer.
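To make the six numbers concrete, here is a small pure-Python sketch of how an affine transform maps a pixel position to map coordinates, using the (a, b, c, d, e, f) ordering from the linked article. The 0.1-degree global grid is a hypothetical example, not the transform of the actual file.

```python
from typing import NamedTuple, Tuple


class AffineTransform(NamedTuple):
    a: float  # pixel width
    b: float  # row rotation (0 for a north-up image)
    c: float  # x coordinate of the upper-left corner
    d: float  # column rotation (0 for a north-up image)
    e: float  # pixel height (negative for a north-up image)
    f: float  # y coordinate of the upper-left corner

    def pixel_to_coords(self, col: float, row: float) -> Tuple[float, float]:
        """Map a (col, row) pixel position to (x, y) map coordinates."""
        x = self.a * col + self.b * row + self.c
        y = self.d * col + self.e * row + self.f
        return x, y


# What tools fall back to when a file has no geotransform.
IDENTITY = AffineTransform(1.0, 0.0, 0.0, 0.0, 1.0, 0.0)

# Hypothetical north-up global grid with 0.1-degree pixels.
GLOBAL_GRID = AffineTransform(0.1, 0.0, -180.0, 0.0, -0.1, 90.0)

# With the identity transform, pixel positions are used as coordinates.
print(IDENTITY.pixel_to_coords(10, 20))

# With a real transform, the upper-left pixel corner lands at (-180, 90).
print(GLOBAL_GRID.pixel_to_coords(0, 0))
```

With the identity transform, a 3600 x 1800 raster would span x from 0 to 3600 and y from 0 to 1800 instead of the globe, which is why the file could not be placed on the map.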

@dfelikson
Author

@j08lue - I uploaded an updated dataset to the same S3 location (s3://veda-data-store-staging/EIS/Global_TWS_data/DATWS_nonstationarity_index.cog.tif) with updated spatial coordinates. Please have a look and let me know if this works.

@j08lue
Contributor

j08lue commented Nov 10, 2022

@dfelikson
Author

Ah ha! Thanks for the tip with that URL. I think it was an issue with AWS not letting me overwrite the previous version of the file. I uploaded a new file (s3://veda-data-store-staging/EIS/Global_TWS_data/DATWS_nonstationarity_index_v2.cog.tif) and it passed the URL test.

@j08lue
Contributor

j08lue commented Nov 11, 2022

@dfelikson Cool, thanks. For visual QA (always a good idea so you can spot resampling issues - e.g. when you choose bilinear interpolation for overviews on a categorical dataset), you can also use https://staging-raster.delta-backend.com/cog/viewer?url=s3://veda-data-store-staging/EIS/Global_TWS_data/DATWS_nonstationarity_index_v2.cog.tif

And it looks as expected for this coarse-resolution data:

[two images]

So all is good.
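The resampling issue mentioned above (interpolated overviews on categorical data) can be illustrated with a toy example. This sketch uses plain 2x2 block averaging as a simple stand-in for bilinear interpolation, and the class codes are made up. It does not apply to the TWS index itself, which is a continuous variable.

```python
from collections import Counter
from statistics import mean


def downsample(grid, reducer):
    """Build one overview level by reducing each 2x2 block to a single value."""
    rows, cols = len(grid), len(grid[0])
    return [
        [
            reducer(
                [grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1]]
            )
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


def most_common(values):
    """Pick the most frequent class in a block (mode resampling)."""
    return Counter(values).most_common(1)[0][0]


# Hypothetical categorical raster with two classes, 1 and 5.
land_cover = [
    [1, 1, 5, 5],
    [1, 5, 5, 5],
    [1, 1, 1, 1],
    [1, 1, 1, 5],
]

averaged = downsample(land_cover, mean)
by_mode = downsample(land_cover, most_common)

print(averaged)  # [[2, 5], [1, 2]]: class 2 does not exist in the source
print(by_mode)   # [[1, 5], [1, 1]]: only classes present in the source
```

Averaging invents a class that is not in the data, which is the kind of artifact a quick look in the viewer makes easy to spot.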

j08lue added this to the EIS Coastal Risk discovery milestone Nov 14, 2022
@dfelikson
Author

I've added all necessary STAC metadata. Let me know if there's anything else needed before this dataset can be ingested.

@abarciauskas-bgse
Collaborator

@dfelikson 👋🏽 I am going to help finish ingesting this dataset and just wanted to double check:

I've added all necessary STAC metadata.

Do you just mean you added the STAC metadata content to this issue in the first comment or added it someplace else? I don't see this dataset in the STAC API (and was assuming that is what I could help with) but wanted to double check.

@dfelikson
Author

Hi @abarciauskas-bgse! Thanks for your help. Yes, I meant I just added it to the first comment. Let me know if I need to do something else or if you need more info.

@abarciauskas-bgse
Collaborator

@dfelikson thanks - also I noticed the other LIS TWS collections are using the MIT license - should they all be using the same license, and should it be MIT or Creative Commons Zero (CC0-1.0)?

@dfelikson
Author

I'm not 100% sure but I don't really think we have a preference. Can this be updated later?

abarciauskas-bgse self-assigned this Nov 17, 2022
@abarciauskas-bgse
Collaborator

abarciauskas-bgse commented Nov 17, 2022

@dfelikson yes no problem - I have ingested the collection and item to our DEV API and database - https://dev-stac.delta-backend.com/collections/lis-tws-nonstationarity-index - do you want to take a look and let me know if it looks ok? Then I will ingest to the staging DB

(Btw, I need to find out why the summaries only show the start date and if that's a problem)

@dfelikson
Author

@abarciauskas-bgse - looks good to me on first glance! Now that it's in the API and database, can I try adding it to the discovery?

@abarciauskas-bgse
Collaborator

Shortly! I'm in meetings for a bit but will add it to the staging database in the next few hours

@dfelikson
Author

No worries at all! Thanks, @abarciauskas-bgse - still learning how this process works.

@abarciauskas-bgse
Collaborator

@dfelikson ok this has been inserted into staging and you should be able to configure the discovery with it https://staging-stac.delta-backend.com/collections/lis-tws-nonstationarity-index

@dfelikson
Author

I need a little help, @abarciauskas-bgse. Here's how I configured it in the Discovery:

<Block type='full'>
  <Figure>
    <Map
      datasetId='lis-tws-nonstationarity-index'
      layerId='lis-tws-nonstationarity-index'
      dateTime='2003-01-01'
      zoom={1}
      center={[0,0]}
    />
    <Caption
      attrAuthor='NASA'
      attrUrl='https://nasa.gov/'
    />
  </Figure>
</Block>

But I must have either the datasetId, layerId, or dateTime configured wrong because it's not showing up. I took a look through the metadata https://staging-stac.delta-backend.com/collections/lis-tws-nonstationarity-index and I see an id but I don't see different ids for the dataset vs layer. Is there another endpoint that will show me that?

@j08lue
Contributor

j08lue commented Nov 18, 2022

The dataset and layers need to be set up in delta-config/datasets first. We need a PR against delta-config for that, like NASA-IMPACT/veda-config#143.

https://github.com/NASA-IMPACT/delta-config/blob/develop/docs/CONTENT.md#datasets

You would be welcome to open that, @dfelikson.


4 participants