This repository has been archived by the owner on Sep 26, 2023. It is now read-only.

Add CMIP6 downscaled data to the STAC API #143

Closed
3 tasks
abarciauskas-bgse opened this issue Jun 11, 2022 · 1 comment
Comments

@abarciauskas-bgse
Collaborator

Description

From @leothomas :
The CMIP6 dataset spans a massive temporal range (1950–2100) compared to most other datasets (roughly 2000–2030), so it required a fundamental shift in how PGSTAC (the database) is organized. See below for a more detailed description of the required change.

Alex was working on integrating PGSTAC v0.5 into the VEDA backend to enable ingestion of the CMIP6 dataset, but the very modifications that enable ingesting CMIP6 also require updating/re-ingesting all of the other datasets.
CMIP6 alone is pretty massive (3.6 TB), so we were holding off on re-ingesting everything until we have deployed the backend to MCP (since we'll have to re-ingest everything there anyway).

The COGs have already been generated for the CMIP6 dataset, so we will just have to generate STAC records for them and copy them over.

Background info on why we need to stand up a new database in order to integrate the changes necessary to ingest CMIP6:
PGSTAC partitions data on commonly searched fields (datetime, geometry, etc.) to make the database more performant.
Up until v0.5, PGSTAC organized everything into weekly partitions, which was fine, since most datasets had a temporal range of no more than ~20 years (20 × 52 = 1040 partitions). Along comes CMIP6 with a 150-year temporal range, which forces the database to create 150 × 52 = 7800 partitions, which crashes the database.
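The partition arithmetic above can be sketched in a few lines of Python (the function name is just for illustration, not part of pgstac):

```python
# Partition arithmetic from the explanation above: pgstac (pre-v0.5) created
# roughly one partition per week of a dataset's temporal range.
def weekly_partitions(years: int) -> int:
    """Approximate number of weekly partitions for a temporal range in years."""
    return years * 52

# A typical ~20-year dataset stays manageable...
print(weekly_partitions(20))   # 1040
# ...but CMIP6's 150-year range (1950-2100) explodes the partition count.
print(weekly_partitions(150))  # 7800
```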

To solve this, David Bitner did a major refactor of the database to allow partitioning first on collection, and then optionally on time, by year or month.
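A rough sketch of why this helps, assuming each collection picks its own time granularity (the helper below is hypothetical, not a pgstac API):

```python
# Hypothetical helper illustrating the post-refactor scheme: each collection
# chooses its own time granularity ("year", "month", or none), so the
# partition count scales with that collection's actual temporal range.
def partition_count(years, trunc=None):
    if trunc == "year":
        return years
    if trunc == "month":
        return years * 12
    return 1  # no time partitioning: one partition for the whole collection

print(partition_count(150, "year"))   # 150  -- CMIP6 partitioned by year
print(partition_count(150, "month"))  # 1800 -- still far below 7800 weekly
print(partition_count(20))            # 1    -- small datasets need no split
```

Either yearly or monthly partitioning keeps CMIP6 well under the 7800 weekly partitions that crashed the old layout.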

Once partitions have been created in a database, you can't really delete them, so the solution is to deploy a new database, and then re-ingest everything.

Tasks

  • Upgrade database in lower level environment
  • Test adding CMIP6 to new database + test other datasets still work as expected

Acceptance criteria

CMIP6 data available in the staging API

Subsequent tickets

  • CMIP story in new dashboard
@leothomas

CMIP6 is being tracked here: #191

This issue can be closed.

@j08lue j08lue closed this as completed May 2, 2023