- Title: xarray Assets
- Identifier: https://stac-extensions.github.io/xarray-assets/v1.0.0/schema.json
- Field Name Prefix: xarray
- Scope: Asset
- Extension Maturity Classification: Pilot
- Owner: @TomAugspurger
This document explains the xarray Assets Extension to the SpatioTemporal Asset Catalog (STAC) specification.
This extension helps users open STAC Assets with xarray. It gives catalog maintainers a place to specify required or recommended options. Without it, users would have to learn from some other source which options are required to load the dataset. See the Python Example for how consumers of this extension can use it to simplify data loading.
- Examples:
  - Item example: Shows the basic usage of the extension in a STAC Item
  - Collection example: Shows the basic usage of the extension in a STAC Collection
- JSON Schema
- Changelog
Field Name | Type | Description |
---|---|---|
xarray:open_kwargs | Map<string, Any> | Keyword arguments to provide to the xarray opener |
xarray:storage_options | Map<string, Any> | Additional keywords to provide to fsspec.filesystem |
`xarray:open_kwargs` holds keyword arguments to provide to the xarray opener, for example `xarray.open_zarr`. The opener should be determined by the media type of the asset. These are in addition to the positional argument, for example the `store`, which is obtained from the asset's `href`. For example, to specify consolidated metadata:
{
  "xarray:open_kwargs": {
    "consolidated": true
  }
}
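The following is a minimal sketch of how a consumer might turn such an asset into the arguments for `xarray.open_dataset`. The helper name `build_open_call`, the example `href`, and the media-type-to-engine mapping are assumptions made for illustration; they are not defined by this extension.

```python
# Assumed mapping from an asset's media type to an xarray engine name.
# This table is illustrative and not part of the extension.
ENGINE_BY_MEDIA_TYPE = {
    "application/vnd+zarr": "zarr",
    "application/x-netcdf": "netcdf4",
}


def build_open_call(asset: dict) -> tuple[str, dict]:
    """Return (store, kwargs) for xarray.open_dataset from a STAC asset dict."""
    # Copy so that the asset's own dictionary is never modified.
    kwargs = dict(asset.get("xarray:open_kwargs", {}))
    # Choose the opener from the media type, unless the catalog already set one.
    engine = ENGINE_BY_MEDIA_TYPE.get(asset.get("type"))
    if engine is not None:
        kwargs.setdefault("engine", engine)
    # The href is the positional argument; everything else is a keyword argument.
    return asset["href"], kwargs


# Hypothetical asset, written as the plain dictionary found in STAC JSON.
asset = {
    "href": "https://example.com/data.zarr",
    "type": "application/vnd+zarr",
    "xarray:open_kwargs": {"consolidated": True},
}
store, kwargs = build_open_call(asset)
```

With xarray installed, the result could then be passed on as `xarray.open_dataset(store, **kwargs)`.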
`xarray:storage_options` holds additional keyword arguments for `fsspec.filesystem`, which opens a filesystem from a URI (e.g. `abfs://path/to/blob`, `https://path/to/file`, `s3://path/to/file`). The various filesystems support and require backend-specific keyword arguments, which can be provided as `**storage_options`.
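As a rough sketch, a consumer could derive the fsspec protocol from the asset's `href` and pass the storage options through unchanged. The helper names `filesystem_args` and `open_store` and the `account_name` value are hypothetical; which keywords are valid depends on the fsspec backend in use.

```python
from urllib.parse import urlsplit


def filesystem_args(asset: dict) -> tuple[str, dict]:
    """Return (protocol, storage_options) for fsspec.filesystem from a STAC asset dict."""
    # The URI scheme of the href (abfs, https, s3, ...) is used as the protocol name.
    # An href without a scheme is assumed to be a local path.
    protocol = urlsplit(asset["href"]).scheme or "file"
    # Copy so that callers can modify the result without touching the asset.
    storage_options = dict(asset.get("xarray:storage_options", {}))
    return protocol, storage_options


def open_store(asset: dict):
    """Open a key-value store for the asset; needs fsspec and the matching backend."""
    import fsspec  # imported lazily so filesystem_args works without fsspec installed

    protocol, storage_options = filesystem_args(asset)
    fs = fsspec.filesystem(protocol, **storage_options)
    return fs.get_mapper(asset["href"])


# Hypothetical asset; the account name is a placeholder.
asset = {
    "href": "abfs://path/to/blob",
    "xarray:storage_options": {"account_name": "exampleaccount"},
}
protocol, storage_options = filesystem_args(asset)
```

The mapper returned by `open_store` can serve as the `store` for an opener such as `xarray.open_zarr`, together with any `xarray:open_kwargs`.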
This example demonstrates how consumers of this extension can use the data to simplify the process of loading an asset from STAC into an xarray Dataset.
>>> import pystac, planetary_computer, xarray as xr
>>> collection = planetary_computer.sign(
... pystac.read_file("https://planetarycomputer.microsoft.com/api/stac/v1/collections/terraclimate")
... )
>>> asset = collection.assets["zarr-abfs"]
>>> asset.media_type
>>> ds = xr.open_dataset(
... asset.href,
... **asset.extra_fields["xarray:open_kwargs"]
... )
>>> ds
<xarray.Dataset> Size: 2TB
Dimensions: (time: 768, lat: 4320, lon: 8640, crs: 1)
Coordinates:
* crs (crs) int16 2B 3
* lat (lat) float64 35kB 89.98 89.94 89.9 89.85 ... -89.9 -89.94 -89.98
* lon (lon) float64 69kB -180.0 -179.9 -179.9 ... 179.9 179.9 180.0
* time (time) datetime64[ns] 6kB 1958-01-01 1958-02-01 ... 2021-12-01
Data variables: (12/14)
aet (time, lat, lon) float32 115GB dask.array<chunksize=(12, 1024, 1024), meta=np.ndarray>
def (time, lat, lon) float32 115GB dask.array<chunksize=(12, 1024, 1024), meta=np.ndarray>
pdsi (time, lat, lon) float32 115GB dask.array<chunksize=(12, 1024, 1024), meta=np.ndarray>
pet (time, lat, lon) float32 115GB dask.array<chunksize=(12, 1024, 1024), meta=np.ndarray>
ppt (time, lat, lon) float64 229GB dask.array<chunksize=(12, 1024, 1024), meta=np.ndarray>
q (time, lat, lon) float64 229GB dask.array<chunksize=(12, 1024, 1024), meta=np.ndarray>
... ...
swe (time, lat, lon) float64 229GB dask.array<chunksize=(12, 1024, 1024), meta=np.ndarray>
tmax (time, lat, lon) float32 115GB dask.array<chunksize=(12, 1024, 1024), meta=np.ndarray>
tmin (time, lat, lon) float32 115GB dask.array<chunksize=(12, 1024, 1024), meta=np.ndarray>
vap (time, lat, lon) float32 115GB dask.array<chunksize=(12, 1024, 1024), meta=np.ndarray>
vpd (time, lat, lon) float32 115GB dask.array<chunksize=(12, 1024, 1024), meta=np.ndarray>
ws (time, lat, lon) float32 115GB dask.array<chunksize=(12, 1024, 1024), meta=np.ndarray>
Attributes: (12/52)
Conventions: CF-1.6
acknowledgment: Please cite the references included here...
cdm_data_type: GRID
contributor_email: khegewisch@ucmerced.edu
contributor_name: Katherine Hegewisch
contributor_role: Postdoctoral Fellow
... ...
time_coverage_duration: P1Y
time_coverage_end: 1958-12-01T00:0
time_coverage_resolution: P1M
time_coverage_start: 1958-01-01T00:0
title: TerraClimate: monthly climate and climat...
version: v1.0
All contributions are subject to the STAC Specification Code of Conduct. For contributions, please follow the STAC specification contributing guide. Instructions for running tests are copied here for convenience.
The checks that run on pull requests are part of the repository and can be run locally to verify that changes are valid.
To run tests locally, you'll need `npm`, which is a standard part of any node.js installation.
First you'll need to install everything with npm once. Just navigate to the root of this repository and on your command line run:
npm install
Then to check markdown formatting and test the examples against the JSON schema, you can run:
npm test
This prints the same messages that you see in the online checks, so you can then fix your Markdown or examples.
If the tests reveal formatting problems with the examples, you can fix them with:
npm run format-examples