-
Notifications
You must be signed in to change notification settings - Fork 85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Storing GCPs in xarray #376
Comments
Sorry for the late reaction, early vacation here in Sweden. The TL;DR summary: the JSON structure you propose is the easiest and most pragmatic solution. The idea I had was to use the existing coordinates mechanism that xarray provides, like this: import dask.array as da
import xarray as xr
import numpy as np
data = xr.DataArray(da.ones((40, 40)), coords=dict(y=range(40), x=range(40)), dims=['y', 'x'])
lons = da.random.random((10, 10))
lats = da.random.random((10, 10))
z = da.random.random((10, 10))
x_sub = np.arange(0, 40, 4)
y_sub = np.arange(0, 40, 4)
xrlons = xr.DataArray(lons, dims=['y', 'x'], coords=dict(y=y_sub, x=x_sub))
data.coords['lons'] = xrlons
data that prints out:
as xarray fills up the non-existing x and y points in lons and fills it with nans. From there, we can fill the nans using linear interpolation with the builtin data.coords['lons'].interpolate_na('y', fill_value="extrapolate").interpolate_na('x', fill_value="extrapolate").values However there are a some issues:
So if we don't want to go to the risky business of guessing what the user want or provide a bloated interface to handle all the particularities of different data formats, I think your JSON proposal is the best for now. |
Sounds good to me. @mraspaud is this something you would be interested in submitting a PR for? |
I'm under quite some time pressure right now, so I can't make any promises... I'll report here if I start working on it. |
Just for reference, I was just wondering how it would look like as geojson: { "type": "FeatureCollection",
"features": [
{ "type": "Feature",
"geometry": {"type": "Point", "coordinates": [-1.3712442422120068, 48.818122749035496, 1.9444833537563682]},
"properties": {"id": "1",
"col": 0,
"row": 0,
"info": "the first gcp"}
},
{ "type": "Feature",
"geometry": {"type": "Point", "coordinates": [103.0, 1.5, 12.0]},
"properties": {"id": "2",
"col": 5,
"row": 0,
"info": "the second gcp"}
}
...
]
} So more verbose, but I don't know if that would be useful in any way. |
I like the idea of storing it using a standard format 👍 |
Is it worth adding the geojson package as a dependency? |
My initial impression is no because the geometry types are simple and consistent. However, it isn't a large package, so if it has a sizeable reduction/simplification of the code then it is probably fine. One thing to be careful about is the precision enforced by geojson. |
Actually, the precision enforcement from geojson seems to be unavoidable. You can set the precision, but that seems kinda clunky. I am currently thinking we shouldn't use it for GCPs. Thoughts? |
You are right about geojson fiddling with the precision. I'm supposing we will not dump/load the geojson string, as we use this as an internal representation of the gcps only, so I would rather keep the full precision. Hence we might need to implement our own geojson conversion, but it should be small. Relevant part of the geojson spec: I'll be opening a draft PR shortly so we can discuss implementation details there. |
Forward from: #339
Changes that will likely be needed:
rio.write_gcps()
- write an array of GCPS to thegrid_mapping
coordinate attrs. I am thinking json would be the best way to go.rio.get_gcps()
- read GCPS from json in thegrid_mapping
coordinate attrs and convert to rasterio GCP objects.rio.reproject()
- pass in GCPS automatically where applicable.open_rasterio()
- write GCPS and don't generate coordinates or transform.rio.to_raster()
- write GCPS to raster.Original GCPs:
Coordinate Structure
JSON Structure
Grid mapping coordinate:
cc @mraspaud
The text was updated successfully, but these errors were encountered: