Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected behaviour when modifying coords with assign_coords #825

Open
patel-zeel opened this issue Nov 6, 2024 · 5 comments
Open

Unexpected behaviour when modifying coords with assign_coords #825

patel-zeel opened this issue Nov 6, 2024 · 5 comments
Labels
question Further information is requested

Comments

@patel-zeel
Copy link

Code Sample, a copy-pastable example if possible

import rioxarray as rxr

da = rxr.open_rasterio("https://huggingface.co/datasets/Zeel/tmp/resolve/main/1524-1184.tif")
# save locally
da.rio.to_raster("tmp.tif")

# Load
da = rxr.open_rasterio("tmp.tif")
print("Original values", da.x.values[:5])
da = da.assign_coords(x = np.round(da.x, 2))
print("Modified values before saving", da.x.values[:5])

# Save
da.rio.to_raster("tmp2.tif")

# Reload
da = rxr.open_rasterio("tmp2.tif")
print("Modified values after saving and reloading", da.x.values[:5])

Output

Original values [9783942.00780707 9783946.78512134 9783951.56243561 9783956.33974987
 9783961.11706414]
Modified values before saving [9783942.01 9783946.79 9783951.56 9783956.34 9783961.12]
Modified values after saving and reloading [9783942.01       9783946.7873138  9783951.5646276  9783956.34194139
 9783961.11925519]

Expected Output

Original values [9783942.00780707 9783946.78512134 9783951.56243561 9783956.33974987
 9783961.11706414]
Modified values before saving [9783942.01 9783946.79 9783951.56 9783956.34 9783961.12]
Modified values after saving and reloading [9783942.01 9783946.79 9783951.56 9783956.34 9783961.12]

Environment Information

Installed fresh in Google colab with pip install rioxarray

Question

If this is not a recommended way to modify the coordinates, please help me with the recommended way.

@patel-zeel patel-zeel added the bug Something isn't working label Nov 6, 2024
@snowman2 snowman2 added the question Further information is requested label Nov 6, 2024
@snowman2
Copy link
Member

snowman2 commented Nov 6, 2024

With the change in coordinates, your dx/dy are no longer evenly spaced:

da = rioxarray.open_rasterio("tmp.tif")
print("Original values", da.x.values[:5])
print("DX", da.x.values[:5]-da.x.values[1:6])
da = da.assign_coords(x = numpy.round(da.x, 2))
print("Modified values before saving", da.x.values[:5])
print("DX", da.x.values[:5]-da.x.values[1:6])
Original values [9783942.00780707 9783946.78512134 9783951.56243561 9783956.33974987
 9783961.11706414]
DX [-4.77731427 -4.77731427 -4.77731427 -4.77731427 -4.77731427]
Modified values before saving [9783942.01 9783946.79 9783951.56 9783956.34 9783961.12]
DX [-4.78 -4.77 -4.78 -4.78 -4.77]

After saving the raster, the new coords are again evenly spaced:

# Save
da.rio.to_raster("tmp2.tif")
# Reload
da = rioxarray.open_rasterio("tmp2.tif")
print("Modified values after saving and reloading", da.x.values[:5])
print("DX", da.x.values[:5]-da.x.values[1:6])
Modified values after saving and reloading [9783942.01       9783946.7873138  9783951.5646276  9783956.34194139
 9783961.11925519]
DX [-4.7773138 -4.7773138 -4.7773138 -4.7773138 -4.7773138]

@snowman2 snowman2 removed the bug Something isn't working label Nov 6, 2024
@patel-zeel
Copy link
Author

patel-zeel commented Nov 7, 2024

Thank you for the response, @snowman2! Now, I understand what's going on. Actually, my use case is like the following:

  • I am trying to merge multiple tif files with xr.open_mfdataset. Their coordinates are similar, but floating-point precision results in a non-monotonic final index. So, I wanted to round the coordinates to make sure all similar coordinates become exactly the same. Is there a better way to achieve the same instead of what I have done above?

@snowman2
Copy link
Member

snowman2 commented Nov 7, 2024

Is there a better way to achieve the same instead of what I have done above?

I recommend referring to

rioxarray/rioxarray/_io.py

Lines 848 to 891 in fa35e91

def _load_bands_as_variables(
riods: RasterioReader,
*,
parse_coordinates: bool,
chunks: Optional[Union[int, tuple, dict]],
cache: Optional[bool],
lock: Any,
masked: bool,
mask_and_scale: bool,
decode_times: bool,
decode_timedelta: Optional[bool],
vrt_params: Optional[dict],
**open_kwargs,
) -> Union[Dataset, list[Dataset]]:
"""
Load in rasterio bands as variables
"""
global_tags = _parse_tags(riods.tags())
data_vars = {}
for band in riods.indexes:
band_riods = SingleBandDatasetReader(
riods=riods,
bidx=band - 1,
vrt_params=vrt_params,
)
band_name = f"band_{band}"
data_vars[band_name] = (
open_rasterio( # type: ignore
band_riods,
parse_coordinates=band == 1 and parse_coordinates,
chunks=chunks,
cache=cache,
lock=lock,
masked=masked,
mask_and_scale=mask_and_scale,
default_name=band_name,
decode_times=decode_times,
decode_timedelta=decode_timedelta,
**open_kwargs,
)
.squeeze() # type: ignore
.drop_vars("band") # type: ignore
)
dataset = Dataset(data_vars, attrs=global_tags)
.

In that code, it only adds coordinates for one of the data arrays and then the other data arrays in the dataset inherit the coordinates.

@patel-zeel
Copy link
Author

Thank you for the reference, @snowman2, but I couldn't fully understand what you are trying to convey. I have multiple clusters of files where, in each cluster, coordinates are very similar, with just floating point differences. I'd appreciate a lot if you could provide a code/pseudo-code of how to achieve this.

@rbavery
Copy link

rbavery commented Nov 7, 2024

this sounds related to https://discourse.pangeo.io/t/example-which-highlights-the-limitations-of-netcdf-style-coordinates-for-large-geospatial-rasters/4140

there's a draft pr in xarray with suggestions on how rioxarray could be changed to not materialize coordinates and introduce floating point imprecision pydata/xarray#9543

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants