On-the-fly endianness conversion #125

Merged
merged 5 commits into from
Nov 12, 2021
2 changes: 1 addition & 1 deletion CHANGES.rst
@@ -6,10 +6,10 @@ What's new

Bug fixes
~~~~~~~~~
- The introduction of `sparse`, with `numba` under the hood, restricted input data to little-endian dtypes. In those cases, xESMF switches back to using scipy (:pull:`125`). By `Pascal Bourgault <https://github.com/aulemahal>`_
- Regridding datasets with dask-backed variables is fixed. Dtype of the outputs is changed for this specific case. By `Pascal Bourgault <https://github.com/aulemahal>`_
- SpatialAverager did not compute the same weights as Regridder when source cell areas were not uniform (:pull:`128`). By `David Huard <https://github.com/huard>`_


0.6.1 (23-09-2021)
------------------

2 changes: 1 addition & 1 deletion setup.cfg
@@ -8,7 +8,7 @@ extend-ignore = E203,E501,E402,W605

[isort]
known_first_party=xesmf
-known_third_party=ESMF,cf_xarray,cftime,dask,numpy,pytest,setuptools,shapely,sparse,xarray
+known_third_party=ESMF,cf_xarray,cftime,dask,numba,numpy,pytest,setuptools,shapely,sparse,xarray
multi_line_output=3
include_trailing_comma=True
force_grid_wrap=0
10 changes: 10 additions & 0 deletions xesmf/smm.py
@@ -5,6 +5,7 @@
import warnings
from pathlib import Path

import numba as nb
import numpy as np
import sparse as sps
import xarray as xr
@@ -120,6 +121,15 @@ def apply_weights(weights, indata, shape_in, shape_out):
        Extra dimensions are the same as `indata`.
        If input data is C-ordered, output will also be C-ordered.
    """
    # Limitation from numba : some big-endian dtypes are not supported.
    try:
Contributor

how costly is this test? wouldn't it be cheaper to just test with:

if indata.dtype.byteorder == '>'

Collaborator Author

I don't think it's very costly. My intuition is that the exception is raised when numba doesn't support the given dtype, before any computation is done. But indeed, in pure Python a single check is usually cheaper than a try/except.

However, I was trying to be as general as I could here. This change was triggered by numba not supporting a given byte order, but could there be other things that numba doesn't support, especially on other machines? Obviously, I can't test those without access to such a machine, but I tried to be ready for other eventualities...

Contributor

fair point, I'm ok with sacrificing a bit of performance for compatibility. try/except it is then

        nb.from_dtype(indata.dtype)
        nb.from_dtype(weights.dtype)
    except NotImplementedError:
        warnings.warn(
            'Input array has a dtype not supported by sparse and numba. Falling back to scipy.'
        )
        weights = weights.to_scipy_sparse()

    # COO matrix is fast with F-ordered array but slow with C-array, so we
    # take in a C-ordered and then transpose)
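
The trade-off discussed in the review thread can be sketched with NumPy alone. This is only an illustration, not the xESMF implementation: `probe_backend` is a hypothetical stand-in for `nb.from_dtype` that rejects only non-native byte order, and the `pick_backend_*` helpers and their return strings are made-up names. The explicit variant uses `dtype.isnative` rather than the `byteorder == '>'` comparison from the review, so that it behaves the same on big-endian hosts.

```python
import warnings

import numpy as np


def probe_backend(dtype):
    """Hypothetical stand-in for nb.from_dtype: rejects non-native byte order."""
    if not np.dtype(dtype).isnative:
        raise NotImplementedError(f'dtype {dtype!r} is not supported')


def pick_backend_explicit(indata):
    """Explicit check: only detects the byte-order case."""
    return 'sparse' if indata.dtype.isnative else 'scipy'


def pick_backend_probe(indata):
    """Probe check: falls back on anything the backend refuses."""
    try:
        probe_backend(indata.dtype)
    except NotImplementedError:
        warnings.warn('Input array has a dtype not supported. Falling back to scipy.')
        return 'scipy'
    return 'sparse'


native = np.arange(4, dtype='f8')
# 'S' swaps the byte order, so this is non-native on any host.
swapped = native.astype(native.dtype.newbyteorder('S'))
```

Both helpers agree on these two arrays. They would differ only if the backend rejected a dtype for a reason other than byte order, which is the case the probe-style check is meant to cover.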
13 changes: 13 additions & 0 deletions xesmf/tests/test_frontend.py
@@ -411,6 +411,19 @@ def test_regrid_dataarray(use_cfxr):
    xr.testing.assert_identical(dr_out, dr_out_rn)


def test_regrid_dataarray_endianess():
    # xarray.DataArray containing in-memory numpy array
    regridder = xe.Regridder(ds_in, ds_out, 'conservative')

    exp = regridder(ds_in['data'])  # Normal (little-endian)
    with pytest.warns(UserWarning, match='Input array has a dtype not supported'):
        out = regridder(ds_in['data'].astype('>f8'))  # big endian

    # Results should be the same
    assert_equal(exp.values, out.values)
    assert out.dtype == '>f8'
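
The two assertions in this test rest on NumPy behaviour that can be checked in isolation: casting to `'>f8'` changes only the in-memory byte layout, not the values, and a dtype compares equal to its string specification. A minimal sketch, with illustrative variable names that are not part of the test suite:

```python
import numpy as np

data = np.linspace(0.0, 1.0, 5)  # float64 in the host's native byte order
big = data.astype('>f8')         # same values, big-endian layout

# Element-wise comparison works across byte orders.
values_match = np.array_equal(data, big)

# A dtype compares equal to its string spec.
same_dtype = big.dtype == '>f8'

# Converting back to the native byte order is also lossless.
back = big.astype(big.dtype.newbyteorder('='))
```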


def test_regrid_dataarray_to_locstream():
    # xarray.DataArray containing in-memory numpy array
