Cannot pickle '_thread.lock' object exception after DataArray transpose and copy operations from netCDF file. #8442
Thanks for opening your first issue here at xarray! Be sure to follow the issue template!
Here is a locally reproducible MCVE:

```python
import xarray as xr
import numpy as np

file_path = "test.nc"
ds = xr.Dataset(
    {
        'latitude': np.arange(10),
        'longitude': np.arange(10),
        'precip': (['latitude', 'longitude'], np.arange(100).reshape(10, 10))
    }
)
ds.to_netcdf(file_path, engine="h5netcdf")

ds = xr.open_dataset(file_path, engine="h5netcdf", decode_coords=True, decode_times=True)
da = ds["precip"]
da = da.transpose("longitude", "latitude", missing_dims="ignore")
da = da.copy()
```

Note that if the Dataset is opened with `lock=False`, no exception is thrown.
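For background on how a plain `.copy()` can surface a *pickling* error: `DataArray.copy` defaults to a deep copy, and `copy.deepcopy` falls back to the pickle protocol for objects without dedicated copy support, so any object graph holding a bare `threading.Lock` fails the same way. A minimal illustration (added here for context, independent of xarray):

```python
import copy
import threading

lock = threading.Lock()
try:
    # deepcopy routes through __reduce_ex__, i.e. the pickle protocol
    copy.deepcopy(lock)
except TypeError as e:
    print(e)  # cannot pickle '_thread.lock' object
```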
Hmm, I don't get an error there. Can you post your dependencies? (Instructions in the bug report template.) Edit: though it seems to rely on the file being there from #8443...
Apologies, I had not written the netCDF file out in the MCVE 🤦‍♂️; the example is updated now. I was able to produce the error in the environment below.
Mine too succeeds with … PS: the code I run has …
🤔 I upgraded to …

I'm going to ask a few colleagues to try to replicate, to see if this is something peculiar to my environment.
My colleague was able to reproduce the exception as well, with the MCVE above ☝️:
```
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.18 (main, Nov 2 2023, 16:51:22)
[Clang 14.0.3 (clang-1403.0.22.14.1)]
python-bits: 64
OS: Darwin
OS-release: 23.1.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.0
xarray: 2023.10.1
```

I'm unsure what the differences in environments could be 🤔.
That's a puzzle... Can we reproduce it in a binder / in a test?
No idea if it has the same underlying cause (I'm not transposing, but am copying), but I do have a situation that used to work but now gives this same error.

Edit: here's a little example experimenting with what does and doesn't pickle:

```python
import xarray as xr
from joblib import dump

ds = xr.tutorial.load_dataset("air_temperature").isel(time=slice(4))
ds.to_netcdf("ds.nc", engine="netcdf4")
dump(ds, "ds.joblib")  # 0. Succeeds
ds.close()

# 1. Try to pickle the whole Dataset
ds = xr.open_dataset("ds.nc")
dump(ds, "ds.joblib")  # TypeError: cannot pickle '_thread.lock' object

# 2. Try to pickle a DataArray
ds = xr.open_dataset("ds.nc")
dump(ds.air, "ds.air.joblib")  # TypeError: cannot pickle '_thread.lock' object

# 3. Somehow adding a new variable makes it okay to pickle `ds.air`
#    (and `ds` too, if `.copy()` is applied)
ds = xr.open_dataset("ds.nc")
ds["b"] = xr.zeros_like(ds.air)
dump(ds.air, "ds.air.joblib")  # Succeeds
dump(ds, "ds.joblib")          # But this still fails
dump(ds.copy(), "ds.joblib")   # Succeeds
```
Also tried in an env with HDF5 1.14.3; it didn't help.
I was able to reproduce the error in OP's example above in a fresh env. Similar to one of my experiments, the error is, for me, averted if you add a new variable to the Dataset (e.g. via `xr.zeros_like`, as in experiment 3 above).
@zmoon Thanks for this MCVE! I can't reproduce the error, though. Also, the MCVE in #8442 (comment) works nicely (details below). Does it still fail for you?
OK, here we go, I've taken …
@kmuehlbauer I experienced the error on Windows as well as WSL. I tried a fresh env on Linux and still got the error 🤷

Edit: From the above, OP also didn't have Dask installed either. Adding Dask to the environment makes the error go away.
There has been some refactoring lately involving dask and other ChunkManagers. Not sure if this has anything to do with it, but maybe @TomNicholas has more insight here.
I don't really see why this should have anything to do with it... I guess it's not impossible that somehow some dask …

EDIT: But you're saying you can reproduce this without dask anyway, @kmuehlbauer?
Yes, thanks @TomNicholas for looking into this. Will try to bisect this.
@zmoon @sharkinsspatial Did this ever work for you? I'm having a hard time finding a working commit; I've checked several versions back to 0.17.0 without success. Also, the other involved dependencies (hdf5, netcdf-c, netcdf4-python, h5py, pandas) would be good to know, to recreate a working environment.
@kmuehlbauer for me, I don't have the environment anymore, but I suspect I probably had dask installed in it, and that's why it was working.
**TL;DR:** The current default of `lock=True` hands the backend a plain `threading.Lock` when dask is not installed, and that object can be neither pickled nor deep-copied.

**Inspection:** Using the MCVE given here (#8442 (comment)), I checked the types of the underlying array, and how this works for transposing or not:

Reading with …

and further: …
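One way to do that kind of inspection — a sketch that assumes xarray's internal lazy-array wrappers (in `xarray.core.indexing`) each expose the wrapped object as an `.array` attribute, which is an implementation detail, not public API:

```python
import xarray as xr

ds = xr.open_dataset("test.nc", engine="h5netcdf")
obj = ds["precip"].variable._data   # the lazy array backing the Variable
while hasattr(obj, "array"):        # walk down the wrapper chain
    print(type(obj).__name__)       # e.g. MemoryCachedArray, CopyOnWriteArray, ...
    obj = obj.array
print(type(obj).__name__)           # innermost: the backend array wrapper
```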
There is also a mention of pickling in the docs: https://docs.xarray.dev/en/stable/user-guide/io.html#pickle
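Following that advice, loading into memory and closing the file before pickling sidesteps the lock entirely. A minimal sketch, reusing the MCVE's `test.nc`:

```python
import pickle
import xarray as xr

ds = xr.open_dataset("test.nc", engine="h5netcdf")
ds.load()    # pull all values into memory
ds.close()   # drop the file handle (and its lock)

payload = pickle.dumps(ds)
restored = pickle.loads(payload)
print(restored["precip"].shape)  # (10, 10)
```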
**What to do?** The pickle issue might not be the big problem, as the user is advised to load/compute beforehand. But the copy issue should be resolved somehow. Unfortunately, I do not have an immediate solution to this. @pydata/xarray, any ideas?
(brief message to say thanks a lot @kmuehlbauer for the excellent summary)
I believe the issue is these two default locks for HDF5 and netCDF-C: xarray/xarray/backends/locks.py, line 18 in 2971994.
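The pattern at that line is roughly the following (a paraphrase, not an exact copy of `locks.py`): without dask, the fallback is a plain `threading.Lock` — exactly the unpicklable object in the error.

```python
import threading

try:
    from dask.utils import SerializableLock
except ImportError:
    # Fallback when dask is not installed: a plain threading.Lock,
    # which cannot be pickled (or deep-copied).
    SerializableLock = threading.Lock

# Module-level default locks handed to the HDF5 / netCDF-C backends.
HDF5_LOCK = SerializableLock()
NETCDFC_LOCK = SerializableLock()
```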
Probably the easiest way to handle this is to fork the code for SerializableLock from dask. It isn't very complicated.
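For reference, a trimmed sketch of what dask's `SerializableLock` does (based on its documented behavior: pickle a token rather than the lock, and re-associate that token with a process-local lock on unpickling):

```python
import threading
import uuid
from weakref import WeakValueDictionary


class SerializableLock:
    """A lock that survives pickling by serializing a token, not the lock."""

    # token -> lock; weak values let unused locks be garbage-collected
    _locks = WeakValueDictionary()

    def __init__(self, token=None):
        self.token = token or str(uuid.uuid4())
        if self.token in SerializableLock._locks:
            self.lock = SerializableLock._locks[self.token]
        else:
            self.lock = threading.Lock()
            SerializableLock._locks[self.token] = self.lock

    def acquire(self, *args, **kwargs):
        return self.lock.acquire(*args, **kwargs)

    def release(self, *args, **kwargs):
        return self.lock.release(*args, **kwargs)

    def __enter__(self):
        self.lock.__enter__()

    def __exit__(self, *args):
        self.lock.__exit__(*args)

    def __getstate__(self):
        return self.token  # only the token crosses the pickle boundary

    def __setstate__(self, token):
        self.__init__(token)  # same token -> same lock within a process
```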
Thanks @shoyer!
What is your issue?
I hit this issue while using rioxarray, with a series of operations similar to those noted in corteva/rioxarray#614. After looking through the `rioxarray` codebase a bit, I was able to reproduce the issue with pure `xarray` operations.

If the Dataset is opened with the default `lock=True` setting, transposing a DataArray's coordinates and then copying the DataArray results in a `cannot pickle '_thread.lock' object` exception. If the Dataset is opened with `lock=False`, no error is thrown.

This sample notebook reproduces the error.

This might be user error on my part, but it would be great to have some clarification on why `lock=False` is necessary here, as my understanding was that this should only be necessary when using parallel write operations.
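A minimal sketch of that workaround, reusing `test.nc` from the MCVE above (the `lock` keyword is forwarded to the h5netcdf backend):

```python
import xarray as xr

# Opening with lock=False avoids handing an unpicklable lock to the backend.
ds = xr.open_dataset("test.nc", engine="h5netcdf", lock=False)
da = ds["precip"].transpose("longitude", "latitude", missing_dims="ignore")
da = da.copy()  # no "cannot pickle '_thread.lock' object" exception here
```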