Hi 👋

Over at mlptrain we're trying to update our autode version. Things were going smoothly until we noticed that since autode v1.3.4 we could no longer load certain `.npz` files, which mlptrain uses to store its training data. Specifically, we're getting a confusing stacktrace raised from `ValueArray.__setstate__` when loading.
This is only happening for some older npz files. After some tedious debugging with @juraskov, we've identified the following conditions for the problematic npz files:
- only `ConfigurationSet` classes that contained `Configuration`s with different numbers of atoms (e.g. differently solvated clusters)
- in these cases numpy assigns `dtype.object` to some of the ndarrays (see the sketch after this list)
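As a minimal illustration of the second condition (the array key `R` and the shapes below are made up, not mlptrain's actual layout): per-configuration arrays with mismatched shapes can only be stored as a `dtype=object` array, and those travel through pickle inside the `.npz`:

```python
import numpy as np

# Two configurations with different numbers of atoms
coords_a = np.zeros((10, 3))  # e.g. a 10-atom cluster
coords_b = np.zeros((14, 3))  # e.g. a 14-atom cluster

# Ragged shapes cannot form a numeric array, so numpy falls back to
# dtype=object, holding a Python reference to each sub-array
ragged = np.empty(2, dtype=object)
ragged[:] = [coords_a, coords_b]

np.savez("training_data.npz", R=ragged)

# Object arrays are stored via pickle, so loading needs
# allow_pickle=True and routes every element through __setstate__.
# In mlptrain's files the elements include autode ValueArray
# subclasses, whose __setstate__ is what raises here
data = np.load("training_data.npz", allow_pickle=True)["R"]
```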
The `__setstate__` function which raises the exception extends the pickling protocol for `autode.values.ValueArray` (which subclasses `np.ndarray`); it was introduced as part of #215 to solve the issue described in #221 (see also the standalone PR #222). The relevant code is in `autode/values.py` (line 666 at `6a6eebe`).
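For context, the extension follows the usual pattern for pickling `np.ndarray` subclasses that carry extra attributes; the snippet below is a paraphrase of that pattern, not the verbatim autode code:

```python
import numpy as np

class ValueArray(np.ndarray):
    def __reduce__(self):
        # Append the instance __dict__ (units etc.) to the standard
        # ndarray state tuple so it survives pickling
        reconstruct, args, state = super().__reduce__()
        return reconstruct, args, (*state, self.__dict__.copy())

    def __setstate__(self, state, *args, **kwargs):
        # Assumes state[-1] is the attribute dict appended above.
        # Arrays pickled before this protocol existed end with the
        # raw data buffer instead, so this update() raises -- hence
        # the confusing stacktrace when loading old npz files
        self.__dict__.update(state[-1])
        super().__setstate__(state[:-1], *args, **kwargs)
```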
It seems like a patch along the following lines solves the loading issue, insofar as access to the underlying ndarrays is concerned (the extra attributes are lost, but the underlying data are more important to recover).
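A minimal sketch of the fallback, assuming the state layout above (the caught exception types are my guess at what a stale state tuple raises, not the literal diff):

```python
import numpy as np

class ValueArray(np.ndarray):
    def __setstate__(self, state, *args, **kwargs):
        try:
            # New-style state: (..., attribute dict) as produced by
            # the extended __reduce__
            self.__dict__.update(state[-1])
            super().__setstate__(state[:-1], *args, **kwargs)
        except (TypeError, ValueError):
            # Old-style state from before #222: plain ndarray state.
            # The extra attributes are lost, but the array data are
            # recovered by the stock ndarray implementation
            super().__setstate__(state, *args, **kwargs)
```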
It's not a pretty solution, but it feels reasonably safe, since we would basically be falling back to the original `__setstate__` implementation. Happy to hear better ideas! Also happy to submit a PR.
Unfortunately, on the mlptrain side the issue cannot be ignored since it affects a fair amount of existing data.
danielhollas changed the title from "Custom pickling in autode breaks loading of npz files in mlptrain" to "Custom pickling in autode breaks loading of old npz files in mlptrain" on Feb 4, 2025.
I think this would work. Also wondering if it might be possible to check if the last item of `state` is a dict, and whether that would be safer than a `try...except`.
Also, does losing the extra attributes cause any issues?
> Also wondering if it might be possible to check if the last item of `state` is a dict, and whether that would be safer than a `try...except`.
I think `try...except` is the safer/easier option, since the last item doesn't need to be a dict; I think any `Mapping` type would work? I also like the explicitness of the `try...except`, since this situation should be exceptional. But I guess both would work.
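For comparison, the `Mapping`-based check would look something like this (again just a sketch, not tested against autode):

```python
from collections.abc import Mapping

import numpy as np

class ValueArray(np.ndarray):
    def __setstate__(self, state, *args, **kwargs):
        if isinstance(state[-1], Mapping):
            # New-style state carrying the attribute dict
            self.__dict__.update(state[-1])
            super().__setstate__(state[:-1], *args, **kwargs)
        else:
            # Old-style plain ndarray state
            super().__setstate__(state, *args, **kwargs)
```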
> Also, does losing the extra attributes cause any issues?
Yeah, trying to access the extra attributes (even just calling `vars()` on the object) might trigger an exception. So it's not ideal, but the important part is that one can access the underlying ndarray data.

I think it would definitely be worth emitting some kind of warning in this case; I'm not sure what the best way to do that in autode is.
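One option, assuming the `try...except` version above, would be a plain `warnings.warn` in the fallback branch (whether autode would prefer its own logger instead is an open question):

```python
import warnings

import numpy as np

class ValueArray(np.ndarray):
    def __setstate__(self, state, *args, **kwargs):
        try:
            self.__dict__.update(state[-1])
            super().__setstate__(state[:-1], *args, **kwargs)
        except (TypeError, ValueError):
            warnings.warn(
                "ValueArray was pickled by an old autode version; "
                "its extra attributes (e.g. units) were not restored",
                UserWarning,
            )
            super().__setstate__(state, *args, **kwargs)
```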