-
-
Notifications
You must be signed in to change notification settings - Fork 181
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update code for better backward and forward compatibility #476
base: master
Are you sure you want to change the base?
Conversation
a8b0077
to
3c2dbda
Compare
Actually, I was trying to remember why I didn't do this before because I would have done it if I had thought of it. I don't believe that keeping perfect compatibility is possible. Either pickles you write now will be unable to be opened in older versions of dill or pickles from older versions of dill cannot be opened now. Props to you if you get it to work in a way that supports both backward and forward compatibility, but this PR as it stands loses forward compatibility of old pickle files. I am looking into a way to both fix the bug you pointed out (#466) and make the function pickling code more readable and simple. Unfortunately, I am blocking on my work and don't entirely have time to look into it. You can pick it up if you are interested in fixing the pickling of functions (which you kind of seem, considering this PR.) I will be able to finish it up in 3 weeks anyway, so it doesn't really matter if you pick up or don't. |
You are 100% right, my change added backward compatibility, but removed forward compatibility. I think I did it too late last night to figure... But why couldn't we keep Talking about compatibility, I'm playing with a "portable mode" that could bootstrap Also thinking of a decorator for the constructor functions to unpack positional args, ignoring any extras, as a future-proof feature. It would avoid compatibility breakage as what happened to |
Yep. I was looking into the same plan that you described in order to create a clear separate between the private functions in Actually, strangely enough, I would note that you would want a feature like this to be a separate setting since it would bloat otherwise small pickles containing just one function to contain all of the create functions, which I would guesstimate would add ~10kB to the size of the pickle. If someone had tens of thousands of pickles, this could balloon to include tens of thousands of copies which would add up to a majority of the space taken up by the pickles. This is not always a bad thing. Pickling |
I have an idea that would work, but this is on the bottom of my priorities. It is to create a new option that uses the Line numbers are a little difficult. Using the string directly would save space, but I don't know how to change the line numbers during compilation, but saving the AST balloons the size way too much, requiring hundreds of bytes for a single AST node, being way too inefficient for any practical purpose. I hope that that gives you a good starting point if you care to pursue this farther. The best solution I could think of was prepending newlines before the string until the line number is met. Inelegant, but works if you want to try it. The problem that I didn't quite figure out is how to compile inner functions, which are compiled differently than outer functions. |
Yes, I also think that even if constructor functions are internal in a sense, they should be considered public, not private. They should be named without a leading underscore and exported to the root module
These functions are not exactly an implementation detail, and changes to this internal API should be rare and incremental. The functions could be renamed with the underscore stripped out, but an alias with the old name should be kept in
In reality, my suggestion is to move just Note: If you are thinking of alleviating the extension and complexity of the main submodule
There are a few strategies that I'm thinking of:
Of course, this should be optional. I'm thinking of a "portable" setting/option. Plus, at least for simple objects, it should be possible to include just the necessary constructors other than |
3c2dbda
to
1476818
Compare
RTFD hides names beginning with
👍
Yep. That would be preferable and should probably be added. The problem is that it could not be called
I did not think about this before, but I agree that saving the version info would be a useful thing to do whenever we have a code object present.
Unfortunately, the names in the signature don't 100% line up with the attribute names, as frustrating as it is.
👍 I like your observations and insights. The problem with getting rid of |
dill/_dill.py
Outdated
@@ -1905,6 +1889,7 @@ def save_function(pickler, obj): | |||
# recurse to get all globals referred to by obj | |||
from .detect import globalvars | |||
globs_copy = globalvars(obj, recurse=True, builtin=True) | |||
globs_copy.setdefault('__builtins__', globals()['__builtins__']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what you are asking me... I didn't write the line you are referring to, @leogama did.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mmckerns, This is just a variation of
Lines 770 to 771 in a16219e
if "__builtins__" not in func.__globals__: | |
func.__globals__["__builtins__"] = globals()["__builtins__"] |
But I didn't understand the question either. By default, globals()['__builtins__']
in a module is "an alias for the dictionary of the builtins
module itself". It could be substituted by the __builtin__
alias in _dill
though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, ok. So, unless I'm forgetting something... no reason. I believe __builtin__
could have been used here without issue, as that should point to the same module object as the current code.
9433c89
to
bee762e
Compare
Even though Python 3.7 will be supported for some time, this (maybe) more reliable method could be used for 3.8+ only, falling back to the second one I described for Python <3.8. The overhead should be minimal. |
I've tried the first commit with the old dill 0.3.0 and noted that it (actually, the |
What do you think, for every new (*) Actually, it wouldn't work because pickle would save it as from functools import wraps
@wraps(CodeType, assigned=('__doc__', '__annotations__'), updated=())
def _create_code(*args):
return CodeType(*args) |
Yeah. I think Mike was forced into a brick wall on that one. There was no way to maintain compatibility, so he had to break it. Now that the functionality is working, even though it may not be elegant or sustainable, compatibility should not be broken because people will notice. You can look through the issues and see that there were a handful that came up because of incompatible changes like that one, and that is to be avoided. |
Actually, a mechanism similar to what you are intending is outlined in Also, the sort of breaking change presented by |
Changes possible thanks to recent improvements in the code base.