Moderately large BTrees fail to pickle with RuntimeError: maximum recursion depth exceeded
#44
Comments
I'm running into this right now. There was some work, eight years ago, to have pickle use a deque object instead of Python's call stack. This would have allowed for pickling of deeply nested structures. http://bugs.python.org/issue3119 I'm looking to see if anyone has an implementation of this idea, and I'll update this issue if I find one.
Alternately, it might be possible to take an approach similar to https://github.com/google/pygtrie/blob/master/pygtrie.py#L165-L252 that flattens out the structure.
@jamadden, @tseaver, would you consider such an approach? It would be very good for folks like me who want to use BTrees outside of Zope, but I'm not sure whether it would be a meaningful deoptimization for Zope. If you think it's a possible path, I might start working on a PR.
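To make the flattening idea concrete, here is a minimal sketch using only the standard library. It does not use BTrees or pygtrie; `build_nested`, `flatten`, and `rebuild` are hypothetical helpers, and the nested pairs are just a stand-in for deeply nested state. In a real class, I assume this logic would live in pickling hooks such as `__getstate__` and `__setstate__`.

```python
import pickle


def build_nested(n):
    """Hypothetical stand-in for nested state: (0, (1, (2, ... None)))."""
    nested = None
    for i in reversed(range(n)):
        nested = (i, nested)
    return nested


def flatten(nested):
    """Walk the nesting with a loop and return a flat list of values."""
    values = []
    while nested is not None:
        value, nested = nested
        values.append(value)
    return values


def rebuild(values):
    """Recreate the nested pairs from the flat list, again without recursion."""
    nested = None
    for value in reversed(values):
        nested = (value, nested)
    return nested


deep = build_nested(20_000)

# Pickle the flat list rather than the nested pairs, so the pickler
# never has to descend 20,000 levels.
payload = pickle.dumps(flatten(deep))
restored = rebuild(pickle.loads(payload))

# Compare via the flat form; comparing the nested tuples directly
# would itself recurse deeply.
same = flatten(restored) == flatten(deep)
```

The trade-off is that the flat form has to be rebuilt on load, which is presumably where any deoptimization would show up.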
That makes complete sense, @tseaver. Because I am taking advantage of some of the facilities of zope.index, I might pursue a fork of BTrees. If I get something working, we can revisit the question with concrete examples.
I wonder what use case is motivating this.
Hi Jim, here's my use case:
On Fri, Nov 11, 2016 at 8:09 AM Jim Fulton wrote:
There's nothing in the document that indicates to me a need to pickle a BTree. Is there some implicit assumption that you aren't using ZODB? If so, why dat? :)
No, I'm not using ZODB, because I've never gotten it to work with my
On Fri, Nov 11, 2016 at 9:27 AM Jim Fulton wrote:
That's a bummer. If you feel like mentioning the issues you ran into, maybe on the ZODB list, I'd be interested to see if I can help. ZODB (or some of its machinery) seems like a good fit for this use case, as it inherently breaks BTrees up and pickles the parts separately. It's a little hard to say, though, because I don't know what you're doing with the pickle.
If ZODB isn't involved, is it practical to pickle BTree items?
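A rough sketch of what pickling just the items could look like. The helpers `dump_items` and `load_items` are hypothetical names, and the example runs against a plain `dict` so it needs nothing beyond the standard library. My assumption is that a BTree class (say `LOBTree`) could be passed as `factory`, since BTrees expose `items()` and `update()`, but I have not verified that here.

```python
import pickle


def dump_items(mapping, protocol=pickle.HIGHEST_PROTOCOL):
    """Pickle only the (key, value) pairs, as one flat list."""
    return pickle.dumps(list(mapping.items()), protocol)


def load_items(data, factory=dict):
    """Rebuild a mapping from pickled pairs.

    `factory` is any zero-argument callable returning an object with an
    `update()` method that accepts an iterable of pairs (assumption:
    a BTree class would qualify).
    """
    mapping = factory()
    mapping.update(pickle.loads(data))
    return mapping


original = {i: i * 2 for i in range(100_000)}
restored = load_items(dump_items(original))
```

Because the pickle contains a flat list of pairs, its nesting depth does not grow with the number of items; the cost is rebuilding the tree on load.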
Example:
You can of course call `sys.setrecursionlimit` to make this go away... up to a point, at which time your Python segfaults (I think the `_pickle.so` involved is from zodbpickle).

The exact number of items it takes for this to happen depends on the BTree type; an IIBTree, for example, can hold more items before it gets here (different bucket sizes?). I suspect it also depends on the system and the word size. On my system the limit is somewhere under 15,000 items for the `LOBTree` above (where even the values are primitive types).

This is a consequence of the way a BTree is composed of buckets which are composed of other buckets... and their stored state all boils down to tuples.
This works out fine when used in the context of ZODB because the sub-buckets are replaced with persistent object identifiers. It just means that a large-ish BTree can't be pickled outside of ZODB or some similar system.
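The same failure mode can be shown without BTrees at all. The sketch below is only an illustration, not the actual BTree state layout: `nested_tuple` and `pickles_ok` are hypothetical helpers, and the depths are arbitrary. It assumes Python 3, where the error surfaces as `RecursionError` (a subclass of `RuntimeError`).

```python
import pickle


def nested_tuple(depth):
    """Build a tuple nested `depth` levels deep, using a loop."""
    state = ()
    for _ in range(depth):
        state = (state,)
    return state


def pickles_ok(obj):
    """Return True if `obj` pickles, False if pickle hits the recursion limit."""
    try:
        pickle.dumps(obj)
    except RecursionError:
        return False
    return True


shallow_ok = pickles_ok(nested_tuple(50))
deep_ok = pickles_ok(nested_tuple(200_000))
```

Here the shallow tuple pickles and the deep one does not, because pickle descends one level per nested tuple.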
I doubt anything can be done about this without major redesign and breaking compatibility, but maybe someone will have an idea. And maybe it's worth a mention in some docs somewhere? (I also wanted to leave this here for Google's sake.)
(For what it's worth, this isn't actually a problem for me. It came up in the context of testing some cache persistence strategies for RelStorage. Pickling a single large dict with ~600K keys in it is very slow prior to pickle protocol 4, so I wondered if BTrees might work better. Because of this they didn't, and I didn't want to roll a mini-persistent object system, so I went a different direction.)