BTree "the bucket being iterated changed size" #118

Closed
matthchr opened this issue Jan 10, 2020 · 40 comments · Fixed by #147

Comments

@matthchr

matthchr commented Jan 10, 2020

Versions:
Python 3.6

BTrees==4.6.0
ZODB==5.5.1

I have a BTrees.OOBTree.BTree in a ZODB which seems to have somehow become corrupted.

While our system was running we observed that the data the BTree was returning seemed suspicious (while iterating the tree, items which should have been there were not actually there). We took a snapshot of the tree and manually examined it, and determined that the structure of the BTree seems to have become corrupted.

When we try to iterate the tree (or its keys):

for key in tree.keys():
    print('key: {}'.format(key))

We get this error:

Traceback (most recent call last):
  File "/home/test.py", line 40, in <module>
    for key in tree.keys():
RuntimeError: the bucket being iterated changed size
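
For context, this RuntimeError is normally a guard against a bucket being mutated while it is being iterated. A minimal sketch of the usual (non-corruption) way to trigger it, assuming enough keys to span multiple buckets:

from BTrees.OOBTree import OOBTree

t = OOBTree({i: str(i) for i in range(1000)})
for key in t.keys():
    # Deleting from the bucket currently being iterated changes its size,
    # which the iterator typically detects on the next step and reports as
    # "RuntimeError: the bucket being iterated changed size".
    del t[key]

In this case, however, the error occurs without any mutation during iteration, which points to structural corruption instead.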

Interestingly, as far as I can tell our live service was not getting this exception (though it definitely was also not returning all of the data it should have been). I'm not sure why there is a discrepancy between what our live service was seeing with this DB and what I see when I download it and look through it locally.

BTrees.check.check shows no issues.

Obviously, I am not modifying the tree while iterating over it in my debugging environment - but when I look deeper into the structure of the tree by iterating the buckets one by one I see this:

Bucket: BTrees.OOBTree.OOBucket([('0231ed5f-6a2a-4675-a75a-103c6818de95', <Data object at 0x7faba5d396d8 oid 0x265b9 in <Connection at 7fabaa263cc0>>), ('03697f69-500b-4f7f-8ff1-b05c7d85b6cc', <Data object at 0x7faba5d39748 oid 0x26480 in <Connection at 7fabaa263cc0>>), ('03ac8297-38bf-4148-bd4e-df43c1e3d08b', <Data object at 0x7faba5d397b8 oid 0x262b5 in <Connection at 7fabaa263cc0>>), ('03d44486-31a2-4ed7-be24-d755f3cbff2a', <Data object at 0x7faba5d39828 oid 0x25fdb in <Connection at 7fabaa263cc0>>)])
... (a bunch more buckets that all look fine)
Bucket: BTrees.OOBTree.OOBucket([('4480e440-2d21-45f8-a909-664a85330df7', <Data object at 0x7faba5d41978 oid 0x264ad in <Connection at 7fabaa263cc0>>), ('452fc9ad-2544-4385-a034-3120a69409e8', <Data object at 0x7faba5d419e8 oid 0x264c4 in <Connection at 7fabaa263cc0>>), ('4554ce23-1ef5-4ff3-bee2-eb9233243448', <Data object at 0x7faba5d41a58 oid 0x2612f in <Connection at 7fabaa263cc0>>), ('45f92417-c80f-481a-b0d7-60110d757591', <Data object at 0x7faba5d41ac8 oid 0x25ccf in <Connection at 7fabaa263cc0>>)])
Bucket: BTrees.OOBTree.OOBucket([])
Bucket: BTrees.OOBTree.OOBucket([])
Bucket: BTrees.OOBTree.OOBucket([])
Bucket: BTrees.OOBTree.OOBucket([])
Bucket: BTrees.OOBTree.OOBucket([('91660989-9343-4f72-a076-94c8fc6c155f', <Data object at 0x7faba5ce03c8 oid 0xf769 in <Connection at 7fabaa263cc0>>), ('9176559f-e650-47bf-812b-dc4e75eaf299', <Data object at 0x7faba5ce0438 oid 0x223fa in <Connection at 7fabaa263cc0>>), ('91cd51fc-e9bd-4730-843f-b2553a06e63d', <Data object at 0x7faba5ce04a8 oid 0x22e68 in <Connection at 7fabaa263cc0>>), ('91d1b461-77ba-42b6-930b-326d7b2badbf', <Data object at 0x7faba5ce0518 oid 0x26310 in <Connection at 7fabaa263cc0>>), ('92acce7e-4e15-45c0-b954-2c25f9325731', <Data object at 0x7faba5ce0588 oid 0x221b9 in <Connection at 7fabaa263cc0>>), ('92e5b508-84a1-404b-9692-da2a7696c762', <Data object at 0x7faba5ce05f8 oid 0x22d32 in <Connection at 7fabaa263cc0>>), ... )])
... (more buckets with data here)

From what I can tell, zero-size buckets in the "middle" of the bucket list are not expected, and are what triggers the exception I'm getting. Additionally, if you look closely, it seems like a large swath of buckets has been lost: the keys are GUIDs and should be distributed fairly evenly, and I do see that for first digits 0-4 and 9-f, but first digits 5-8 are missing.
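
For reference, here is roughly how the bucket chain can be walked to spot empty buckets in the middle. This sketch assumes the pure-Python implementation of BTrees (e.g. PURE_PYTHON=1), where _firstbucket and _next are accessible attributes:

def find_empty_buckets(tree):
    """Walk the bucket linked list and return the positions of zero-size buckets."""
    empties = []
    bucket = tree._firstbucket
    position = 0
    while bucket is not None:
        if len(bucket) == 0:
            empties.append(position)
        bucket = bucket._next
        position += 1
    return empties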

My questions are:

  1. Why doesn't BTrees.check.check notice this?
  2. What could cause this type of corruption?

It's possible that this is related to #68, although they look sufficiently different to me that I figured I would post a new issue. It is also possible that this only started happening in BTrees==4.6.0, as we moved to that version relatively recently (we were on BTrees==4.5.1 previously) - but given how rarely this occurs, we may just not have noticed it until now...

@matthchr
Author

With respect to the "lost" buckets, I just did:
r = tree['747b6e77-3fe6-4155-afe6-b217505f4215'] (which was one of the lost keys) and somehow that actually worked. I can't seem to see/find the bucket where this key is stored, yet if I look it up by its specific key the BTree is able to get me the value. Not sure if that adds clarity or confuses things further...

@d-maurer
Contributor

d-maurer commented Jan 11, 2020 via email

@matthchr
Author

@d-maurer - Yes, after the previous issue we had, we significantly improved our locking. All BTree writes are now done under a lock - so it doesn't seem as likely that this is a simple corruption issue due to multi-threading (at least not with writes).

We also observed that the previous issue, where the BTree would show us a key (in, say, for item in tree) but be unable to access that key's value via tree[key], seems to have been mitigated by locking.

There are a few possibilities we are wondering about in the same space though:

  1. We call zodb.pack() periodically, and that is not currently under a lock. Is it possible that we're introducing some sort of corruption by packing? My read of the pack documentation suggests that's probably unlikely but it's one avenue I'm investigating.
  2. Only writes are holding the lock right now, with the assumption that a write + read happening in parallel wouldn't corrupt the BTree - is that assumption flawed? Should we lock all reads and writes?

Also, from the previous issue you had suggested:

If multiple threads need access to the same persistent object, each one must get its own local copy by loading it (directly or indirectly) from a local (i.e. thread specific) ZODB connection.

Our application uses async/await as its primary form of asynchrony - which means there's really only one thread in play (the one running the asyncio loop). That thread is the only thread with a Connection at all (IIRC by default connections are associated with the thread they're opened on), so I don't think there's any conflict resolution going on at that level.

I will definitely improve our logging to detect this case as well (check missed it, so we'll need some specific tests for this I think). Am I correct in assuming that there should be no empty buckets in the bucket linked list?

@jamadden
Member

we significantly improved our locking

So you must be using coarse-grained cross-process locking? Because writes can occur from any process at any time. That seems very expensive and counter-productive. Or perhaps you just have a single process...

Our application uses async/await as its primary form of asynchrony - which means there's really only one thread in play (the one running the asyncio loop). That thread is the only thread with a Connection at all (IIRC by default connections are associated with the thread they're opened on), so I don't think there's any conflict resolution going on at that level.

Sorry, but if the application is sharing Persistent object instances across concurrency contexts, it is unbelievably architecturally broken. That is to say, any object instance produced by a Connection must only ever be used by a single "thread" at a time (as must the Connection). Introducing any chance of doing otherwise is just not anticipated by ZODB.

Perhaps I've misunderstood, and there are indeed extra guarantees in place in this architecture to ensure that the obvious errors of mixing concurrency and Connection don't happen. If that's the case, please forgive me.

@d-maurer
Contributor

d-maurer commented Jan 14, 2020 via email

@matthchr
Author

@jamadden asked:

So you must be using coarse-grained cross-process locking? Because writes can occur from any process at any time. That seems very expensive and counter-productive. Or perhaps you just have a single process...

There's only a single process, a single asyncio event loop, and a single active Connection. Locks are just in-memory locks.

Sorry, but if the application is sharing Persistent object instances across concurrency contexts, it is unbelievably architecturally broken. That is to say, any object instance produced by a Connection must only ever be used by a single "thread" at a time (as must the Connection). Introducing any chance of doing otherwise is just not anticipated by ZODB.

asyncio only provides cooperative yielding/scheduling. Are you saying that the following is not allowed:

async def test_async(zodb_connection):
    await some_async_operation()
    zodb_connection.root()['a'].persistent_thing.x = 1
    transaction.commit()

async def test_async_2(zodb_connection):
    await some_async_operation()
    zodb_connection.root()['a'].persistent_thing.x = 2
    transaction.commit()

# Example usage
await test_async(conn)
...
await test_async_2(conn)

That is to say, any object instance produced by a Connection must only ever be used by a single "thread" at a time (as must the Connection).

Provided that when you say "thread" here you mean a Python thread (and not some lower-level primitive or logical concept), we aren't violating this - asyncio only ever runs coroutines on a single Python thread. Interruptions can only occur when an await is reached, and the interrupting coroutine also runs on the same thread.

@d-maurer says:

If the requirement is not fulfilled, you may lose important transaction features, as effects of different concurrent activities may get committed together, leading to an inconsistent state. Another effect may be that some modifications by one concurrent activity are wiped out by an "abort" from a
different concurrent activity, while the first activity makes further modifications that are only valid if its former modifications were effective - again leading to an inconsistent state.

These are good points - as you say, though, these are application-level inconsistencies, which we manage:

  1. We don't ever abort currently
  2. Modification to any persistent object is accompanied by a transaction.commit() in the coroutine which made the modification
  3. There aren't any awaits between the modification of a persistent object and the transaction being committed (it's done within a with block)

Taking a step back: the entire point of doing what we're doing is that we observed opening/closing Connections frequently to be expensive. It's not tenable for our application to open a fresh connection for every single coroutine that may make a modification, given the overall performance impact of doing so. Instead, we open a connection, do many transactions/modifications, and then close the connection (which I believe triggers persistence to disk) at a defined time.
My understanding was that this pattern, while strange, is not fundamentally against the architecture of ZODB/BTrees - is that not correct?

@d-maurer
Contributor

d-maurer commented Jan 14, 2020 via email

@matthchr
Author

We have finally been able to get a somewhat reliable repro of this issue... I appreciate your leaving this issue open so long even with no activity.

We've found a way to generate load in our application which produces this issue somewhat consistently. I have shared a ZODB file with the full transaction history available (no packs have been done). I followed the advice given in #102 by @d-maurer and used dm.historical to look through the transaction history for the BTree in question.

Here's what I found (caveat that I am not an expert here so I was learning as I went):

  • The issue was hit in this BTree around 2020-08-19 23:38:53
  • The last good history entry (from dm.historical) was at 2020/08/19 23:38:10.622771 and had _p_estimated_size: 32192
  • The first "bad" entry (from dm.historical) was at 2020/08/19 23:38:12.419679 and had _p_estimated_size: 32000
  • The bucket that becomes corrupt has oid b'\x00\x00\x00\x00\x00\x00\r\x91'

I took a deeper look at the difference in history between the last good history entry and the first bad one and what I observed was that we removed keys from 4 buckets between the two entries.
Specifically these keys:

bucket 118 differs
1 keys only in lhs {'testaccount1batchtest18 22F1F921291938C1$abs-2-job406589010-2 22F1E3E288AA12F8$job-1$abs-2-job406589010-2-task-63$0:662512081'}
bucket 119 differs
1 keys only in lhs ({'testaccount1batchtest18 22F1F921291938C1$abs-2-job406589010-2 22F1E3E288AA12F8$job-1$abs-2-job406589010-2-task-67$0:662512081'})
bucket 159 (this is the bucket that becomes corrupt) differs
1 keys only in lhs ({'testaccount1batchtest18 22F1F921291938C1$abs-2-job406589010-4 22F1E3E288A8B2FD$job-1$abs-2-job406589010-4-task-205$0:662512051'})
bucket 160 differs
2 keys only in lhs ({'testaccount1batchtest18 22F1F921291938C1$abs-2-job406589010-4 22F1E3E288A8B2FD$job-1$abs-2-job406589010-4-task-235$0:662512051', 'testaccount1batchtest18 22F1F921291938C1$abs-2-job406589010-4 22F1E3E288A8B2FD$job-1$abs-2-job406589010-4-task-234$0:662512051'})

The net of this is that we removed 5 keys from 4 buckets, with the key removed from bucket 159 being the last key in that bucket. That bucket (oid b'\x00\x00\x00\x00\x00\x00\r\x91') didn't disappear like it was (I think) supposed to.

The ZODB containing the BTree in question is in this zip.
There are multiple BTrees in the DB: The one with the issue is: btree = conn.root()['agent.task.history'].history
I've posted the code I used to gather this data here.

Here are the details of our application and the situations that this reproduces in:

  • Python 3.7.6
  • Packaged using PyInstaller
  • ZODB is local for each instance of the application (no cross-process or anything). So basically only a single exe is interacting with the db ever, we're purely using ZODB/Btrees for local state persistence (i.e. if the process/machine restarts).
  • The application is entirely using Python async/await, so as discussed above there is some concurrency in the application but no true parallelism.
  • I've only ever seen repros of this problem in Windows.

I have tried to produce a synthetic test that reproduces this issue outside of the context of our application (no PyInstaller, etc.), but I haven't been able to do so. I am hoping that somebody can take a look at the BTree in question and that something suspicious about this transaction jumps out. With that information, maybe I could write a test that reproduces the problem without all of the context/cruft that our application brings; right now the only way I have to reproduce this is a long and somewhat cumbersome process that involves running our entire application for a while, driving particular load at it, and hoping the issue reproduces.

@matthchr
Author

I also just now realized that @jamadden never actually clarified his usage of the word "thread" w.r.t. the earlier comments - it could be that our application is unbelievably architecturally broken; if that's the case, I'd like to know!

@d-maurer
Contributor

d-maurer commented Aug 27, 2020 via email

@d-maurer
Contributor

d-maurer commented Aug 27, 2020 via email

@d-maurer
Contributor

d-maurer commented Aug 28, 2020 via email

@d-maurer
Contributor

...
From what I can tell, 0 size buckets in the "middle" of the bucket list is not expected and is what triggers the exception I'm getting. ...
My questions are:

1. Why doesn't `BTrees.check.check` notice this?

This is checked (among other things) by the _check method of BTrees, not by the check function.
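
A rough illustration of the difference (both calls are standard BTrees APIs; the comments summarize what each is meant to cover):

from BTrees.OOBTree import OOBTree
from BTrees.check import check

tree = OOBTree({i: i for i in range(100)})

check(tree)    # module-level sanity checks on keys/values and their ordering
tree._check()  # internal structure checks, including the bucket chain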

@d-maurer
Contributor

d-maurer commented Aug 28, 2020 via email

@d-maurer
Contributor

d-maurer commented Aug 28, 2020 via email

@d-maurer
Contributor

I will see whether I can reproduce the behavior on my linux platform.

I was unable to emulate the application specific modules/classes -- a requirement for the reproduction on a foreign system. @matthchr can you provide such an emulation (or even the true code)?

@d-maurer
Contributor

The following code checks deletion of a bucket:

from BTrees.OOBTree import OOBTree, OOBucket
from copy import deepcopy

b2 = OOBucket(dict.fromkeys(range(20, 22)))
b1 = OOBucket()
b1.__setstate__(((10, None), b2))
b0 = OOBucket()
b0.__setstate__(((0, None), b1))
t = OOBTree()
t.__setstate__(((b0, 10, b1, 20, b2), b0))
t._check()
tc = deepcopy(t)
del tc[10]
tc._check()

No "damaged bucket chain" -- at least not under Linux.

@matthchr
Author

@d-maurer thanks for the in-depth investigation. I hadn't realized that _check did something different from BTrees.check.check.

One thing I noticed when walking through the history myself is that my dm.historical is showing me far fewer history entries than you seem to be seeing. You mentioned seeing 5k+, but I only see ~500. What version of dm.historical are you using? I'm using version 2.0.1, and I think that this version is missing certain history entries. I wrote a simple test where I just committed 2000 transactions (each labeled with a number); while it captured each of the first 30 or 40, after that it seemed to skip a bunch between history entries (i.e. one entry would be txn 67, the next 93).

No "damaged bucket chain" -- at least not under Linux.

I ran the code snippet you shared on my Windows environment and didn't see any issue either.

I was unable to emulate the application specific modules/classes -- a requirement for the reproduction on a foreign system. @matthchr can you provide such an emulation (or even the true code)?

I haven't been able to produce a pared-down example which emulates this problem, and unfortunately I can't share the actual code in question.

Before we get too far, we should revisit this:

This is potentially problematic -- unless you ensure
that each coroutine uses its own ZODB connection
and the transactions are coroutine specific.

The transaction manager used by ZODB connections is
by default thread, not coroutine, specific. Thus, you would need
to pass a special transaction manager when you open your
ZODB connections.

Failure to ensure this may lead to inconsistent
persistent data. I am not sure whether it can lead to
a physically damaged tree.

We are not using a coroutine specific transaction manager when we open connections - I was (mistakenly?) under the assumption that since we weren't actually multi-threaded, we could just use the default transaction manager. As you mentioned, that defaults to a transaction per-thread - since we only have one thread there is at most a single transaction ongoing at a given time.
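
For what it's worth, here is a minimal sketch of what a coroutine-specific setup could look like. transaction.TransactionManager and DB.open(transaction_manager=...) are standard ZODB/transaction APIs; the file name and surrounding structure are just illustrative:

import transaction
from ZODB import DB
from ZODB.FileStorage import FileStorage

db = DB(FileStorage("data.fs"))

async def do_work():
    # Each coroutine gets its own transaction manager and its own connection,
    # so one coroutine's commit/abort cannot interleave with another's.
    tm = transaction.TransactionManager()
    conn = db.open(transaction_manager=tm)
    try:
        root = conn.root()
        # ... modify persistent objects obtained from *this* connection ...
        tm.commit()
    except Exception:
        tm.abort()
        raise
    finally:
        conn.close()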

Can you explain a bit more about the mechanics of how this is problematic? I am failing to understand what the difference is in these two pieces of code from the perspective of ZODB/BTrees:

async def test(btree):
    await t1(btree)
    await t2(btree)
    await t3(btree)
    transaction.commit()


async def t1(btree):
    btree['foo'] = 5


async def t2(btree):
    btree['bar'] = 6


async def t3(btree):
    btree['baz'] = 7


def test2(btree):
    btree['foo'] = 5
    btree['bar'] = 6
    btree['baz'] = 7
    transaction.commit()

Obviously this example is overly simplistic because it assumes that execution in an async context is always well-ordered, which it's not. Even with a more complicated case where we're reacting to some outside event (a web request or something), such that there may be multiple calls ongoing and order is not guaranteed, I still am not seeing how that's any different than making a bunch of sequential changes and commits in a non-async context.

One issue (which I do understand) is that if we do something like this:

async def a():
    # modify some btree state (part 1)
    await some_async_operation()
    # modify some more btree state (part 2)
    transaction.commit()

and have the expectation that both part 1 and part 2 will always be committed together, that expectation is not correct (because we awaited in the middle, and something else could have committed during that await). I don't understand how this could corrupt the btree though, because as far as ZODB/BTrees is concerned we've just done a bunch of sequential commits.

Can you clarify if you think that our lack of coroutine-specific transactions is the cause of this corruption, or if you are just cautioning against that because of possible logic errors?

@d-maurer
Contributor

d-maurer commented Aug 28, 2020 via email

@d-maurer
Contributor

d-maurer commented Aug 28, 2020 via email

@jamadden
Member

because there should be no coroutine switch during a single btree operation, only between btree operations

That's not out of the realm of possibility, though, is it? If the objects you're using as keys wind up switching in their __cmp__ (__eq__, __lt__) methods, then what's supposedly sequential C code, protected by the GIL, is not. Your point about object destruction is also a good one: A __del__ method implemented in Python can do just about anything, so a line of code like btree['a'] = object() could have whatever side-effect that btree['a'].__del__ does; that might include switching. I wonder if we're looking at a case of re-entrancy (a btree method switching out, part way through, and then being called again by another operation).

(I'm far more familiar with gevent than with async/await. This is one place that the explicitness of async/await comes in handy. It's probably tough to arrange for switches in those methods without some trouble and several layers. )

Your issue is almost surely related to #68 -- and that almost surely has been observed in a standard multi-threaded application.

In fact, it's actually a gevent-based application, but it's using almost exclusively the standard-library as monkey-patched by gevent, and greenlets are treated just like threads: one Connection and transaction per request per greenlet. Persistent objects are never shared between "threads" of control without a great deal of care and following some rules (rule #1: Don't do that!) At least they're not supposed to be, so after this conversation I'm wondering if that might have been the root cause of #68.

@matthchr
Author

@d-maurer Thanks for your explanation - I understand now.

@jamadden AFAIK it's not possible to actually yield control from a coroutine without an await, and I am not aware of a way to use that keyword in a method such as __eq__, __lt__, etc. Certainly we aren't doing that.

What you do is not right -- but it is not the cause of the observed damage.

Thanks for the clarification - I will look at making changes to use a connection/txn per-coroutine (I agree that there are clear flaws with what we're doing now, although we're managing the application inconsistencies that may arise from it), but good to hear that it shouldn't be causing corruption.

I do not need the real code - just the relevant module/class structure.

In that case, I may be able to help. Can you share the full snippet of code that you're trying to execute (with the mock object hierarchy + copy + usage of dm.historical)? If I get that, I can probably fix it up and share it back with you once I get it working.

@jamadden
Member

AFAIK it's not possible to actually yield control from a coroutine without an await, and I am not aware of a way to use that keyword in a method such as __eq__, __lt__, etc. Certainly we aren't doing that.

Yeah, it wouldn't be directly done, it would probably be something buried under layers of function calls. This simple example shows control going to different coroutines while inside of an __eq__ method. If that happened during a BTree operation, and one of the switched-to coroutines was also using the BTree, I'd expect problems.

import asyncio
from BTrees import family64

async def coro2():
    print("In coro 2")
    print("Len of bt", len(bt))

async def thing():
    print("-> Enter coro 1")
    await coro2()
    print("<- Exit coro 1")

def do_thing():
    asyncio.run(thing())

class Foo:

    def __eq__(self, other):
        print("In __eq__")
        import traceback; traceback.print_stack()
        do_thing()
        return True

    __lt__ = __eq__

bt = family64.OO.BTree()
bt[Foo()] = 1
bt[Foo()] = 2
$ python /tmp/atest.py
In __eq__
  File "/tmp/atest.py", line 28, in <module>
    bt[Foo()] = 2
  File "/tmp/atest.py", line 20, in __eq__
    import traceback; traceback.print_stack()
-> Enter coro 1
In coro 2
Len of bt 1
<- Exit coro 1
$ PURE_PYTHON=1 python /tmp/atest.py
In __eq__
  File "/tmp/atest.py", line 28, in <module>
    bt[Foo()] = 2
  File "//lib/python3.8/site-packages/BTrees/_base.py", line 819, in __setitem__
    self._set(self._to_key(key), self._to_value(value))
  File "//lib/python3.8/site-packages/BTrees/_base.py", line 949, in _set
    result = child._set(key, value, ifunset)
  File "//lib/python3.8/site-packages/BTrees/_base.py", line 350, in _set
    index = self._search(key)
  File "//lib/python3.8/site-packages/BTrees/_base.py", line 125, in _search
    if k is key or k == key:
  File "/tmp/atest.py", line 20, in __eq__
    import traceback; traceback.print_stack()
-> Enter coro 1
In coro 2
Len of bt 1
<- Exit coro 1

@d-maurer
Contributor

d-maurer commented Aug 29, 2020 via email

@d-maurer
Contributor

d-maurer commented Aug 29, 2020 via email

@d-maurer
Contributor

d-maurer commented Aug 29, 2020 via email

@d-maurer
Contributor

d-maurer commented Aug 31, 2020 via email

@matthchr
Author

matthchr commented Sep 4, 2020

I believe I've fixed up your emulation code. The main issue was that TaskKind is an enum.Enum.

Full code that ran without error for me:

from ZODB.DB import DB
from ZODB.FileStorage import FileStorage
from transaction import commit
import enum

from types import ModuleType
m = ModuleType("agent.task.history")
m.task = m.history = m.defs = m
from sys import modules
modules[m.__name__] = modules["agent"] = modules["agent.task"] = modules["agent.task.defs"] = m
from persistent import Persistent
class TaskHistoryInfo(Persistent): pass

TaskHistoryInfo.__module__ = "agent.task.history"
m.TaskHistoryInfo = TaskHistoryInfo

class TaskHistoryData(Persistent): pass

TaskHistoryData.__module__ = "agent.task.history"
m.TaskHistoryData = TaskHistoryData

class TaskKind(enum.Enum):
    A = 0
    B = 1
    C = 2
    D = 3

TaskKind.__module__ = "agent.task.history"
m.TaskKind = TaskKind

s=FileStorage("agent.db")
db=DB(s)
c=db.open()
t=c.root()['agent.task.history'].history
from dm.historical import *
th=generateBTreeHistory(t)
h = list(th); h.reverse()
good = h[5185]["obj"]; bad = h[5186]["obj"]

from tempfile import TemporaryFile
with TemporaryFile() as f:
    good._p_jar.exportFile(good._p_oid, f)
    f.seek(0)
    gc = c.importFile(f)

commit()

I tried investigating how you suggested (diffing the two trees to see what changed). I used the following function:

def compare_btrees(lhs, rhs):

    lhs_keys = {key for key in lhs}
    rhs_keys = {key for key in rhs}

    keys_only_in_lhs = lhs_keys - rhs_keys
    keys_only_in_rhs = rhs_keys - lhs_keys
    if any(keys_only_in_lhs):
        print('{} keys only in lhs ({})'.format(
            len(keys_only_in_lhs), keys_only_in_lhs))
    if any(keys_only_in_rhs):
        print('{} keys only in rhs ({})'.format(
            len(keys_only_in_rhs), keys_only_in_rhs))

From what I can tell, the 5185 -> 5186 BTree diff consisted of the following change:

1 keys only in lhs ({'testaccount1batchtest18 22F1F921291938C1$abs-2-job406589010-4 22F1E3E288A8B2FD$job-1$abs-2-job406589010-4-task-182$0:662512005'})

I tried doing the following which didn't seem to trigger any errors:

with TemporaryFile() as f:
    good._p_jar.exportFile(good._p_oid, f)
    f.seek(0)
    gc = c.importFile(f)
    gc._check()
    gc.pop('testaccount1batchtest18 22F1F921291938C1$abs-2-job406589010-4 22F1E3E288A8B2FD$job-1$abs-2-job406589010-4-task-182$0:662512005')
    gc._check()

I tried on my Linux box though - I will try on my Windows box too and see if there is any difference.

@d-maurer
Contributor

d-maurer commented Sep 4, 2020 via email

@matthchr
Author

matthchr commented Sep 4, 2020

I tried on my Windows machine and I am actually getting an error here (I believe on the list(th)):

th=generateBTreeHistory(t)
h = list(th); h.reverse()

The error is:

C:\work\venv-test\Scripts\python.exe C:/work/src/temp2.py
Traceback (most recent call last):
  File "C:\work\venv-test\lib\site-packages\dm\historical\__init__.py", line 60, in generateHistory
    if not history: raise StopIteration()
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:/work/src/temp2.py", line 53, in <module>
    h = list(th); h.reverse()
  File "C:\work\venv-test\lib\site-packages\dm\historical\__init__.py", line 129, in generateBTreeHistory
    if a.next() is None: heappop(heap)
  File "C:\work\venv-test\lib\site-packages\dm\historical\__init__.py", line 100, in next
    try: value = next(self.gen)
RuntimeError: generator raised StopIteration

Any idea why that would be?

@d-maurer
Contributor

d-maurer commented Sep 4, 2020 via email

@jamadden
Member

jamadden commented Sep 4, 2020

Any idea why that would be?

An incompatibility of dm.historical with newer Python versions.
The error does not occur with Python 3.6, it occurs with Python 3.9
(and may have been introduced with a Python version in between).

Python 3.7 adopted PEP 479, "meaning that StopIteration exceptions raised directly or indirectly in coroutines and generators are transformed into RuntimeError exceptions."
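
A minimal illustration of the behavior change (not dm.historical's actual code):

def gen():
    yield 1
    raise StopIteration()  # before Python 3.7 this just ended the generator

list(gen())
# Python 3.7+: RuntimeError: generator raised StopIteration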

@matthchr
Author

matthchr commented Sep 4, 2020

@jamadden has it -- I was on 3.6 on my Linux but 3.7 on Windows.

Even with dm.historical==2.0.3 I am still getting an error with 3.7 - @d-maurer are you sure you've fixed it in 2.0.3?

Traceback (most recent call last):
  File "C:\work\venv\lib\site-packages\dm\historical\__init__.py", line 131, in generateBTreeHistory
    raise StopIteration()
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:/work/temp2.py", line 53, in <module>
    h = list(th); h.reverse()
RuntimeError: generator raised StopIteration

Process finished with exit code 1

I reverted to Python 3.6 and everything passed in Windows as well, so it looks like probably no reliable repro... That doesn't necessarily surprise me as this issue has been really hard for us to pin down as well (our first thoughts were things like threading issues, etc)...

Are there next steps that we could take to try to pin down what's going on here?

@d-maurer
Contributor

d-maurer commented Sep 5, 2020 via email

@matthchr
Author

Have you made the Windows trial with the emulation or your real code?

I tried both -- our real code doesn't define any special destructors (but I tried it anyway).
I also tried Python 3.7 with dm.historical 2.0.4 (just to confirm that the updated Python version wasn't somehow causing anything). There was no repro in that case either.

One could run
the application inside a debugger and put a breakpoint on the error exit
and then try to reproduce the problem. When it occurs, hopefully,
the breakpoint is hit and one can (again hopefully) determine what caused
the error.

I'm investigating two angles right now:

  1. Running the application under a debugger and placing some breakpoints when I get a repro.
  2. Coming up with a more minimal repro that can be shared.

For 1, is there a document or instructions on performing this process? It would have to be on Windows (the only environment where I can get a repro). I am somewhat familiar with windbg, but have never debugged a native Python extension module before. How can I get the symbols for the native code - are they included in the .whl for Windows? (I don't think I saw them, but maybe I missed them when I was looking.)

@d-maurer
Contributor

d-maurer commented Sep 16, 2020 via email

@matthchr
Author

matthchr commented Nov 6, 2020

We were able to get a minimal repro of this issue and determine why we were only seeing it on Windows. You can run the repro for yourself; I've posted the code on GitHub here. The problem also repros on Linux.

The key details (this is basically contained in the readme of that repo as well):

  1. We were only seeing the problem on Windows because we were mistakenly using the Python version of BTrees there rather than the native version. The bug only happens in the Python version of BTrees.
  2. The repro happens on the latest version of BTrees (4.7.2).
  3. It takes quite a while to hit the repro (see readme.md for more details). Hopefully you can look at the pattern we're using and spot something that may help improve the speed at which the repro is hit (possibly even write a deterministic test that triggers it).

Because there is no issue with the native version of BTrees we've been able to work around this problem by ensuring that we're using the native version of BTrees.
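
As an aside, one way to verify which implementation is actually in use is to compare the unsuffixed class with its Py-suffixed counterpart; this assumes the usual BTrees convention that the Py-suffixed classes are the pure-Python variants and the unsuffixed names alias the C versions when the extension is available:

from BTrees.OOBTree import OOBTree, OOBTreePy

if OOBTree is OOBTreePy:
    print("pure-Python BTrees in use (C extension unavailable or PURE_PYTHON set)")
else:
    print("C implementation of BTrees in use")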

@d-maurer
Contributor

d-maurer commented Nov 6, 2020 via email

@d-maurer
Contributor

d-maurer commented Nov 6, 2020

I was able to isolate (and quickly reproduce) the problem. The code below depends on your agent.db.

from pprint import pprint as pp
from ZODB.DB import DB
from ZODB.FileStorage import FileStorage
import enum

from types import ModuleType
m = ModuleType("agent.task.history")
m.task = m.history = m.defs = m
from sys import modules
modules[m.__name__] = modules["agent"] = modules["agent.task"] = modules["agent.task.defs"] = m
from persistent import Persistent
class TaskHistoryInfo(Persistent): pass

TaskHistoryInfo.__module__ = "agent.task.history"
m.TaskHistoryInfo = TaskHistoryInfo

class TaskHistoryData(Persistent): pass

TaskHistoryData.__module__ = "agent.task.history"
m.TaskHistoryData = TaskHistoryData

class TaskKind(enum.Enum):
  A = 0
  B = 1
  C = 2
  D = 3

TaskKind.__module__ = "agent.task.history"
m.TaskKind = TaskKind



s=FileStorage("agent.db")
db=DB(s)
c=db.open()
t=c.root()['agent.task.history'].history
from dm.historical import *
from transaction import commit
from tempfile import TemporaryFile

bad = getObjectAt(t, b'\x03\xda\x0bi 2Wf')
good = getObjectAt(t, b'\x03\xda\x0bi\x02\xa9\xce\xbb')
key = 'testaccount1batchtest18 22F1F921291938C1$abs-2-job406589010-4 22F1E3E288A8B2FD$job-1$abs-2-job406589010-4-task-182$0:662512005'
with TemporaryFile() as f:
  good._p_jar.exportFile(good._p_oid, f)
  f.seek(0)
  gc = c.importFile(f)

c.root()["good"] = gc; commit()
gc._check()
del gc[key]; gc._check() # still good
commit()
gc._p_deactivate(); gc._check() # bad

This means that the problem is not caused by the deletion operation itself. Instead, the tree structure gets broken via storing/reloading -- almost surely due to a missing _p_changed=True.

And indeed, there are two missing _p_changed = True assignments:

--- _base.py~	2020-03-11 09:12:06.746867179 +0100
+++ _base.py	2020-11-06 23:56:57.799589569 +0100
@@ -100,6 +100,7 @@
     def _deleteNextBucket(self):
         next = self._next
         if next is not None:
+            self._p_changed = True
             self._next = next._next
 
     def _search(self, key):
@@ -1034,6 +1035,7 @@
                     self._firstbucket = child._next
                     removed_first_bucket = True
             del data[index]
+            self._p_changed = True
 
         return removed_first_bucket, value

With these additions, the problem disappears.
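
For anyone following along: the second hunk matters because data there is a plain Python list attribute of a Persistent object, and in-place mutation of such an attribute is invisible to the persistence machinery unless the object is flagged explicitly. A generic sketch of the pattern (not the BTrees code itself):

from persistent import Persistent

class Node(Persistent):
    def __init__(self):
        self.data = []           # plain list: in-place mutations are not tracked

def remove_item(node, index):
    del node.data[index]         # ZODB does not notice this mutation...
    node._p_changed = True       # ...so the object must be marked changed explicitly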

@d-maurer
Contributor

d-maurer commented Nov 6, 2020

The first _p_changed is actually not necessary: because Bucket derives from Persistent, the attribute assignment to self._next already marks the bucket as changed.
