BTree "the bucket being iterated changed size" #118
With respect to the "lost" buckets, I just did: ...
Matthew Christopher wrote at 2020-1-10 14:58 -0800:
...
>From what I can tell, 0 size buckets in the "middle" of the bucket list is not expected and is what triggers the exception I'm getting. Additionally if you look closely, it seems like a large swath of buckets is lost, because there should be a relatively even distribution of GUIDs as keys (and I see that from first digit 0-4 and first digit 9-f), but first digit 5-8 is missing.
**My questions are:**
1. Why doesn't `BTrees.check.check` notice this?
Likely an omission.
2. What could cause this type of corruption?
Your report reminds me of an earlier report. That, too, reported
a `BTrees` corruption - involving concurrency.
I suggested to use `dm.historical` to find which transaction
introduced the corruption (Zope 2 usually writes sufficient
transaction metadata to get a feeling what BTrees operations could
have been performed). Of course, this requires that the corruption
was introduced after the last (effective) pack time.
@d-maurer - Yes, after the previous issue we had, we significantly improved our locking. All BTree writes are now done under a lock - so it doesn't seem as likely that this is a simple corruption issue due to multi-threading (at least not with writes). We also did observe that the previous issue, where the BTree would show us a key (in say, `for item in tree`) but be unable to access the value of that key when doing `tree[key]`, seems to have been mitigated by locking.
There are a few possibilities we are wondering about in the same space though:
1. We call `zodb.pack()` periodically, and that is not currently under a lock. Is it possible that we're introducing some sort of corruption by packing? My read of the `pack` documentation suggests that's probably unlikely, but it's one avenue I'm investigating.
2. Only writes are holding the lock right now, with the assumption that a write + read happening in parallel wouldn't corrupt the BTree - is that assumption flawed? Should we lock all reads and writes?
Also, from the previous issue you had suggested:
> If multiple threads need access to the same persistent object, each one must get its own local copy by loading it (directly or indirectly) from a local (i.e. thread specific) ZODB connection.
Our application uses `async/await` as its primary form of asynchrony - which actually means really there's only one thread in play (the one running the asyncio loop). That thread is the only thread with a `Connection` at all (IIRC by default connections are associated to the thread they're opened on) and so I don't think there's any conflict resolution going on at that level. I will definitely improve our logging to detect this case as well (...).
So you must be using coarse-grained cross-process locking? Because writes can occur from any process at any time. That seems very expensive and counter-productive. Or you just have a single process...
Sorry, but if the application is sharing Persistent object instances across concurrency contexts, it is unbelievably architecturally broken. That is to say, any object instance produced by a ... Perhaps I've misunderstood, and there are indeed extra guarantees in place in this architecture to ensure that the obvious errors of mixing concurrency and ...
Matthew Christopher wrote at 2020-1-13 13:59 -0800:
@d-maurer - Yes, after the previous issue we had, we significantly improved our locking. All BTree writes are now done under a lock - so it doesn't seem as likely that this is a simple corruption issue due to multi-threading (at least not with writes).
The ZODB was designed to be used without explicit locking.
Thus, if used correctly, locking should not be necessary.
We also _did_ observe that the previous issue, where the BTree would show us a key (in say, `for item in tree`) but be unable to access the value of that key when doing `tree[key]` seems to have been mitigated by locking.
There are a few possibilities we are wondering about in the same space though:
1. We call `zodb.pack()` periodically, and that is _not_ currently under a lock. Is it possible that we're introducing some sort of corruption by packing? My read of the `pack` documentation suggests that's probably unlikely but it's one avenue I'm investigating.
Packing should not introduce corruptions. If anything, it may only
lose whole transactions; it cannot corrupt the internal
state of persistent objects.
2. Only writes are holding the lock right now, with the assumption that a write + read happening in parallel wouldn't corrupt the BTree - is that assumption flawed? Should we lock all reads and writes?
As indicated above, no locks should be necessary.
Also, from the previous issue you had suggested:
> If multiple threads need access to the same persistent object, each one must get its own local copy by loading it (directly or indirectly) from a local (i.e. thread specific) ZODB connection.
Our application uses `async/await` as its primary form of asynchrony - which actually means really there's only one thread in play (the one running the asyncio loop). That thread is the only thread with a `Connection` at all (IIRC by default connections are associated to the thread they're opened on) and so I don't think there's any conflict resolution going on at that level.
What I have formulated above for threads is actually a requirement
for concurrent activities controlling a transaction.
In the standard setup, a transaction is bound to a controlling thread.
In your case, however, your coroutines are likely the concurrent
activities controlling the transactions and must fulfill the
requirement.
If the requirement is not fulfilled, you may lose important
transaction features: effects of different concurrent
activities may get committed together, leading to an inconsistent
state. Another effect may be that modifications by one
concurrent activity are wiped out by an "abort" from a
different concurrent activity, while the first activity
then makes further modifications that are only valid if its former
modifications were effective - again leading to inconsistent state.
The inconsistencies mentioned above are application level inconsistencies.
I do not think that not fulfilling the requirement can cause
data structure level inconsistencies. **HOWEVER** I have seen
(very rare) cases where **READING** (!) a BTree structure
from a foreign thread has caused a SIGSEGV: apparently, some (internal)
"BTree" operations are performed without holding the GIL and interfering
reads can be disastrous.
However, coroutine concurrency should not introduce such behaviour (as
it does not interrupt internal BTree operations).
@jamadden asked:
> So you must be using coarse-grained cross-process locking? Because writes can occur from any process at any time. ... Or you just have a single process...
There's only a single process, a single asyncio event loop, and a single active ...
asyncio only provides cooperative yielding/scheduling. Are you saying that the following is not allowed: ...
Provided when you say "Thread" here you mean Python thread (and not some lower level primitive or logical concept), we aren't violating this - asyncio also only ever runs coroutines on a single Python thread. Interruptions can only occur when an `await` is reached.
@d-maurer says: ...
These are good points - as you say though these are application level inconsistencies which we manage: ...
Taking a step back - the entire point of doing what we're doing is basically because opening/closing `Connection`s frequently we observed to be expensive.
Matthew Christopher wrote at 2020-1-14 09:08 -0800:
...
Taking a step back - the entire point of doing what we're doing is basically because opening/closing `Connection`s frequently we observed to be expensive.
I am using the ZODB in the context of Zope. In this context, a
connection (at least one, sometimes several) is opened/closed
for each request. I have not observed that this causes a significant
overhead for request processing.
Note that "ZODB.DB.DB" uses a connection pool to make opening
new connections efficient (the primary purpose is to reuse
the cache associated with the connection; the `open` itself is
fast anyway). You should see significant degradations only
when you open more concurrent connections than the pool size.
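For illustration, a minimal sketch of the per-request pattern described here; the in-memory storage and the "history" tree name below are assumptions for the example, not taken from the real application:
```
# Illustrative sketch of the per-request pattern described above.
import transaction
from BTrees.OOBTree import OOBTree
from ZODB.DB import DB
from ZODB.MappingStorage import MappingStorage

db = DB(MappingStorage())          # ZODB.DB.DB keeps a pool of connections

def handle_request(key, value):
    conn = db.open()               # usually served from the pool, so cheap
    try:
        tree = conn.root().setdefault("history", OOBTree())
        tree[key] = value
        transaction.commit()
    except Exception:
        transaction.abort()
        raise
    finally:
        conn.close()               # returns the connection (and its cache) to the pool

handle_request("some-key", "some-value")
```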
...
My understanding was that this pattern, while strange, is not fundamentally against the architecture of ZODB/BTrees - is that not correct?
Reading persistent objects, too, may (or may not) be important for
application level consistency as explained below.
For a transactional system, the notion of "transaction isolation"
is important. This "measures" the degree of transaction independence.
The best is "serial": with this isolation, the execution of a set
of transactions either fails or has the same effect as some serial
(rather than concurrent) execution of the transactions.
The ZODB implements a weaker transaction isolation: so called
"snapshot isolation". In this model, a transaction sees the state
when the transaction started. Conflicts introduced by concurrent
modifications to individual objects are recognized during commit
and cause some affected transaction[s] to be aborted.
If you have two coroutines C1 and C2 with:
C1:
...
... read persistent object ...
...
... context switch ...
...
... read persistent object ...
and
C2:
...
... write persistent object ...
...
commit
... context switch ...
then "C1" may read inconsistent data. The first "read" reads state
before the commit; the second after the commit.
Such inconsistent reads may or may not be a problem for your
application. In no case should they be responsible for the physical
corruption you have observed.
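For illustration, a hypothetical asyncio rendering of the C1/C2 scenario above (in-memory storage; all names are made up):
```
# Both coroutines share one connection and the default thread-bound transaction,
# so C1 sees state from before and after C2's commit within what it considers a
# single transaction.
import asyncio
import transaction
from ZODB.DB import DB
from ZODB.MappingStorage import MappingStorage

db = DB(MappingStorage())
conn = db.open()
root = conn.root()
root["counter"] = 0
transaction.commit()

async def c1():
    first = root["counter"]          # read before C2's commit
    await asyncio.sleep(0)           # context switch to C2
    second = root["counter"]         # read after C2's commit
    print("C1 saw:", first, second)  # 0 then 1 -- an inconsistent view

async def c2():
    root["counter"] += 1
    transaction.commit()             # commits on the shared thread-bound transaction

async def main():
    await asyncio.gather(c1(), c2())

asyncio.run(main())
```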
Recently, I have fixed a `BTree` inconsistency in a system of a client.
The system has been operational for about 10 years and this has been
the first detected inconsistency. I tend to think that this might
have been the effect of less than complete reliability: computers
and storage media are highly reliable but nevertheless,
very rarely, errors may happen. If such an error affects
persistent data, its effect will remain...
Maybe, you, too, see the effect of a random error?
In your place, I would fix the corrupt `BTree` and start worrying
only if a similar problem reoccurs within a short to medium timeframe.
We have finally been able to get a somewhat reliable repro of this issue... appreciate your leaving this issue open so long even with no activity. We've found a way to generate load in our application which produces this issue somewhat consistently. I have shared a ZODB file with the full transaction history available (no packs have been done). I followed the advice given in #102 by @d-maurer and used `dm.historical` to walk the tree's history. Here's what I found (caveat that I am not an expert here so I was learning as I went):
I took a deeper look at the difference in history between the last good history entry and the first bad one and what I observed was that we removed keys from 4 buckets between the two entries.
The net of this is that basically we removed 5 keys from 4 buckets, with the key in bucket 159 being the last key. That bucket (oid `b'\x00\x00\x00\x00\x00\x00\r\x91'`) didn't disappear like it was (I think) supposed to. The ZODB containing the BTree in question is [in this zip](https://github.com/zopefoundation/BTrees/files/5132990/db.zip). Here are the details of our application and the situations that this reproduces in:
I have tried to produce a synthetic test that reproduces this issue outside of the context of our application (no PyInstaller, etc), but I haven't been able to do so. I am hoping that somebody can take a look at the btree in question and maybe something jumps out at you about what specifically is going on with this transaction that is suspicious. With that information maybe I could write a test that reproduces the problem without all of the context/cruft that our application brings, since right now the only way I have to reproduce this is a long and somewhat cumbersome process that involves running our entire application for a while, driving particular load at it, and hoping the issue reproduces.
I also just now realized that @jamadden never actually clarified his usage of the word Thread w.r.t. the comments previously - it could be that our application is unbelievably architecturally broken; if that's the case I'd like to know!
Matthew Christopher wrote at 2020-8-26 15:30 -0700:
We have finally been able to get a somewhat reliable repro of this issue... appreciate your leaving this issue open so long even with no activity.
...
- The application is entirely using Python async/await, so as discussed above there is some concurrency in the application but no true parallelism.
This is potentially problematic -- unless you ensure
that each coroutine uses its own ZODB connection
and the transactions are coroutine specific.
The transaction manager used by ZODB connections is by default
thread specific, not coroutine specific. Thus, you would need
to pass a special transaction manager when you open your
ZODB connections.
Failure to ensure this may lead to inconsistent
persistent data. I am not sure whether it can lead to
a physically damaged tree.
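For illustration, a minimal sketch of such a coroutine-specific setup; the storage, the `worker` coroutine and the "history" key are assumptions for the example:
```
# Sketch: one connection and one TransactionManager per coroutine, so commits and
# aborts in one coroutine cannot affect another.
import asyncio
import transaction
from BTrees.OOBTree import OOBTree
from ZODB.DB import DB
from ZODB.MappingStorage import MappingStorage

db = DB(MappingStorage())

async def worker(key, value):
    tm = transaction.TransactionManager()    # coroutine-local transactions
    conn = db.open(transaction_manager=tm)   # coroutine-local connection
    try:
        tree = conn.root().setdefault("history", OOBTree())
        tree[key] = value
        tm.commit()
    except Exception:
        tm.abort()
        raise
    finally:
        conn.close()

async def main():
    await worker("a", 1)
    await worker("b", 2)

asyncio.run(main())
```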
Matthew Christopher wrote at 2020-8-26 15:31 -0700:
I also just now realized that @jamadden never actually clarified his usage of the word Thread w.r.t the comments previously - it could be that our application is unbelievably architecturally broken, if that's the case I'd like to know!
"thread" has the narrow meaning of a "Python thread"
(see the Python documentation for its modules `_thread` and `threading`).
The coroutines associated with "async/await" are not threads
in this sense.
By default, the ZODB targets a multi-thread application
and makes assumptions about how threads are using ZODB connections:
* each thread must use its own ZODB connection -- a single
ZODB connection (and its objects) cannot be shared by different
threads
* a transaction must be controlled by an individual thread --
i.e. the commit in one thread should not cause the commit
of modifications performed in a different thread.
The coroutines associated with "async/await" are similar to threads.
However, they, too, must satisfy the conditions formulated above
for threads if they want to use the ZODB safely.
Especially, they must use a special transaction manager
which binds the transactions to coroutines rather than threads.
Matthew Christopher wrote at 2020-8-26 15:30 -0700:
...
I took a deeper look at the difference in history between the last good history entry and the first bad one and what I observed was that we removed keys from 4 buckets between the two entries.
...
The net of this is that basically we removed 5 keys from 4 buckets, with the key in bucket 159 being the last key. That bucket (oid `b'\x00\x00\x00\x00\x00\x00\r\x91'`) didn't disappear like it was (I think) supposed to.
You are right. The bucket should have been removed from the tree.
Your problem may have a stronger relation to #68 than originally assumed:
even for the good tree (which still can be iterated over)
`btree._check()` reports `AssertionError: Bucket next pointer is damaged`.
Thus, your observed problem (bucket not removed) might be
a consequence of an earlier bug (damaged bucket chain).
The ZODB containing the BTree in question is [in this zip](https://github.com/zopefoundation/BTrees/files/5132990/db.zip).
There are multiple BTrees in the DB: The one with the issue is: `btree = conn.root()['agent.task.history'].history`
I've posted the code I used to gather this data [here](https://gist.github.com/matthchr/3c90b2b13c871b1407fb233e5ee2f6af).
For those who want to join the analysis, I used the following
code:
```
from pprint import pprint as pp
from ZODB.DB import DB
from ZODB.FileStorage import FileStorage
s=FileStorage("agent.db")
db=DB(s)
c=db.open()
o=c.root()['agent.task.history']
o._p_activate()
t=o.__Broken_state__["history"]
from dm.historical import *
b=c[b'\x00\x00\x00\x00\x00\x00\r\x91']
bad = getObjectAt(t, b'\x03\xda\x0bj4\xfd\x99\xee') # no longer iterable
good = getObjectAt(t, b'\x03\xda\x0bj4\xfd\x99\xe0') # still iterable
from BTrees.check import *
top=crack_btree(bad, True)
bb_p = top[2][157] # predecessor of bad bucket
bb = top[2][158] # bad bucket
good._check()
```
This is checked (among others) by the ...
Dieter Maurer wrote at 2020-8-28 08:42 +0200:
Matthew Christopher wrote at 2020-8-26 15:30 -0700:
> ...
Your problem may have a stronger relation to #68 than originally assumed:
even for the good tree (which still can be iterated over)
`btree._check()` reports `AssertionError: Bucket next pointer is damaged`.
Thus, your observed problem (bucket not removed) might be
a consequence of an earlier bug (damaged bucket chain).
I have analysed this further - and indeed, the "bucket not removed"
is likely a direct consequence of the damaged bucket chain:
almost surely, a different bucket got removed (the one in the chain
at the position of the emptied bucket).
I used the following code (following code previously posted):
```
gt=crack_btree(good, True)
for i in range(len(gt[1])):
    if gt[2][i].__getstate__()[1] is not gt[2][i+1]: print(i); break
```
The result was `157`; this is the position before the emptied bucket
which was not removed (as it should have been).
Dieter Maurer wrote at 2020-8-28 09:28 +0200:
Dieter Maurer wrote at 2020-8-28 08:42 +0200:
>Matthew Christopher wrote at 2020-8-26 15:30 -0700:
>> ...
>Your problem may have a stronger relation to #68 than originally assumed:
>even for the good tree (which still can be iterated over)
>`btree._check()` reports `AssertionError: Bucket next pointer is damaged`.
>Thus, your observed problem (bucket not removed) might be
>a consequence of an earlier bug (damaged bucket chain).
I have analysed this further - and indeed, the "bucket not removed"
is likely a direct consequence of the damaged bucket chain:
almost surely, a different bucket got removed (the one in the chain
at the position of the emptied bucket).
I used the following code (following code previously posted):
```
gt=crack_btree(good, True)
for i in range(len(gt[1])):
    if gt[2][i].__getstate__()[1] is not gt[2][i+1]: print(i); break
```
The result was `157`; this is the position before the emptied bucket
which was not removed (as it should have been).
The "damaged bucket chain" was introduced by the following
transaction:
```
{'description': '',
'obj': <BTrees.OOBTree.OOBTree object at 0x7f8b3665e158 oid 0x8 in <Connection at 7f8b366cdeb8>>,
'size': 2961,
'tid': b'\x03\xda\x0bi 2Wf',
'time': DateTime('2020/08/20 01:37:7.546089 GMT+2'),
'user_name': ''}
```
This is history record "5186" in the reversed history
obtained from `dm.historical.generateBTreeHistory(t)`.
The previous transaction (where the bucket chain is still
okay) is:
```
{'description': '',
'obj': <BTrees.OOBTree.OOBTree object at 0x7f8b3665e1e0 oid 0x8 in <Connection at 7f8b366cdf60>>,
'size': 2661,
'tid': b'\x03\xda\x0bi\x02\xa9\xce\xbb',
'time': DateTime('2020/08/20 01:37:0.624213 GMT+2'),
'user_name': ''}
```
In the bad transaction, the bucket following the bucket with
the bad `nextbucket` pointer was emptied, the `nextbucket` pointer
of the previous bucket was correctly updated (and now points
to the bucket following the emptied bucket) **BUT**
the emptied bucket was not removed from the parent btree.
I have looked at the code: responsible is
`BTreeTemplate.c:_BTree_set`. While the bucket chain update
is quite far away from the `data` update, I have not seen
how it could be possible to "forget" the `data` update.
I will see whether I can reproduce the behavior on my linux platform.
I was unable to emulate the application specific modules/classes -- a requirement for the reproduction on a foreign system. @matthchr can you provide such an emulation (or even the true code)?
The following code checks deletion of a bucket:
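(The snippet itself is not preserved in this extract; the following is a hypothetical reconstruction of such a check, with made-up keys and counts.)
```
# Fill an OOBTree with enough keys to create many buckets, delete a contiguous
# run of keys so buckets get emptied, and verify the structure afterwards.
from BTrees.OOBTree import OOBTree
from BTrees.check import check

t = OOBTree()
for i in range(10000):
    t["key-%05d" % i] = i

# Deleting a contiguous run typically empties (and should remove) several buckets.
for i in range(4000, 4100):
    del t["key-%05d" % i]

t._check()   # raises AssertionError if e.g. the bucket chain is damaged
check(t)     # additional structural checks
list(t)      # iteration would fail if an emptied bucket were left behind
```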
No "damaged bucket chain" -- at least not under Linux. |
@d-maurer thanks for the in-depth investigation. I hadn't realized that `_check` did something different than `BTrees.check.check`.
One thing I noticed when walking through the history myself is that my `dm.historical` is showing me far fewer history entries than you seem to be seeing. You mentioned seeing 5k+, but I only see ~500. What version of `dm.historical` are you using?
I ran the code snippet you shared on my Windows environment and didn't see any issue either.
I haven't been able to produce a pared-down example which can emulate this problem, and unfortunately I can't share the actual code in question. Before we get too far, we should revisit this:
We are not using a coroutine specific transaction manager when we open connections - I was (mistakenly?) under the assumption that since we weren't actually multi-threaded, we could just use the default transaction manager. As you mentioned, that defaults to a transaction per-thread - since we only have one thread there is at most a single transaction ongoing at a given time. Can you explain a bit more about the mechanics of how this is problematic? I am failing to understand what the difference is in these two pieces of code from the perspective of ZODB/BTrees:
Obviously this example is overly simplistic because it assumes that execution in an async context is always well-ordered, which it's not. Even with a more complicated case where we're reacting to some outside event (a web request or something), such that there may be multiple calls ongoing and order is not guaranteed, I still am not seeing how that's any different than making a bunch of sequential changes and commits in a non-async context. One issue (which I do understand), is that if we do something like this:
and have the expectation that both part 1 and part 2 will always be committed together, that is not correct (because we awaited in the middle and something else could have committed during that await). I don't understand how this could corrupt the btree though (because as far as ZODB/BTrees is concerned we've just done a bunch of sequential commits). Can you clarify if you think that our lack of coroutine-specific transactions is the cause of this corruption, or if you are just cautioning against that because of possible logic errors?
Matthew Christopher wrote at 2020-8-28 11:37 -0700:
@d-maurer thanks for the in-depth investigation. I hadn't realized that `_check` did something different than `BTrees.check.check`
One thing I noticed when walking through the history myself is that my `dm.historical` is showing me far fewer history entries than you seem to be seeing. You mentioned seeing 5k+, but I only see ~500.
I used `generateBTreeHistory` (which contained a Python 3 incompatibility --
now fixed with new version 2.0.2).
Note that each persistent object has its own history - independent that
of persistent subobjects. `getHistory` (and `generateHistory`)
accesses this history. `generateBTreeHistory` tries to take
the history of all persistent `BTree` and `Bucket` subobjects into
account (but it, too, can miss some history records).
What version of `dm.historical` are you using?
2.0.2 (released today).
I'm using version `2.0.1`, and I think that this version is missing certain history entries. I wrote a simple test where I just committed 2000 transactions (each labeled with a number), and while it had each one of the first 30 or 40, after that it seemed to skip a bunch between each history entry (i.e. one entry would be txn 67, the next 93).
When you made your test with a `BTree`, then initially there is
a single persistent object - and `getHistory` gives you the full
history. When the tree becomes larger, it grows persistent subobjects.
Modifications to those subobjects (alone) are not reflected in
the history of the tree - only modifications of the (top level) tree
structure show up in the tree's history.
`generateBTreeHistory` is there to take modifications of sub tree
structures into account.
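For illustration, a rough sketch of that difference, reusing `c` and `t` from the snippet posted earlier (assumes `dm.historical` >= 2.0.2; the counts are only indicative):
```
# The tree's own history vs. the history including Bucket/BTree subobjects.
from dm.historical import getHistory, generateBTreeHistory

top_level_only = getHistory(t)         # changes to the top-level tree object only
full = list(generateBTreeHistory(t))   # also walks persistent Bucket/BTree subobjects
print(len(top_level_only), len(full))  # roughly ~500 vs. 5000+ entries in this database
```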
> No "damaged bucket chain" -- at least not under Linux.
I ran the code snippet you shared on my Windows environment and didn't see any issue either.
I feared that: `BTrees` comes with extensive tests
which are run on Windows as well. Almost surely, there
is a test which tests bucket deletion.
> I was unable to emulate the application specific modules/classes -- a requirement for the reproduction on a foreign system. @matthchr can you provide such an emulation (or even the true code)?
I haven't been able to produce a paired down example which can emulate this problem, and unfortunately I can't share the actual code in question.
I do not need the real code - just the relevant module/class
structure. Below is the emulation I have tried:
```
from types import ModuleType
m = ModuleType("agent.task.history")
m.task = m.history = m.defs = m
from sys import modules
modules[m.__name__] = modules["agent"] = modules["agent.task"] = modules["agent.task.defs"] = m
from persistent import Persistent
class TaskHistoryInfo(Persistent): pass
TaskHistoryInfo.__module__ = "agent.task.history"
m.TaskHistoryInfo = TaskHistoryInfo
class TaskHistoryData(Persistent): pass
TaskHistoryData.__module__ = "agent.task.history"
m.TaskHistoryData = TaskHistoryData
class TaskKind(object): pass
TaskKind.__module__ = "agent.task.history"
m.TaskKind = TaskKind
```
It is unfortunately not sufficient: trying to copy the last
correct tree gives an `object() takes no arguments`. Likely, the
base class of one of the above classes is wrong.
Before we get too far, we should revisit this:
> This is potentially problematic -- unless you ensure
that each coroutine uses its own ZODB connection
and the transactions are coroutine specific.
Bugs of this kind should not damage the physical structure
of the tree. In your case, the physical structure is damaged:
for some (still unknown) reason the bucket chain was updated
but not the actual tree. Whatever you do (wrong) at
the application level, you should not be able to achieve this.
The code can skip the tree update -- but only in case of errors
and those should result in an exception. Maybe, under some rare
cases (triggered by your tree) an error occurs which (wrongfully)
is not transformed into an exception. Debugging will be necessary
to find this out.
...
Can you explain a bit more about the mechanics of how this is problematic?
An important ZODB aim is to make multi-thread applications (fairly)
safe without the need of explicit locking.
It uses two techniques for this: "transaction"s and local copies.
Each connection has (logically) its own copies of the objects in the ZODB.
Modifications affect the copies, not the persistent data directly.
Only when the transaction is committed are modifications transferred
from the local copies to the persistent data.
When several threads share the same transaction, then the transaction
commit or abort stores or rolls back not only the modifications
of a single thread but of different threads - and usually, the
thread issuing the transaction control does not know about the state
of other threads. As a consequence, modifications may be persisted
(or reverted) in an uncontrolled way.
When a single connection is shared by several threads, they
operate on the same local copies and can interfere with one another.
What I wrote about threads above applies also to coroutines
(or other concurrent activities).
I am failing to understand what the difference is in these two pieces of code from the perspective of ZODB/BTrees:
```
async def test(btree):
await t1(btree)
await t2(btree)
await t3(btree)
transaction.commit()
async def t1(btree):
btree['foo'] = 5
async def t2(btree):
btree['bar'] = 6
async def t3(btree):
btree['baz'] = 7
```
```
def test2(btree):
btree['foo'] = 5
btree['bar'] = 6
btree['baz'] = 7
transaction.commit()
```
Obviously this example is overly simplistic because it assumes that execution in an async context is always well-ordered, which it's not. Even with a more complicated case where we're reacting to some outside event (a web request or something), such that there may be multiple calls ongoing and order is not guaranteed, I still am not seeing how that's any different than making a bunch of sequential changes and commits in a non-async context.
Let's see:
Assume the "await t2(btree)" causes a coroutine switch and the
foreign coroutine calls `transaction.abort()`. Then the effect
of `t1(btree)` will be undone - a thing completely unexpected by `test`.
The ZODB's aim it to ensure that in a single thread/coroutine
you can forget that there are other threads/coroutines.
If the transactions are not thread/coroutine specific, this is
obviously no longer the case.
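For illustration, a hypothetical sketch of that abort scenario (in-memory storage, coroutines sharing the default thread-bound transaction manager):
```
# A foreign abort() also discards the change made in test() before its own commit().
import asyncio
import transaction
from ZODB.DB import DB
from ZODB.MappingStorage import MappingStorage

db = DB(MappingStorage())
conn = db.open()
root = conn.root()

async def foreign():
    transaction.abort()      # rolls back every uncommitted change on this thread

async def test():
    root["foo"] = 5
    await foreign()          # context switch: another coroutine aborts the transaction
    transaction.commit()     # commits nothing useful: the 'foo' change was already undone

asyncio.run(test())
print("foo" in root)         # False -- the modification was silently lost
```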
One issue (which I _do_ understand), is that if we do something like this:
```
def a():
# modify some btree state (part 1)
await <something>
# modify some more btree state (part 2)
transaction.commit()
```
and have the expectation that both part 1 and part 2 will always be committed together, that is not correct (because we awaited in the middle and something else could have committed during that await). I don't understand how this could corrupt the btree though (because as far as ZODB/BTrees is concerned we've just done a bunch of sequential commits).
It should not be possible to corrupt the physical btree structure
(unless a coroutine switch would happen as part of object destruction
-- I believe that this is impossible).
Can you clarify if you think that our lack of coroutine-specific transactions is the cause of this corruption,
I do not think so.
Not having coroutine specific transactions (or sharing connections
between coroutines) may give you
inconsistencies at the application level. It should not be able
to cause physical damage to the btree structure itself
(because there should be no coroutine switch during a single btree
operation, only between btree operations).
Matthew Christopher wrote at 2020-8-28 11:37 -0700:
...
Can you clarify if you think that our lack of coroutine-specific transactions is the cause of this corruption, or if you are just cautioning against that because of possible logic errors?
Your issue is almost surely related to #68
-- and that almost surely has been observed in a standard
multi threaded application.
What you do is not right -- but it is not the cause of the observed
damage.
That's not out of the realm of possibility, though, is it? If the objects you're using as keys wind up switching in their comparison methods (`__eq__`, `__lt__`, ...), ... (I'm far more familiar with gevent than with async/await. This is one place that the explicitness of async/await comes in handy. It's probably tough to arrange for switches in those methods without some trouble and several layers.)
In fact, it's actually a gevent-based application, but it's using almost exclusively the standard-library as monkey-patched by gevent, and greenlets are treated just like threads: one Connection and transaction per request per greenlet. Persistent objects are never shared between "threads" of control without a great deal of care and following some rules (rule #1: Don't do that!) At least they're not supposed to be, so after this conversation I'm wondering if that might have been the root cause of #68.
@d-maurer Thanks for your explanation - I understand now. @jamadden AFAIK it's not possible to actually yield control from a coroutine without an `await`, and I am not aware of a way to use that keyword in a method such as `__eq__`, `__lt__`, etc. Certainly we aren't doing that.
Thanks for the clarification - I will look at making changes to use a connection/txn per-coroutine (I agree that there are clear flaws with what we're doing now, although we're managing the application inconsistencies that may arise from it), but good to hear that it shouldn't be causing corruption.
In that case, I may be able to help. Can you share the full snippet of code that you're trying to execute (with the mock object hierarchy + copy + usage of dm.historical)? If I get that I can probably fix it up and share it back with you if I can get it working.
Yeah, it wouldn't be directly done, it would probably be something buried under layers of function calls. This simple example shows control going to different coroutines while inside of an `__eq__` method:
```
import asyncio
from BTrees import family64

async def coro2():
    print("In coro 2")
    print("Len of bt", len(bt))

async def thing():
    print("-> Enter coro 1")
    await coro2()
    print("<- Exit coro 1")

def do_thing():
    asyncio.run(thing())

class Foo:
    def __eq__(self, other):
        print("In __eq__")
        import traceback; traceback.print_stack()
        do_thing()
        return True
    __lt__ = __eq__

bt = family64.OO.BTree()
bt[Foo()] = 1
bt[Foo()] = 2
```
```
$ python /tmp/atest.py
In __eq__
  File "/tmp/atest.py", line 28, in <module>
    bt[Foo()] = 2
  File "/tmp/atest.py", line 20, in __eq__
    import traceback; traceback.print_stack()
-> Enter coro 1
In coro 2
Len of bt 1
<- Exit coro 1

$ PURE_PYTHON=1 python /tmp/atest.py
In __eq__
  File "/tmp/atest.py", line 28, in <module>
    bt[Foo()] = 2
  File "//lib/python3.8/site-packages/BTrees/_base.py", line 819, in __setitem__
    self._set(self._to_key(key), self._to_value(value))
  File "//lib/python3.8/site-packages/BTrees/_base.py", line 949, in _set
    result = child._set(key, value, ifunset)
  File "//lib/python3.8/site-packages/BTrees/_base.py", line 350, in _set
    index = self._search(key)
  File "//lib/python3.8/site-packages/BTrees/_base.py", line 125, in _search
    if k is key or k == key:
  File "/tmp/atest.py", line 20, in __eq__
    import traceback; traceback.print_stack()
-> Enter coro 1
In coro 2
Len of bt 1
<- Exit coro 1
```
Matthew Christopher wrote at 2020-8-28 15:18 -0700:
...
> I do not need the real code - just the relevant module/class structure.
In that case, I may be able to help. Can you share the full snippet of code that you're trying to execute (with the mock object hierarchy + copy + usage of dm.historical?) If I get that I can probably fix it up and share it back with you if I can get it working
The idea is to copy the (historical) last good tree into
a current connection and then perform the deletion.
If this deletion reproduces the problem, it can be debugged.
Below is the trial code up to the copying.
You can check whether an updated emulation is sufficient
by executing the code outside your normal environment
(to ensure that the real `agent` code is not used)
and verifying that there is no exception.
```
from pprint import pprint as pp
from ZODB.DB import DB
from ZODB.FileStorage import FileStorage
from transaction import commit
from types import ModuleType
m = ModuleType("agent.task.history")
m.task = m.history = m.defs = m
from sys import modules
modules[m.__name__] = modules["agent"] = modules["agent.task"] = modules["agent.task.defs"] = m
from persistent import Persistent
class TaskHistoryInfo(Persistent): pass
TaskHistoryInfo.__module__ = "agent.task.history"
m.TaskHistoryInfo = TaskHistoryInfo
class TaskHistoryData(Persistent): pass
TaskHistoryData.__module__ = "agent.task.history"
m.TaskHistoryData = TaskHistoryData
class TaskKind(object): pass
TaskKind.__module__ = "agent.task.history"
m.TaskKind = TaskKind
s=FileStorage("agent.db")
db=DB(s)
c=db.open()
t=c.root()['agent.task.history'].history
from dm.historical import *
th=generateBTreeHistory(t)
h = list(th); h.reverse()
good = h[5185]["obj"]; bad = h[5186]["obj"]
from tempfile import TemporaryFile
with TemporaryFile() as f:
    good._p_jar.exportFile(good._p_oid, f)
    f.seek(0)
    gc = c.importFile(f)
commit()
```
Jason Madden wrote at 2020-8-28 16:08 -0700:
> AFAIK it's not possible to actually yield control from a coroutine without an await, and I am not aware of a way to use that keyword in a method such as __eq__, __lt__, etc. Certainly we aren't doing that.
Yeah, it wouldn't be directly done, it would probably be something buried under layers of function calls. This simple example shows control going to different coroutines while inside of an `__eq__` method.
I do not think that anyone will run the event loop nested in
a special method. As for "normal" (`async`) coroutines, they need to be
specially compiled. I believe this rules out their being called/awaited
indirectly from a C library not specially equipped to support this
(such as `BTrees`): it would mean that the C runtime stack would need
to be saved and later restored.
Jason Madden wrote at 2020-8-28 14:23 -0700:
> because there should be no coroutine switch during a single btree operation only between btree operations
That's not out of the realm of possibility, though, is it?
My current working hypothesis:
`_BTree_set` hits an error between the bucket chain
update and the children (aka `data`) update. The error
causes the children update to be skipped. For some
(unknown) reason, the error does not cause an exception.
The error might be special for the asynchronous execution environment
(to explain why the problem has not popped up previously).
We know that Python suppresses some exceptions (e.g.
exceptions in `__del__`); maybe our observation is related
to exception suppression.
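For illustration, a small sketch of that suppression:
```
# An exception raised in __del__ is only reported to stderr
# ("Exception ignored in: ...") and never propagates to the caller.
class Noisy:
    def __del__(self):
        raise ValueError("raised in __del__")

n = Noisy()
del n                                  # the ValueError is swallowed by the interpreter
print("execution continues normally")
```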
Matthew Christopher wrote at 2020-8-28 15:18 -0700:
...
In that case, I may be able to help.
Rethinking, it may be better that you try to reproduce the error --
the big advantage: you can use the real code (no need for an emulation).
Following the code posted previously (without the (incomplete) emulation code),
you copy the last good tree. Comparing the last good and first bad
tree, you determine which key you must delete in the tree copy
to (hopefully) reproduce the damage.
I've fixed up your emulation code, I believe. The main issue was that ... Full code that ran without error for me: ...
I tried investigating how you suggested (diffing the two trees to see what changed). I used the following function: ...
From what I can tell, the 5185 -> 5186 BTree diff consisted of the following change: ...
I tried doing the following, which didn't seem to trigger any errors:
I tried on my Linux box though - I will try on my Windows box too and see if there is any difference.
Matthew Christopher wrote at 2020-9-3 17:01 -0700:
...
I tried doing the following which didn't seem to trigger any errors:
```
with TemporaryFile() as f:
    good._p_jar.exportFile(good._p_oid, f)
    f.seek(0)
    gc = c.importFile(f)
gc._check()
gc.pop('testaccount1batchtest18 22F1F921291938C1$abs-2-job406589010-4 22F1E3E288A8B2FD$job-1$abs-2-job406589010-4-task-182$0:662512005')
gc._check()
```
I tried on my Linux box though - I will try on my Windows box too and see if there is any difference.
When you try on Windows, please use the real code (not the emulation):
it might be that peculiarities of the real code trigger the error
(those would likely be related to destructors).
If you cannot reproduce the problem, then the error does not
depend only on the data; something dynamic must be involved as well --
maybe a `gc` run or a context switch (even though I do not see how
the latter could happen).
I tried on my Windows machine and actually I am getting an error here (I believe on the `list(th)`).
The error is: ...
Any idea why that would be?
Matthew Christopher wrote at 2020-9-4 11:56 -0700:
I tried on my Windows machine and actually I am getting an error here (I believe on the `list(th)`)
```
th=generateBTreeHistory(t)
h = list(th); h.reverse()
```
The error is:
```
C:\work\venv-test\Scripts\python.exe C:/work/src/temp2.py
Traceback (most recent call last):
File "C:\work\venv-test\lib\site-packages\dm\historical\__init__.py", line 60, in generateHistory
if not history: raise StopIteration()
StopIteration
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:/work/src/temp2.py", line 53, in <module>
h = list(th); h.reverse()
File "C:\work\venv-test\lib\site-packages\dm\historical\__init__.py", line 129, in generateBTreeHistory
if a.next() is None: heappop(heap)
File "C:\work\venv-test\lib\site-packages\dm\historical\__init__.py", line 100, in next
try: value = next(self.gen)
RuntimeError: generator raised StopIteration
```
Any idea why that would be?
An incompatibility of `dm.historical` with newer Python versions.
The error does not occur with Python 3.6, it occurs with Python 3.9
(and may have been introduced with a Python version in between).
I have released `dm.historical==2.0.3` which fixes this incompatibility.
Python 3.7 adopted PEP 479, "meaning that StopIteration exceptions raised directly or indirectly in coroutines and generators are transformed into RuntimeError exceptions."
@jamadden has it -- I was on 3.6 on my Linux but 3.7 on Windows. Even with `dm.historical==2.0.3` I am still getting an error with 3.7 - @d-maurer are you sure you've fixed it in 2.0.3?
I reverted to Python 3.6 and everything passed in Windows as well, so it looks like probably no reliable repro... That doesn't necessarily surprise me as this issue has been really hard for us to pin down as well (our first thoughts were things like threading issues, etc)... Are there next steps that we could take to try to pin down what's going on here?
Matthew Christopher wrote at 2020-9-4 14:52 -0700:
@jamadden has it -- I was on 3.6 on my Linux but 3.7 on Windows.
Even with `dm.historical==2.0.3` I am still getting an error with 3.7 - @d-maurer are you sure you've fixed it in 2.0.3?
There has been another occurrence of the same problem. Fixed in 2.0.4.
...
I reverted to Python 3.6 and everything passed in Windows as well, so it looks like probably no reliable repro... That doesn't necessarily surprise me as this issue has been really hard for us to pin down as well (our first thoughts were things like threading issues, etc)...
Are there next steps that we could take to try to pin down what's going on here?
Have you made the Windows trial with the emulation or your real
code? In the first case, you could retry with the real code.
The hypothesis: maybe one of the destructors (they can be executed
inside the `BTrees` C code) does something strange (e.g. access the
tree currently under modification).
If this does not reproduce the problem, then it seems to be truly
dynamic -- which makes the analysis far more difficult.
An analysis approach could come from the following thoughts:
your bucket deletion (implemented by `_BTree_set` in `BTreeTemplate.c`)
consists of two phases: 1. removal from the bucket chain (line 917),
2. removal from the (inner node) parent (line 962ff).
The broken tree analysis indicates that the first phase was successful
but the second phase did not happen (or at least was not made persistent).
This suggests that something unexpected happened between line 917
and line 962. This unexpected could be an error condition or the execution
of (unexpected) application code initiated via the deletion of an object.
The only things deleted in your case between the respective lines
are the empty bucket itself and the key (which in your case
seems to be a string) -- both should not have a destructor in
your case. This would leave the error condition. One could run
the application inside a debugger and put a breakpoint on the error exit
and then try to reproduce the problem. When it occurs, hopefully,
the breakpoint is hit and one can (again hopefully) determine what caused
the error.
I tried both -- our real code doesn't define any special destructors (but I tried it anyway).
I'm investigating two angles right now:
1. Running the application under a debugger and placing some breakpoints when I get a repro.
2. Coming up with a more minimal repro that can be shared.
For 1, is there a document or instructions on performing this process? It would have to be in Windows (the only environment I can get a repro). I am somewhat familiar with windbg, but have never debugged a native Python module loaded by Python before. How can I get the symbols for the native code - are they included in the `.whl` for Windows?
Matthew Christopher wrote at 2020-9-15 17:16 -0700:
...
I'm investigating two angles right now:
1. Running the application under a debugger and placing some breakpoints when I get a repro.
2. Coming up with a more minimal repro that can be shared.
For 1, is there a document or instructions on performing this process?
This depends on the debugger used. I have used `gdb` in the past
for similar purposes. For `gdb`, there is a set of commands
which facilitates the analysis of Python objects from the `C` level
(e.g. print the Python stack trace, a Python object, etc.).
`BTrees` debugging is particularly difficult due to the extensive use
of (`C` preprocessor) macros. I tackled this problem in the past
by running the preprocessor standalone, removing the line information
introduced by the preprocessor, and using the result as a new source file.
This ensures that you can put a breakpoint at a precise place
(your debugger might not need this kind of preprocessing).
It would have to be in Windows (the only environment I can get a repro). I am somewhat familiar with windbg, but have never debugged a native Python module loaded by Python before. How can I get the symbols for the native code - are they included in the `.whl` for Windows?
This is quite straightforward with `gdb` (I do not know `windbg`).
Even with optimizations active, Python is typically compiled to
retain debugging symbols (should this not be the case, you need
to compile Python yourself). Extensions, such as `BTrees`,
are typically compiled as Python was compiled. Thus, they, too,
should have debugging symbols.
With `gdb` (under Linux), I could:
* start Python under the debugger
(or even "attach" a running Python process -- this nowadays
requires special OS level configuration under Linux)
* import the C extension (debugging symbols are made available
at this point, if the extension has them)
* use Ctrl-C to enter `gdb`
* install the necessary breakpoints
* "continue" to return to Python
* run the Python code
We were able to get a minimal repro of this issue and determine why we were only seeing the issue in Windows. You can run the repro for yourself; I've posted the code on GitHub [here](https://github.com/matthchr/btrees-corruption-repro). The problem also repros in Linux. The key details (these are basically contained in the readme of that repo as well): ...
Because there is no issue with the native version of BTrees, we've been able to work around this problem by ensuring that we're using the native version of BTrees.
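For illustration, a possible guard for that work-around (assumes the `OOBTreePy` alias exposed by `BTrees`, which always names the pure-Python class):
```
# Assert that the C implementation of BTrees is actually in use.
from BTrees.OOBTree import OOBTree, OOBTreePy

if OOBTree is OOBTreePy:
    raise RuntimeError("pure-Python BTrees in use (PURE_PYTHON set or C extension missing)")
```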
Matthew Christopher wrote at 2020-11-6 10:37 -0800:
We were able to get a minimal repro of this issue and determine why we were only seeing the issue in Windows. You can run the repro for yourself, I've posted the code on GitHub [here](https://github.com/matthchr/btrees-corruption-repro).
Congratulations!
I was able to isolate (and quickly reproduce) the problem. The code below depends on your ...
This means that the problem is not caused by the deletion operation itself. Instead, the tree structure gets broken via storing/reloading -- almost surely due to a missing ... And indeed: there are 2 missing ...
With these additions, the problem disappears.
The first ...
Versions:
Python 3.6
I have a `BTrees.OOBTree.BTree` in a ZODB which seems to have somehow become corrupted. While our system was running we observed that the data the BTree was returning seemed suspicious (while iterating the tree, items which should have been there were not actually there). We took a snapshot of the tree and manually examined it, and determined that the structure of the BTree seems to have become corrupted.
When we try to iterate the tree (or its keys), we get the error in the issue title: "the bucket being iterated changed size".
Interestingly, as far as I can tell our live service was not getting this exception (but it definitely was also not returning all of the data it should have been) - I'm not sure why the discrepancy between what our live service was seeing with this DB versus what I see when I download it and look through it locally.
`BTrees.check.check` shows no issues. Obviously, I am not modifying the tree while iterating over it in my debugging environment - but when I look deeper into the structure of the tree by iterating the buckets one by one I see this: ...
From what I can tell, 0 size buckets in the "middle" of the bucket list is not expected and is what triggers the exception I'm getting. Additionally if you look closely, it seems like a large swath of buckets is lost, because there should be a relatively even distribution of GUIDs as keys (and I see that from first digit 0-4 and first digit 9-f), but first digit 5-8 is missing.
My questions are:
1. Why doesn't `BTrees.check.check` notice this?
2. What could cause this type of corruption?
It's possible that this is related to #68, although they look sufficiently different to me that I figured I would post a new issue. It is also possible that this only started happening in
`BTrees==4.6.0` as we moved to that version relatively recently (we were on `BTrees==4.5.1` previously) - but given how rare this is occurring we may just not have noticed it until now...