gh-115832: Fix instrumentation version mismatch during interpreter shutdown #115856
…ter shutdown

In python/cpython@0749244d13412d, I introduced a bug into `interpreter_clear()`: it sets `interp->ceval.instrumentation_version` to 0 without making the corresponding change to `tstate->eval_breaker` (which holds a thread-local copy of the version). After this happens, Python code can still run due to object finalizers during a GC, and [this version check in bytecodes.c](https://github.com/python/cpython/blob/4ee6bdfbaa792a3aa93c65c2022a89bd2d1e0894/Python/bytecodes.c#L147-L152) will see a different result than [this one in instrumentation.c](https://github.com/python/cpython/blob/4ee6bdfbaa792a3aa93c65c2022a89bd2d1e0894/Python/instrumentation.c#L894-L895), causing an infinite loop.

The fix itself is straightforward, and is what I should've done in `interpreter_clear()` in the first place: also clear `tstate->eval_breaker` when clearing `interp->ceval.instrumentation_version`. I also restored a comment that I'm not sure why I deleted in the original commit.

To make bugs of this type less likely in the future, I changed `instrumentation.c:global_version()` to read the version from a `PyThreadState*` rather than a `PyInterpreterState*`, so it's reading the version from the same location as the interpreter loop. This had some fan-out effects on its transitive callers, although most of them already had the current tstate available.

- Issue: pythongh-115832
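
The mismatch described above can be modelled in a few lines of C. This is only a toy sketch, not CPython's code: the `Toy*` structs, the `toy_*` functions, and the retry cap are all hypothetical, and the thread-local word is treated as holding nothing but the version. It shows one plausible reading of the loop: the interpreter-loop check compares the code object against the thread's copy, the instrumentation check compares it against the interpreter's copy, and once the two copies disagree the first check keeps asking for work that the second check considers already done.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-ins for the real structs. */
typedef struct { uintptr_t instrumentation_version; } ToyInterp;
typedef struct { ToyInterp *interp; uintptr_t eval_breaker; } ToyThread;
typedef struct { uintptr_t instrumentation_version; } ToyCode;

/* Models the instrumentation-side check: it consults the interpreter's
 * copy of the version. Returns true if the code object was updated. */
static bool toy_instrument(ToyCode *code, const ToyInterp *interp)
{
    if (code->instrumentation_version == interp->instrumentation_version) {
        return false;  /* considered up to date, nothing to do */
    }
    code->instrumentation_version = interp->instrumentation_version;
    return true;
}

/* Models the interpreter-loop check: it consults the thread-local copy
 * and retries until the code object matches it. Returns the number of
 * retries, or -1 if max_retries is reached (the cap stands in for the
 * infinite loop so that the sketch terminates). */
static int toy_resume(const ToyThread *ts, ToyCode *code, int max_retries)
{
    assert(max_retries >= 0);
    int retries = 0;
    while (code->instrumentation_version != ts->eval_breaker) {
        if (retries == max_retries) {
            return -1;
        }
        toy_instrument(code, ts->interp);
        retries++;
    }
    return retries;
}

/* The buggy clear: only the interpreter's copy is reset. */
static void toy_clear_buggy(ToyThread *ts)
{
    ts->interp->instrumentation_version = 0;
}

/* The fixed clear: both copies are reset together. */
static void toy_clear_fixed(ToyThread *ts)
{
    ts->interp->instrumentation_version = 0;
    ts->eval_breaker = 0;
}
```

With both copies at a nonzero version and a code object at version 0, calling `toy_clear_buggy()` and then `toy_resume()` hits the retry cap, while `toy_clear_fixed()` lets `toy_resume()` return immediately.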
I think this one can skip news (since it should've been part of the original commit), but let me know if anyone disagrees.
The test failures look real and related to this change. I'm investigating.
…nce it doesn't have an up-to-date monitoring version
This generally looks good to me, but there are some merge conflicts that need to be resolved.
I am a bit unsure about the switch of `global_version` to use the `PyThreadState`'s `eval_breaker` instead of the interpreter's `instrumentation_version`. The use of `instrumentation_version` seemed a bit more natural as the authoritative source, and a smaller bug fix change seems preferable too.
Would it be possible to add an assertion in `global_version()` that the two are the same? With the GIL, the setting of the thread and interpreter values should appear atomic. With the GIL disabled, I think we'd want a stop-the-world pause when we enable instrumentation.
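
For illustration, the suggested assertion might look roughly like the following. This is a sketch under stated assumptions rather than the code that was merged: the `Sketch*` types, `sketch_global_version()`, and the `SKETCH_EVENTS_*` constants are hypothetical, and it assumes the low bits of the thread-local word carry unrelated flags that have to be masked off before comparing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the low bits of eval_breaker are assumed to hold
 * unrelated flags, with the version stored in the remaining bits. */
#define SKETCH_EVENTS_BITS 8
#define SKETCH_EVENTS_MASK (((uintptr_t)1 << SKETCH_EVENTS_BITS) - 1)

/* Hypothetical stand-ins for the interpreter and thread state. */
typedef struct { uintptr_t instrumentation_version; } SketchInterp;
typedef struct { SketchInterp *interp; uintptr_t eval_breaker; } SketchThread;

/* Keeps the interpreter's copy as the authoritative source, and asserts
 * that the thread-local copy agrees with it. */
static uintptr_t sketch_global_version(const SketchThread *ts)
{
    uintptr_t version = ts->interp->instrumentation_version;
    assert((ts->eval_breaker & ~SKETCH_EVENTS_MASK) == version);
    return version;
}
```

In a debug build, a divergence like the one in `interpreter_clear()` would then trip the assertion at the first call instead of surfacing later as a hang.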
Makes sense, and yeah that should be a straightforward change.
The latest round of merge conflicts was from #116013, but this should be good to go again.
🤖 New build scheduled with the buildbot fleet by @colesbury for commit 7a9c81a 🤖 If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.
…ter shutdown (python#115856)

A previous commit introduced a bug into `interpreter_clear()`: it set `interp->ceval.instrumentation_version` to 0 without making the corresponding change to `tstate->eval_breaker` (which holds a thread-local copy of the version). After this happens, Python code can still run due to object finalizers during a GC, and the version check in bytecodes.c will see a different result than the one in instrumentation.c, causing an infinite loop.

The fix itself is straightforward: clear `tstate->eval_breaker` when clearing `interp->ceval.instrumentation_version`.