-
-
Notifications
You must be signed in to change notification settings - Fork 30.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-126644: Fix various thread safety issues in _interpreters
#126696
gh-126644: Fix various thread safety issues in _interpreters
#126696
Conversation
…running-thread-safety
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these thread-safety issues are not limited to the free threading build. The GIL does not really protect most of these cross-interpreter checks, like _PyInterpreterState_IsRunningMain
, because different subinterpreters can concurrently try to run or destroy each other.
I also think it's better to try to fix these issues robustly rather than make a minimal PR that can be backported to 3.13. More generally, I'm not convinced it's worth the risk to backport non-trivial fixes to issues that solely come up when using the combination of free-threading + subinterpreters.
/* _Py_ReserveTLBCIndex has thread safety issues */ | ||
HEAD_LOCK(runtime); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem right. _PyIndexPool_AllocIndex locks internally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, that's what I thought too. I'm not too sure what's going on here, but this segfaults without this lock.
Modules/_interpretersmodule.c
Outdated
return NULL; | ||
} | ||
|
||
if (_Py_atomic_load_ptr_relaxed(&interp->threads.head) != NULL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't fully understand what you're doing here, but access to the linked list of thread states requires holding HEAD_LOCK()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case is when _PyInterpreterState_PreventMain
was able to set the prevention flag, but there was a thread that was able to set the main thread right before that happened, so we can't destroy the interpreter while that thread is still alive (but no more threads will be able to set main after its done).
Hmm, why? This will hurt PEP 734, because Eric's PyPI package for backporting features is targeted towards 3.13+, not 3.14+. I'm not sure how many people are using 3.13t as their working copy, but I would assume that the people who are using it are also the people who would like to use subinterpreters to get around the GIL as well. |
|
Judging by the failing tests, I see your point :) What do you suggest we do for 3.13? Slap a big
Yeah, I'm pretty sure that the cases I've fixed in this PR are truly fixed, but the test case isn't extensive at all. I'm doing my best to try and cover the bases that aren't thread-safe here, but as far as I can tell, interpreter switching without waiting on the GIL is a gray area. There are things that work right now that shouldn't, and I'm not totally sure why. (I'm not even addressing the C API here, that's probably still a mess--this change is assuming that subinterpreters created by |
@colesbury FWIW, I was being dumb. The exception problem is almost certainly present on the default builds (I said that in the original issue, but not in the description for whatever reason). I'm assuming the same goes for the Apparently, there's some sort of race on both builds when creating thread states that's been screwing up my reproducers (I get hit with a from threading import Thread
import _interpreters
interp = _interpreters.create()
def run():
this_interp = _interpreters.create()
_interpreters.run_string(this_interp, f"import _interpreters; _interpreters.run_string({interp}, '1')")
threads = [Thread(target=run) for _ in range(1000)]
for thread in threads:
thread.start()
for thread in threads:
thread.join() As far as I can tell, moving |
I'm going to split this into smaller PRs so we can address reviews and backporting on a case-by-case basis. |
This PR fixes a few things:
_PyXI_ERR_ALREADY_RUNNING
currently relies on the interpreter state after the thread state has already been detached, so it's possible, at least on the free-threaded build, that another thread stops the interpreter and breaks theFailIfRunningMain
check. This fixes that by just raising the exception manually._interpreters.destroy
doesn't expect another thread to start the interpreter after its initial check, so this adds a flag on the interpreter state that prevents future threads from starting after destruction has started.BINARY_OP
#123926, so this just slaps an uglyHEAD_LOCK
on it for now.It would be nice if we didn't have to throw an exception for everything and instead just make it thread-safe, but that would be too big of a PR to backport to 3.13.
_interpreters
is not thread safe on the free-threaded build #126644