gh-123672: Clarify the usage of PyGILState* for subinterpreters #123728
Co-authored-by: Bénédikt Tran <[email protected]>
Co-authored-by: Petr Viktorin <[email protected]>
OK, as far as I can see, |
Co-authored-by: Bénédikt Tran <[email protected]>
Keep in mind that you're writing the specification here; basing it on the current implementation is not necessarily the right thing to do. The implementation might be buggy, and some parts of it aren't thought out and tested. Be careful when promising that the API does something. The current docs say “mixing multiple interpreters and the I don't think we can simply allow mixing the two APIs without any change to the code & tests. Would you like to design this functionality, and the API for it? (Careful, I picked up a similar task 10 years ago and it basically took over my life...)
The term we use for “global interpreter” is the “main” interpreter. (And in my experience, any time we make the main interpreter special, except runtime init/teardown, we're creating technical debt.) Activating the “main” interpreter by default is dangerous behaviour: it's only valid in extensions that originally run in the “main” interpreter. And perhaps Multi-phase initialization docs, which currently say
... should specifically note that it's incompatible with |
I'm mostly sure I fixed the situation with PyGILState_Ensure() and subinterpreters for 3.12 by moving us to a thread-local for the current thread state. I didn't think to update the docs. Of course, we should verify my assertion first. (I'll reply further when I have a chance next week; this is a busy week of family stuff for me.) |
Oh! Though, it seems to work right now without issue. Is this something that's sort of an accident? If so, would it be that difficult to stabilize? (I'm guessing we just need to add some tests, and of course document it if that's the case.)
Is there all that much we need to design? The easy solution might just be to force In regards to this PR, maybe we should just document that if you use |
Co-authored-by: Savannah Ostrowski <[email protected]>
OK, so I just talked to Eric and I understand more about the issue now :) The fix for Here's a use case that Eric has not considered, which we need to test and document, and possibly add new API for:
In this case, we must somehow ensure that the IMO, the best way forward is:
|
I played around with this case last week, and it always gives you the main interpreter (including when subinterpreters are active). I think that's worth documenting in this PR. Maybe we should add some tests too?
There is! PyInterpreterState_GetID() What we don't have is an API for getting a
I think we do somewhat support this right now with

```c
typedef struct {
    PyThreadState *tstate;
    PyObject *obj;
} transport;

int
func(transport thing)
{
    PyGILState_Ensure(); // Main interpreter, ready to call Python
    PyThreadState_Swap(thing.tstate);
    // Do something with obj in the subinterpreter
    // ??? PyGILState_Release() or PyThreadState_Clear?
}

int
thread_func(void *whatever)
{
    // Py_NewInterpreterFromConfig and whatnot
    PyObject *obj = Py_Something();
    transport thing = { PyThreadState_Get(), obj };
    call_in_another_thread(func, thing);
}
```

Though, if the thread state is supposed to be a thread local, then I think that would cause problems upon finalization. I'll test it tomorrow. If it doesn't work, then we need an API (I'll make a new issue for that) -- maybe
Ok, I've tested switching interpreters with In regards to this PR, does this look good in terms of clarifying at least what |
You should not document current behaviour. You should document the intended behaviour, once it matches what Python actually does. Activating the main interpreter is an easy way to get an interpreter the user wasn't expecting, leading to crashes. We should not encourage that. |
Wouldn't changing the interpreter selected by |
It would, but that's not relevant here. |
I guess there's not much to do here then. I'll look into designing something better for subinterpreters next week. |
cc @encukou, @ericsnowcurrently Ok, I've put quite a bit of work into developing a proof of concept for a better interpreter-switching API. A quick example:

```c
int
thread_func(void *arg)
{
    PyInterpreterState *subinterp = (PyInterpreterState *) arg;
    PyThreadState *main_tstate;
    PyThreadState *sub_tstate;
    PyInterpreterState_AttachToMain(&main_tstate);
    // We're now in the main interpreter
    PyInterpreterState_Attach(subinterp, &sub_tstate);
    // We're now in the subinterpreter!
    PyInterpreterState_Detach(sub_tstate);
    // Back in the main interpreter again!
    PyInterpreterState_Detach(main_tstate);
    // GIL is not held, current tstate is NULL
}
```

Unfortunately, there's a lot of

```c
int
thread_func(void *arg)
{
    PyInterpreterState *subinterp = (PyInterpreterState *) arg;
    // These can be nested infinitely!
    Py_ENTER_MAIN_INTERPRETER();
    // Do something in the main interpreter...
    Py_ENTER_SUBINTERPRETER(subinterp);
    // Do something in the subinterpreter...
    Py_EXIT_SUBINTERPRETER();
    // Back in the main interpreter again!
    Py_EXIT_MAIN_INTERPRETER();
    // GIL is not held, current tstate is NULL
}
```

The main benefit here is that it's now possible to switch to another thread's interpreter, but you also know exactly what interpreter you're getting, so there are no gotchas if Python happened to have run in a subinterpreter in the current thread. Although, this (now closed) PR doesn't seem like the best place to discuss things. Where should this get moved to? A new issue? DPO? Possibly even Discord? (Or, hopefully not, a PEP.)
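To make the nesting behaviour described above concrete, here is a toy model in plain C that makes no CPython calls. It only tracks which interpreter ID is "current" on a small stack. All names (`interp_stack`, `stack_enter`, `stack_exit`, `stack_current`) are hypothetical and are not taken from the proof of concept, and the fixed `MAX_DEPTH` is a simplification of the unbounded nesting mentioned in the comment.

```c
#include <stddef.h>

#define MAX_DEPTH 16
#define NO_INTERP (-1L)   /* models "current tstate is NULL" */

typedef struct {
    long ids[MAX_DEPTH];  /* interpreter IDs, innermost last */
    size_t depth;
} interp_stack;

/* ID of the innermost attached interpreter, or NO_INTERP if none. */
static long
stack_current(const interp_stack *s)
{
    return s->depth ? s->ids[s->depth - 1] : NO_INTERP;
}

/* Attach to interpreter `id`. Returns 0, or -1 if nested too deeply. */
static int
stack_enter(interp_stack *s, long id)
{
    if (s->depth == MAX_DEPTH) {
        return -1;
    }
    s->ids[s->depth++] = id;
    return 0;
}

/* Detach from the innermost interpreter, which makes the previous one
 * current again. Returns 0, or -1 on an unbalanced exit. */
static int
stack_exit(interp_stack *s)
{
    if (s->depth == 0) {
        return -1;
    }
    s->depth--;
    return 0;
}
```

In this model, entering the main interpreter, then a subinterpreter, then exiting twice walks the "current" ID through main, sub, main, and finally none, mirroring the comments in the macro example.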
Thank you so much for working on this! I'm afraid that this will need a PEP, to explain to anyone who didn't read the discussions. (I admit I'm not caught up, myself.) I'd prefer not treating the main interpreter as special. Extension authors -- stdlib or otherwise -- don't have a good way of knowing if their extension is in the main interpreter or not. If not, what kind of operations would make sense on the main interpreter? It's an environment where no Regarding the PR, I don't see the need for the stack. The
I was thinking that it would be useful if a). you know that subinterpreters are active, so
That was the idea, yeah. I added the stack to address the problem of I'll talk to Eric first on Discord before writing a PEP draft. At the time when I wrote that implementation, I wasn't aware of the |
The main interpreter will always* exist, but might be doing something unrelated -- for example, it might have Things that aren't clear to me, since we're talking about C callback APIs with limited context being passed around: how do you know that a given |
Looking at the source,
Yeah, but I'm not totally sure how we would implement that right now, I think we just have to trust the user that they're doing the right thing. (Maybe we could add an
From my understanding, you basically don't know. But that's unspecific to the C API-- |
I mostly meant exceptions as an example of what could be “wrong” with attaching to an arbitrary interpreter. I don't think we need to fix them, especially if we don't add
In order to trust them, we should agree on and document what the right thing is -- which might then hint at new APIs to add to make it easier.
IMO, that's one reason for the underscore in |
Oh, I've more or less abandoned I think the only thing we need to document is "only use an object if you're certain that it came from this interpreter," right?
I do think it's important that the C API be more lenient than Python for now, though. Subinterpreters need stress testing--that shouldn't only come from the |
That's the only rule that I'm aware of, yes. (It can be safely broken in some cases, but those are best left as implementation details.) One thing I'm worried about is that this API seems to encourage users to store a |
An interesting thought: should we make it possible to statically allocate an interpreter state? That way, we could add some sort of |
I don't think we can make that the only way to create interpreter states; we'd still need to support dynamically allocated state. |
Yeah, that wasn't my plan, most subinterpreters (mainly allocated by |
@encukou, this needs skip news, and backport to both 3.12 and 3.13.
PyGILState_ API for per-GIL subinterpreters #123672
📚 Documentation preview 📚: https://cpython-previews--123728.org.readthedocs.build/