-
Notifications
You must be signed in to change notification settings - Fork 375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Python Stable ABI for the bindings #2674
Conversation
These will not be used with limited API.
This creates a type object and adds it to the module. (We need to do this to avoid compile-time static type objects, whose memory layout changes between Python versions.)
Commit generated with this semantic patch (see https://coccinelle.lip6.fr ): ``` @@ identifier T; fresh identifier S = T ## "_Spec"; type PO; expression name; @@ - Py_INCREF(&T); - PyModule_AddObject(m, name, (PO) &T); + if (!initAndAddType(m, &T, &S, name)) { + return 0; + } ``` applied using: spatch --sp-file patch.cocci --dir python/ --in-place
The Python type struct changes between versions, and thus direct access to its fields is not allowed in Stable ABI. The members can, however, be retrieved using `PyType_GetSlot`. Commit generated with this semantic patch (see https://coccinelle.lip6.fr ): ``` @@ identifier s; type t; @@ - Py_TYPE(s)->tp_free((t*) s); + Py_TYPE(s)->tp_free(s); @@ @@ - Py_TYPE(s)->tp_free(s); + freefunc free = PyType_GetSlot(Py_TYPE(s), Py_tp_free); + free(s); ``` applied using: spatch --sp-file patch.cocci --dir python/ --in-place
The Python type struct changes between versions, and thus direct access to its fields is not allowed in Stable ABI. The members can, however, be retrieved using `PyType_GetSlot`. Commit generated with this semantic patch (see https://coccinelle.lip6.fr ): ``` @@ identifier subtype; type t; expression size; identifier var; @@ - t var = (t)subtype->tp_alloc(subtype, size); + allocfunc subtype_alloc = (allocfunc)PyType_GetSlot(subtype, Py_tp_alloc); + t var = (t)subtype_alloc(subtype, size); @@ identifier subtype; type t; expression size; identifier var; @@ - var = (t)subtype->tp_alloc(subtype, size); + allocfunc subtype_alloc = (allocfunc)PyType_GetSlot(subtype, Py_tp_alloc); + var = (t)subtype_alloc(subtype, size); ``` applied using: spatch --sp-file patch.cocci --dir python/ --in-place
In the stable ABI Python's type objects aren't defined in structs that reference other structs, but by an array of "slots" from which type objects are created. (This commit was generated semi-automatically. The automation is too messy and bespoke to share widely, but if you're interested in undocumented work in progress, look in https://github.com/encukou/api-limiter/tree/rpm-mess )
The previous commit removed all *_Type variables. This adds them back, but as *pointers* to Python type objects rather than type objects themselves. All usages now need an extra pointer dereference (or removing an address-of). (The compiler will complain of type safety if we miss a spot, so I'm OK using the original names.) Also, add header declarations for the Spec structs.
The macros are not part of the limited ABI. The functions check for type and out-of-bounds arguments, but as far as I know the slowdown is not measurable outside micro-benchmarks. But, the checking means that the functions can fail with an exception, and nominally need error handling. That's left for another commit. Another difference is that if the collection already contains an item at the given position, the SetItem functions discard a reference to it, while the SET_ITEM macros don't. I've checked that the changed cases are used to fill brand-new lists/tuples, so this doesn't apply. Commit generated with this semantic patch (see https://coccinelle.lip6.fr ): ``` @@ expression C, N, O; @@ - PyTuple_SET_ITEM(C, N, O) + PyTuple_SetItem(C, N, O) @@ expression C, N, O; @@ - PyList_SET_ITEM(C, N, O) + PyList_SetItem(C, N, O) ``` applied using: spatch --sp-file patch.cocci --dir python/ --in-place
Each call to a Python API (or any function returning PyObject*, really) should have error handling.
We store pointers to our Python type objects in global variables, which would get clobbered if the initialization code could run several times. Explicitly disallow that. This means the extension cannot be unloaded and reloaded, nor used in multiple Python interpreters. The limitation could be lifted in the future.
Some of rpm's types can't be instantiated by calling the class. The flag for this was only added in Python 3.10. Replace it by a tp_new slot with the same effect: raising an exception.
This is fine, we can work in subsequent PRs to handle it internally for older CMake versions. |
I think, if we go for the stable ABI we should then stick to it too, meaning simply cut off support for versions where not available. Requiring Python >= 3.7 is not a problem, I'm just curious as to what makes that particular version special. Cmake >= 3.26 is too new for us to require though, but this shouldn't be hard to do manually. Looking at the cmake's FindPython module, something like |
@pmatilai I would just vendor/backport the FindPython module and conditionally use it if CMake is too old. |
I did look at the FindPython module, and to me that looks just the kind of gibberish we left autotools behind for. No thanks. I'm fine importing it as long as it's somebody elses headache but that's the end of that. |
Well, I guess then we should hope for my CentOS Stream MR to be merged? 😅 |
Nah. It ain't that hard really. And, we can't go and require the latest-and-greatest of a core dependency just because one distro updates their cmake. |
As it stands, we gracefully fall back to the current behavior anyway, so I don't think the CMake stuff is a reason to hold up this PR. |
See my earlier comment, I don't WANT it to fall back to current behavior because then you never really know whether a patch is SABI compliant or not because your setup may depend on these factors. |
Here's what's different for stable ABI builds:
So, yeah, adding |
Would this work?
(I haven't tested, I'm still not sure what should be done) |
I think that's still unnecessarily complicated 😅 In a case like this using cmake's integration doesn't help because it's only more code and fu to support. I don't really want to see a single option related to this because those are just extra baggage that can break and need testing. Just use the manual cmake thing always, unconditionally, and update the Python requirement in INSTALL to 3.7. We'll switch to the cmake integrated Development.SABIModule version when we some day bump our cmake version to one that supports it. Apologies for not clearly stating this earlier. |
The cmake nits aside, I briefly skimmed through this and seems like a lot of fairly mechanical transformations (even saw a mention of coccinelle in there), and nothing in the "OMG we can't do that" department. So just trim out the extra complications from the cmake department and I'll be happy to merge. |
I'd prefer the "Fix comment style" commit to be merged into the previous commit that introduces the comment, but other than that looks fine to me. Unless you have something else you're planning to work on here, just flag ready for review and I'll merge, we already did the review part here. |
The flag was added in Python 3.10. If the current Python doesn't have it, rpm's type objects (not necessarily instances of the types) will be mutable. This allows the user to shoot themselves in the foot, but it's not a big deal.
I did that, and I also updated the version in the I didn't test this build config with all the Python versions, and I won't get to that next week, but will do it the week of Oct 16. I find any issues, I can open a fixup PR. |
Nice, thanks! And yup, if it breaks, we'll find out sooner or later 😄 It's passing with a modern version, the older versions are diminishling important. |
These are the changes needed to use Python's stable ABI. This allows the extension to be used without recompilation with multiple versions of Python (from 3.7 until an ABI break).
Hack to make this easier:
There is no support for installing for, nor testing with, multiple versions of Python at once. That's left up to distributors for now. (AFAIK it's a bit of uncharted territorry for projects that aren't Python-first.)
The individual commits carry explanations for each kind of change, but the in-between commits won't necessarily build. (i.e. you'd need a Git merge commit to preserve both messages and bisectability.)
I am asking for help with the build system. As it is (in the last commit), it's a bit chaotic:
cmake -DENABLE_PYTHON_SABI=OFF
(e.g. to allow an older Python version)ENABLE_PYTHON_SABI=OFF
, stable ABI is not usedIf that's fine, I can put it in the build documentation.
If it would be better to build stable ABI even with older CMake, it shouldn't be hard to set the flag manually instead of using CMake's integration.
I've built and ran the tests with Python 3.6-3.12. Sadly I don't have easy access to 3.5 and below.
Fixes: #2345