Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Python Stable ABI for the bindings #2674

Merged
merged 13 commits into from
Oct 5, 2023

Conversation

encukou
Copy link
Contributor

@encukou encukou commented Sep 21, 2023

These are the changes needed to use Python's stable ABI. This allows the extension to be used without recompilation with multiple versions of Python (from 3.7 until an ABI break).

Hack to make this easier:

  • type objects are allocated when the module loads, stored in global pointers, and never released
  • the module may only be loaded once per process, so each type is leaked at most once, and the global pointers aren't clobbered.

There is no support for installing for, nor testing with, multiple versions of Python at once. That's left up to distributors for now. (AFAIK it's a bit of uncharted territorry for projects that aren't Python-first.)

The individual commits carry explanations for each kind of change, but the in-between commits won't necessarily build. (i.e. you'd need a Git merge commit to preserve both messages and bisectability.)

I am asking for help with the build system. As it is (in the last commit), it's a bit chaotic:

  • With CMake 3.26+, stable ABI will be used by default.
    • This requires Python 3.7+
    • This can be turned off using cmake -DENABLE_PYTHON_SABI=OFF (e.g. to allow an older Python version)
  • With older CMake, or ENABLE_PYTHON_SABI=OFF, stable ABI is not used

If that's fine, I can put it in the build documentation.
If it would be better to build stable ABI even with older CMake, it shouldn't be hard to set the flag manually instead of using CMake's integration.

I've built and ran the tests with Python 3.6-3.12. Sadly I don't have easy access to 3.5 and below.

Fixes: #2345

These will not be used with limited API.
This creates a type object and adds it to the module.
(We need to do this to avoid compile-time static type objects,
whose memory layout changes between Python versions.)
Commit generated with this semantic patch (see https://coccinelle.lip6.fr ):
```
@@
identifier T;
fresh identifier S = T ## "_Spec";
type PO;
expression name;
@@

-   Py_INCREF(&T);
-   PyModule_AddObject(m, name, (PO) &T);
+   if (!initAndAddType(m, &T, &S, name)) {
+       return 0;
+   }
```

applied using:

    spatch --sp-file patch.cocci --dir python/ --in-place
The Python type struct changes between versions, and thus direct access to
its fields is not allowed in Stable ABI.
The members can, however, be retrieved using `PyType_GetSlot`.

Commit generated with this semantic patch (see https://coccinelle.lip6.fr ):
```
@@
identifier s;
type t;
@@

- Py_TYPE(s)->tp_free((t*) s);
+ Py_TYPE(s)->tp_free(s);

@@
@@

- Py_TYPE(s)->tp_free(s);
+ freefunc free = PyType_GetSlot(Py_TYPE(s), Py_tp_free);
+ free(s);
```

applied using:

    spatch --sp-file patch.cocci --dir python/ --in-place
The Python type struct changes between versions, and thus direct access to
its fields is not allowed in Stable ABI.
The members can, however, be retrieved using `PyType_GetSlot`.

Commit generated with this semantic patch (see https://coccinelle.lip6.fr ):
```
@@
identifier subtype;
type t;
expression size;
identifier var;
@@

- t var = (t)subtype->tp_alloc(subtype, size);
+ allocfunc subtype_alloc = (allocfunc)PyType_GetSlot(subtype, Py_tp_alloc);
+ t var = (t)subtype_alloc(subtype, size);

@@
identifier subtype;
type t;
expression size;
identifier var;
@@

- var = (t)subtype->tp_alloc(subtype, size);
+ allocfunc subtype_alloc = (allocfunc)PyType_GetSlot(subtype, Py_tp_alloc);
+ var = (t)subtype_alloc(subtype, size);
```

applied using:

    spatch --sp-file patch.cocci --dir python/ --in-place
In the stable ABI Python's type objects aren't defined in structs
that reference other structs, but by an array of "slots" from which
type objects are created.

(This commit was generated semi-automatically. The automation
is too messy and bespoke to share widely, but if you're interested
in undocumented work in progress, look in
https://github.com/encukou/api-limiter/tree/rpm-mess )
The previous commit removed all *_Type variables.
This adds them back, but as *pointers* to Python type objects
rather than type objects themselves.

All usages now need an extra pointer dereference (or removing an
address-of).
(The compiler will complain of type safety if we miss a spot,
so I'm OK using the original names.)

Also, add header declarations for the Spec structs.
The macros are not part of the limited ABI.

The functions check for type and out-of-bounds arguments, but
as far as I know the slowdown is not measurable outside
micro-benchmarks.
But, the checking means that the functions can fail with an
exception, and nominally need error handling. That's left
for another commit.

Another difference is that if the collection already contains
an item at the given position, the SetItem functions discard
a reference to it, while the SET_ITEM macros don't.
I've checked that the changed cases are used to fill brand-new
lists/tuples, so this doesn't apply.

Commit generated with this semantic patch (see https://coccinelle.lip6.fr ):
```
@@
expression C, N, O;
@@

- PyTuple_SET_ITEM(C, N, O)
+ PyTuple_SetItem(C, N, O)

@@
expression C, N, O;
@@

- PyList_SET_ITEM(C, N, O)
+ PyList_SetItem(C, N, O)
```

applied using:

    spatch --sp-file patch.cocci --dir python/ --in-place
Each call to a Python API (or any function returning PyObject*,
really) should have error handling.
We store pointers to our Python type objects in global variables,
which would get clobbered if the initialization code could run
several times. Explicitly disallow that.

This means the extension cannot be unloaded and reloaded, nor used
in multiple Python interpreters. The limitation could be lifted
in the future.
Some of rpm's types can't be instantiated by calling the class.
The flag for this was only added in Python 3.10.
Replace it by a tp_new slot with the same effect: raising an exception.
@Conan-Kudo
Copy link
Member

With CMake 3.26+, stable ABI will be used by default.

This is fine, we can work in subsequent PRs to handle it internally for older CMake versions.

@pmatilai
Copy link
Member

pmatilai commented Sep 29, 2023

I think, if we go for the stable ABI we should then stick to it too, meaning simply cut off support for versions where not available.

Requiring Python >= 3.7 is not a problem, I'm just curious as to what makes that particular version special. Cmake >= 3.26 is too new for us to require though, but this shouldn't be hard to do manually. Looking at the cmake's FindPython module, something like target_compile_definitions(_rpm PRIVATE Py_LIMITED_API=x) (where x is the desired version obviosly) for in python/CMakeLists.txt I suppose. There's something about linking to a specific library too, but according to https://docs.python.org/3/c-api/stable.html that's a Windows-only thing.

@Conan-Kudo
Copy link
Member

Conan-Kudo commented Sep 29, 2023

@pmatilai I would just vendor/backport the FindPython module and conditionally use it if CMake is too old.

@pmatilai
Copy link
Member

I did look at the FindPython module, and to me that looks just the kind of gibberish we left autotools behind for. No thanks. I'm fine importing it as long as it's somebody elses headache but that's the end of that.

@Conan-Kudo
Copy link
Member

Well, I guess then we should hope for my CentOS Stream MR to be merged? 😅

@pmatilai
Copy link
Member

pmatilai commented Sep 29, 2023

Nah. It ain't that hard really. And, we can't go and require the latest-and-greatest of a core dependency just because one distro updates their cmake.

@Conan-Kudo
Copy link
Member

As it stands, we gracefully fall back to the current behavior anyway, so I don't think the CMake stuff is a reason to hold up this PR.

@pmatilai
Copy link
Member

See my earlier comment, I don't WANT it to fall back to current behavior because then you never really know whether a patch is SABI compliant or not because your setup may depend on these factors.

@encukou
Copy link
Contributor Author

encukou commented Oct 2, 2023

Requiring Python >= 3.7 is not a problem, I'm just curious as to what makes that particular version special.

PySlice_GetIndicesEx uses a deprecated implementation (on all Python versions) if ABI compatibility with 3.6 is needed.
That implementation is only unsafe in rather exotic edge cases (indexing by an object that has side-effects on conversion to int, or another thread changing a sequence size), but it does emit a compile-time warning.
If y'all can live with the warning, I think (but haven't checked), that would allow ABI compatibility with Python 3.4+. But, IMO it's better to use version-specific ABI for such old Python versions.

Requiring Python >= 3.7 is not a problem, I'm just curious as to what makes that particular version special. Cmake >= 3.26 is too new for us to require though, but this shouldn't be hard to do manually. Looking at the cmake's FindPython module, something like target_compile_definitions(_rpm PRIVATE Py_LIMITED_API=x) (where x is the desired version obviosly) for in python/CMakeLists.txt I suppose. There's something about linking to a specific library too, but according to https://docs.python.org/3/c-api/stable.html that's a Windows-only thing.

Here's what's different for stable ABI builds:

  • Compiling with Py_LIMITED_API=0x03070000, so only “stable” API is allowed.
  • Naming the .so appropriately. Currently cmake outputs _rpm.so, which doesn't need to change -- any Python will load it. In the future CMake might switch to _rpm.cpython-311-x86_64-linux-gnu.so/_rpm.abi3.so, but I bet it'll use a cmake_policy for that kind of change, so there'll be plenty of time to adjust.
  • On Windows, linking with a different Python library if appropriate

So, yeah, adding target_compile_definitions(_rpm PRIVATE Py_LIMITED_API=Py_LIMITED_API=0x03070000) will do the trick.
But it might be better to use CMake's Development.SABIModule if it's available?

@encukou
Copy link
Contributor Author

encukou commented Oct 3, 2023

Would this work?

  • CMake 3.26+ default:
    • Build for stable ABI using CMake's mechanism
    • Needs Python 3.7+
  • Older CMake, and builds with -DENABLE_PYTHON_SABI=OFF:
    • Needs Python 3.2+
    • On Python 3.7+, build for stable ABI "manually" by defining

(I haven't tested, I'm still not sure what should be done)

@pmatilai
Copy link
Member

pmatilai commented Oct 3, 2023

I think that's still unnecessarily complicated 😅

In a case like this using cmake's integration doesn't help because it's only more code and fu to support. I don't really want to see a single option related to this because those are just extra baggage that can break and need testing. Just use the manual cmake thing always, unconditionally, and update the Python requirement in INSTALL to 3.7. We'll switch to the cmake integrated Development.SABIModule version when we some day bump our cmake version to one that supports it.

Apologies for not clearly stating this earlier.

@pmatilai
Copy link
Member

pmatilai commented Oct 3, 2023

The cmake nits aside, I briefly skimmed through this and seems like a lot of fairly mechanical transformations (even saw a mention of coccinelle in there), and nothing in the "OMG we can't do that" department.

So just trim out the extra complications from the cmake department and I'll be happy to merge.

@pmatilai
Copy link
Member

pmatilai commented Oct 4, 2023

I'd prefer the "Fix comment style" commit to be merged into the previous commit that introduces the comment, but other than that looks fine to me. Unless you have something else you're planning to work on here, just flag ready for review and I'll merge, we already did the review part here.

The flag was added in Python 3.10.
If the current Python doesn't have it, rpm's type objects (not necessarily
instances of the types) will be mutable.
This allows the user to shoot themselves in the foot, but it's not
a big deal.
@encukou
Copy link
Contributor Author

encukou commented Oct 5, 2023

I did that, and I also updated the version in the INSTALL file.

I didn't test this build config with all the Python versions, and I won't get to that next week, but will do it the week of Oct 16. I find any issues, I can open a fixup PR.

@encukou encukou marked this pull request as ready for review October 5, 2023 09:12
@pmatilai
Copy link
Member

pmatilai commented Oct 5, 2023

Nice, thanks!

And yup, if it breaks, we'll find out sooner or later 😄 It's passing with a modern version, the older versions are diminishling important.

@pmatilai pmatilai merged commit 3a12a89 into rpm-software-management:master Oct 5, 2023
1 check passed
@encukou encukou deleted the python-abi3 branch October 5, 2023 12:00
@dmnks dmnks changed the title WIP: Use Python Stable ABI for the bindings Use Python Stable ABI for the bindings Nov 28, 2023
@dmnks dmnks added the API API related label Nov 28, 2023
@dmnks dmnks added highlight Release highlight RFE python Python bindings labels Mar 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API API related highlight Release highlight python Python bindings RFE
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RFE: Use Python Stable ABI for the bindings
4 participants