Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Start adding a more useful python module around libpacemaker #3813

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

clumens
Copy link
Contributor

@clumens clumens commented Jan 28, 2025

This is very rough at the moment - I've only added a single function. But, I think it's enough to show what kinds of changes would be required and gives us something to talk about. I think this is still a ways off from getting pushed.

These are needed to build C-based extension modules for python.
All the libpacemaker functions return an xmlNode **, which we need to
convert into some sort of native python type if we are to provide a
python module for people to use.

There's various ways we could go about doing this, but what we probably
want to be able to do is convert the C-based xmlNode ** type into
something that libxml2's python module can work with.  Then, we can
write more of the python module in python itself instead of having to
work with libxml2 in C.

The way to do this is to write a small C-based wrapper function for each
libpacemaker function we want to expose in the python module, using the
PyCapsule type.  These wrapper functions are likely to be pretty
formulaic - we could probably autogenerate them from a script if we
wanted.

This introduces a single function for demonstration purposes, plus the
rest of the boilerplate required to construct a python module in C.
This is a public, native python API that wraps libpacemaker.  It aims to
provide a python-based way of interacting with pacemaker, which means we
need to be careful to use exceptions where appropriate, return python
types, and in general write things in a pythonic style.
The presence of the new arch-specific compiled python module requires
various build changes:

* The python3-pacemaker package goes from noarch to arch-specific.

* python3-libxml2 is now required as part of the build process to run
  tests.

* PYTHONPATH needs to be updated in various places to run tests.

return PyCapsule_New(xml, "xmlNodePtr", NULL);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned in the commit message, I think these functions could almost certainly be auto-generated by a quick script based on a single line description of their name, argument types, and return value.

rc = pcmk_list_standards(&xml);
if (rc != pcmk_rc_ok) {
PyErr_SetString(PacemakerError, pcmk_rc_str(rc));
return NULL;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am unsure what to do here - on the one hand, pcmk_* functions return a standard Pacemaker return code, so that seems like what we should raise. However, we don't expose those return codes in the python module at the moment (which... maybe we should do that?). On the other hand, the returned XML already contains the exit code. However, that's the kind of value a process should return, not a function.


doc = libxml2.xmlDoc(xml)

return [item.getContent() for item in doc.xpathEval("/pacemaker-result/standards/item")]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intention here (and in any further functions) would be to convert the returned XML into a python native type, like a list of strings here. That's what I'm going through so much trouble everywhere to deal with the XML types.

@clumens clumens added the status: in progress PRs that aren't ready yet label Jan 30, 2025
@clumens clumens changed the title RFC: Start adding a more useful python module around libpacemaker Start adding a more useful python module around libpacemaker Jan 30, 2025
* functions and returns python objects. This is necessary because most
* libpacemaker functions return an xmlNode **, which needs to be coerced
* through the PyCapsule type into something that libxml2's python
* bindings can work with.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the advantage (or necessity) of using PyCapsule, etc., rather than ctypes or cffi for our use case?

I have not created Python bindings myself. Before I do any more research, I'm wondering what you've found on this topic and what led you to PyCapsule.

Note from the Python documentation (first URL):

The C extension interface is specific to CPython, and extension modules do not work on other Python implementations. In many cases, it is possible to avoid writing C extensions and preserve portability to other implementations. For example, if your use case is calling C library functions or system calls, you should consider using the ctypes module or the cffi library rather than writing custom C code. These modules let you write Python code to interface with C code and are more portable between implementations of Python than writing and compiling a C extension module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far, using PyCapsule is the only way I've been able to make anything that doesn't just segfault.

Here's what I want:

  • We call a libpacemaker function - say, pcmk_list_standards. That gives us a return code and an xmlNodePtr *.
  • I don't want our python bindings to give the user XML anything. I want the bindings to return python types. So, I need to convert that XML into python.
  • The easiest way of doing that is by using libxml2, which conveniently also has python bindings. That means we can write the type conversion crud in python instead of C, which sounds a lot better considering we want the output to be python.

Basically, I want to take something that was allocated and generated by libxml2 in C (the xmlNodePtr *) and pass it through our own python code and libxml2's python bindings, back into libxml2 in C during the process of converting the types. There's something about this process that just segfaults unless I use the PyCapsule type. Almost everything else in this PR is just boilerplate necessary to be able to use that type.

My first approach was to use ctypes just like we are doing for ExitStatus, but I couldn't get any result besides segfaults there. I think what we'd need to do is define a structure class that mirrors an xmlNode, which seems like way too much work. That's why I turned to this approach. I haven't tried the ffi module yet.

Honestly, I would love to nuke the entire C module portion of this and do everything in python with ctypes/ffi. I just can't figure it out yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For instance:

clumens@spire ~/src/pacemaker main|✚1❯ git diff
diff --git a/python/pacemaker/_library.py b/python/pacemaker/_library.py
index 1cd1bd500d..fe8d83fa25 100644
--- a/python/pacemaker/_library.py
+++ b/python/pacemaker/_library.py
@@ -1,6 +1,6 @@
 """A module providing private library management code."""
 
-__all__ = ["_libcrmcommon"]
+__all__ = ["_libcrmcommon", "_libpacemker"]
 __copyright__ = "Copyright 2024-2025 the Pacemaker project contributors"
 __license__ = "GNU Lesser General Public License version 2.1 or later (LGPLv2.1+)"
 
@@ -36,3 +36,7 @@ def load_library(basename):
 
 _libcrmcommon = load_library("crmcommon")
 _libcrmcommon.crm_exit_str.restype = ctypes.c_char_p
+
+_libpacemaker = load_library("pacemaker")
+_libpacemaker.pcmk_list_standards.restype = ctypes.c_int
+_libpacemaker.pcmk_list_standards.argtypes = [ ctypes.POINTER(ctypes.c_void_p) ]
clumens@spire ~/src/pacemaker main|✚1❯ PYTHONPATH=python LD_LIBRARY_PATH=lib/common/.libs python
Python 3.12.7 (main, Oct  1 2024, 00:00:00) [GCC 14.2.1 20240912 (Red Hat 14.2.1-3)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pacemaker
>>> import ctypes
>>> import libxml2
>>> xxx = ctypes.pointer(ctypes.c_void_p())
>>> pacemaker._library._libpacemaker.pcmk_list_standards(xxx)
0
>>> libxml2.xmlDoc(xxx)
<xmlDoc (None) object at 0x7febd4df11c0>
>>> libxml2.xmlDoc(xxx).xpathEval("/pacemaker-result")
zsh: segmentation fault  PYTHONPATH=python LD_LIBRARY_PATH=lib/common/.libs python

It feels really close - in particular, I don't think the void pointer type is correct here. But I'm not sure how to hand it an xmlNodePtr. ctypes really only knows about basic types. Beyond that, you're supposed to define classes that derive from Structure. But, an xmlNode comes from libxml2 so I don't want to define a class for that and I don't see anywhere that they define one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: in progress PRs that aren't ready yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants