Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Plan for Python bindings #1

Open
fcooper8472 opened this issue May 30, 2024 · 9 comments
Open

Plan for Python bindings #1

fcooper8472 opened this issue May 30, 2024 · 9 comments

Comments

@fcooper8472
Copy link
Collaborator

I propose adding minimal infrastructure to allow installation and use of this tool via Python.

Assuming that it ends up being called "rhe" on PyPi (which has some restrictions on naming), I would envisage an end user being able to:

pip install rhe

and then, from a Python script:

from rhe import genie
from rhe import genie_mem
from rhe import genie_multi_pheno


genie(<args>)

This functionality would, of course, be in addition to the existing executables that a user can still compile and run exactly as described currently in the README.

I would propose using scikit-build-core and pybind11 to achieve this.

In addition, I would propose adding a GitHub Actions workflow to automate the building of Python wheels for all current Python versions, and, ideally, on Linux, macOS and Windows (although there may be platform-specific code that I'm currently unaware of that may prevent this).

If this sounds broadly acceptable as a path forward, let me know, and I will begin working on it.

@sriramsr
Copy link
Collaborator

sriramsr commented May 30, 2024 via email

@fcooper8472
Copy link
Collaborator Author

@sriramsr a couple of points that have come up so far:

  1. Should the SSE support be used where possible? It seems to work with -DSSE_SUPPORT=1, but by default it is off, and there's no info in the README about turning it on. Would it be recommended to turn it on?
  2. Currently the code can't (easily) be compiled on Windows because of some posix specific headers.
  3. Currently the code can't (easily) be compiled on arm64 architecture because of the emmintrin header, so currently it won't compile (and therefore we can't build wheels for) arm64 macOS machines.

@fcooper8472
Copy link
Collaborator Author

Also, in std.h there is a function:

unsigned long long rdtsc(){
    unsigned int lo,hi;
    __asm__ __volatile__ ("rdtsc": "=a" (lo), "=d" (hi));
    return ((unsigned long long)hi << 32) | lo;
}

which doesn't appear to be used anywhere. Do you know whether it's OK to get rid of this? Because it uses inline amd64 assembly, it would be ideal to get rid of it when working towards arm64 support.

@fcooper8472
Copy link
Collaborator Author

In fact: the current examples do not run when SSE is enabled. See here for an example that fails:

https://github.com/sriramlab/GENIE/actions/runs/9387244040/job/25849680141

GENIE: /home/runner/work/GENIE/GENIE/include/mailman.h:85: void mailman::fastmultiply_sse(int, int, int, std::vector<int> &, Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor> &, double *, double *, double **): Assertion `k%10 == 0 && "k should be a multiple of 10"' failed.

@fcooper8472
Copy link
Collaborator Author

  1. Currently the code can't (easily) be compiled on arm64 architecture because of the emmintrin header, so currently it won't compile (and therefore we can't build wheels for) arm64 macOS machines.

Removing the inline assembly and conditionally only including the emmintrin if the architecture is appropriate allows GENIE to compile and run on arm64 macOS.

@sriramsr
Copy link
Collaborator

sriramsr commented Jun 5, 2024 via email

@fcooper8472
Copy link
Collaborator Author

@sriramsr this work is mostly done now: I have built Python wheels for Linux and both arm64 and amd64 macOS. Just a few questions:

  1. The project did not appear to depend on GSL. Is this expected?
  2. Boost only appeared to be used to generate random numbers from N(0,1). I replaced this with using the C++ standard library random number generator from C++11, which works in an almost identical way to boost. Is this OK, or was there a specific reason to use Boost? One possible reason is that using Boost will guarantee the same sequence of random numbers for a given seed across all distributions, whereas in the standard library, different vendors are permitted to choose their own algorithm. My strong preference would be to not use boost unless this is a requirement.
  3. When the time comes, are you happy for me to upload to PyPI using my personal PyPI credentials? This would list me as the "maintainer" on PyPI (e.g. as here: https://pypi.org/project/arg-needle-lib/), but the homepage links would point to your repository, of course. The alternative would be for you to create a PyPI token and add it to the GENIE repository as a GitHub Actions Secret.

@sriramsr
Copy link
Collaborator

sriramsr commented Jun 13, 2024 via email

@fcooper8472
Copy link
Collaborator Author

A first draft of this is now complete. See here for a summary of changes:
#2 (comment)

Once you're happy with the changes, I'll upload the wheels to PyPI.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants