
Evaluate replacing conda-build with rattler-build #47

Open
vyasr opened this issue Apr 19, 2024 · 7 comments

@vyasr
Contributor

vyasr commented Apr 19, 2024

RAPIDS currently builds conda packages in CI using conda-build. The rattler-build tool is a newer alternative. It is written in Rust and should be faster than conda-build (I haven't seen any official benchmarks yet, though). It only supports a limited subset of the meta.yaml recipe format, but that subset is designed to still enable all the same features, just with a more limited syntax (see CEPs 13 and 14). conda-build overhead is nontrivial (I've never benchmarked it, but I know it can stretch into multiple minutes beyond the environment solve when doing local CI reproductions), and reducing that would be quite valuable for improving our CI turnaround.

Moreover, switching to the more restricted syntax described in the above CEPs would be beneficial because it would convert our conda recipes into pure YAML rather than the extended YAML currently used by meta.yaml. That change is important because the YAML extensions currently in our recipes make them impossible to parse or write with standard YAML parsers, which is a big reason why we have struggled to do things like support meta.yaml files in rapids-dependency-file-generator.

We should do a PoC of replacing conda-build with rattler-build in one repo (preferably something reasonably complex like cudf or cugraph) to see what it would take to make this transition, and how much we would benefit.
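
To illustrate the "extended YAML" problem, here is a minimal sketch that flags the two conda-build constructs that plain YAML cannot carry: Jinja statements or expressions, and `# [selector]` comments. The recipe fragment, the regular expressions, and the `find_yaml_extensions` helper are all hypothetical and written only for this example; they are not taken from any RAPIDS repo or from conda-build itself. Jinja statements are not valid YAML, and selectors look like ordinary comments, so a standard parser would be expected to either reject the file or silently discard them.

```python
import re

# Illustrative conda-build style recipe fragment (an assumption for this sketch).
META_YAML = '''\
{% set version = "1.0.0" %}
package:
  name: example
  version: {{ version }}
requirements:
  host:
    - python
    - cuda-version  # [linux]
'''

# Jinja statements ({% ... %}) and expressions ({{ ... }}).
JINJA = re.compile(r"\{%.*?%\}|\{\{.*?\}\}")
# conda-build selectors: a trailing comment of the form "# [condition]".
SELECTOR = re.compile(r"#\s*\[[^\]]+\]\s*$")


def find_yaml_extensions(text):
    """Return (line_number, kind, line) for each non-plain-YAML construct found."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if JINJA.search(line):
            hits.append((number, "jinja", line.strip()))
        if SELECTOR.search(line):
            hits.append((number, "selector", line.strip()))
    return hits


for hit in find_yaml_extensions(META_YAML):
    print(hit)
```

Any line reported here is one that a tool built on a standard YAML parser could not read and write back faithfully, which is the core of the difficulty described above.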

@vyasr
Contributor Author

vyasr commented Jul 22, 2024

I now have two POCs of using rattler-build

Some notes:

I've littered the PRs liberally with comments, so those should provide additional information.

@msarahan

Nice! Looks like you got some pretty good speedup there - maybe around 5 minutes of speedup, from what I can see. Considering the whole build time was 8-10 minutes, saving 5 minutes is huge.

@vyasr
Contributor Author

vyasr commented Jul 23, 2024

Yup, similar speedups in absolute numbers for cugraph (a smaller percentage, but still not bad to go from 28 mins to 22 mins). There's also probably an extra 30 s in there for installing rattler-build itself in each job, since it's not yet preinstalled in the image (but it will be).
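
For a rough sense of scale, the relative saving implied by the cugraph numbers above can be worked out as follows. The `percent_reduction` helper is just an illustrative name, and the inputs are the approximate 28 and 22 minute figures quoted in the comment, not fresh measurements.

```python
def percent_reduction(before_min: float, after_min: float) -> float:
    """Return the percentage of build time saved going from before to after."""
    return 100.0 * (before_min - after_min) / before_min


# Approximate cugraph CI timings quoted in the thread (minutes).
cugraph_saving = percent_reduction(28, 22)
print(f"cugraph: {cugraph_saving:.1f}% less build time")
```

That works out to a bit over a fifth of the job time, before accounting for the roughly 30 s currently spent installing rattler-build in each job.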

@bdice
Contributor

bdice commented Jul 23, 2024

I looked at the two POCs for ucxx and cugraph and I am supportive of this effort. It seems like there are a lot of steps to take first (like fixing strict priority conda builds, maybe altering librmm run exports, and maybe some rattler-build issues -- basically all the things that are marked in the comments) but overall I think this is mature enough that we can invest some time in this direction.

@vyasr
Contributor Author

vyasr commented Jul 24, 2024

FYI @wolfv @ruben-arts here's the follow-up to our discussions from SciPy 🙂

@vyasr
Contributor Author

vyasr commented Aug 16, 2024

We're actively discussing #84, which is IMO a blocker to making this change. We can work around it, but that requires a lot of extra boilerplate in recipes that I don't think is worth proliferating.

@msarahan

Adding repo tracker for progress tracking:
