Use conda-lock to speed up builds #12

Open
4 tasks
aryarm opened this issue Dec 12, 2022 · 2 comments

Comments

@aryarm
Collaborator

aryarm commented Dec 12, 2022

Our conda environments are re-solved upon each build but, in theory, a re-solve is only necessary when a yml environment file changes.

  • We should use conda-lock to create lock files for each yml file, and we should make sure to cache them
    • This might require a separate portion of our Dockerfile
  • Whenever a yml file changes, we should detect it and go update the corresponding lock file in the cache
  • Then, within our Dockerfile, we should copy the lock files from the cache into each build and then use them to install each environment
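
The "detect when a yml file changes" step could be sketched roughly as below. This is only a minimal illustration, not part of our current setup: it assumes we keep a small digest file next to each cached lock file, and the names `stale_envs`, `mark_locked`, and the `.sha256` suffix are all hypothetical. Actually regenerating the lock file (e.g. by calling conda-lock) is left out on purpose.

```python
import hashlib
from pathlib import Path
from typing import List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def stale_envs(env_dir: Path, cache_dir: Path) -> List[Path]:
    """List the yml files whose recorded digest is missing or out of date.

    Assumption: for each `<name>.yml` in env_dir, the cache holds a
    `<name>.sha256` file written when the lock file was last generated.
    """
    stale = []
    for yml in sorted(env_dir.glob("*.yml")):
        stamp = cache_dir / (yml.stem + ".sha256")
        if not stamp.exists() or stamp.read_text().strip() != file_digest(yml):
            stale.append(yml)
    return stale


def mark_locked(yml: Path, cache_dir: Path) -> None:
    """Record the current digest of a yml file after its lock file was rebuilt."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    (cache_dir / (yml.stem + ".sha256")).write_text(file_digest(yml))
```

The idea would be to re-run the lock step only for the files returned by `stale_envs`, then call `mark_locked` for each one, so an unchanged yml file never triggers a re-solve.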

See also this description of how to use conda-lock with Docker.

@aryarm
Collaborator Author

aryarm commented Jan 2, 2023

Winter Break Update

The second example in the link above does almost exactly what we want. It uses multi-stage builds to create the conda environment in a slimmed-down Linux container, so that the built environment can be copied into a separate container later on.

Unfortunately, it isn't clear whether each stage in a multi-stage build will be cached separately. Apparently, there is a way to do this, but I tried it and it didn't work. I got an error in the build stage:

buildx failed with: ERROR: failed to solve: specifying multiple cache exports is not supported currently

Luckily, I found this comment that seems to explain how to get it to work in GitHub Actions. The key is to create a different cache folder for each stage. I don't have time to try this right now, but it should be the next thing to look into.

According to the conda-lock documentation, multi-stage builds also make it easier to keep our conda setups lean: we can do the slimming at the end of the build stage, so the extra layers never get added to the runtime image. There's a great article about how to do that here.

@aryarm
Collaborator Author

aryarm commented Mar 1, 2023

Another thing I just realized is that deleting the cache will also "unlock" our locked envs, since our current design stores the lock files only in the cache and not in our GitHub repo.

To ensure that this doesn't happen, we could consider exporting the lock file out of the build container using the output: parameter, like this:

          output: type=local,dest=locked

This way, the files in the root directory of our container should be copied into a locked/ directory within our GitHub repo after the build.

resources
