Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Infer Runner Deps from Recipe requirements.txt #154

Closed
ranchodeluxe opened this issue Nov 27, 2023 · 8 comments
Closed

Infer Runner Deps from Recipe requirements.txt #154

ranchodeluxe opened this issue Nov 27, 2023 · 8 comments
Assignees
Labels
question Further information is requested

Comments

@ranchodeluxe
Copy link
Collaborator

ranchodeluxe commented Nov 27, 2023

Problem

I apologize for the range of this question b/c I got here through some Flink runner frustrations. For the runner we currently have a bunch of unlisted runtime dependencies that need to be installed to execute pangeo-forge-runner bake command:

# the storage look up and dependency injections require some form of these deps 
fsspec  # or possibly s3fs, gsfs depending on what's used in config
pangeo-forge-recipes
apache-beam  # which has to match the exact python and beam versions for the producer and worker (Flink specific)

Possible Solution

AFAIK for pangeo-forge-recipes>=0.10.0 we require a requirements.txt to be in the feedstock. So I want to build on
(#130) and ask if there any reason we can't do something automated (but slightly gross) like this really basic example diff?

Thoughts?

@ranchodeluxe
Copy link
Collaborator Author

ranchodeluxe commented Nov 27, 2023

If the above solution is too gross (which I understand) another option is to write a bunch of validation (most of it just Flink specific) that checks things like:

  • more feedback about what needs to be installed in Bake.start (like we already do for pangeo-forge-recipes but for everything not explicitly in pyproject.toml deps)
  • Flink specific: make sure the version of apache-beam installed on the host matches their FlinkOperatorBakery.container image passed
  • Flink specific: allow them to not pass the container version and infer what it is based on simple heuristics
  • Always install fsspec? Maybe it should just exist in our pyproject.toml and be default installed?

@ranchodeluxe ranchodeluxe changed the title Infer Runner Deps from Recipe Requirements.txt Infer Runner Deps from Recipe requirements.txt Nov 27, 2023
@ranchodeluxe ranchodeluxe added the question Further information is requested label Nov 27, 2023
@sbquinlan
Copy link
Member

I had this issue recently. Here's how you can easily get into a messed up state:

conda create env -n pangeo_test -c conda-forge pangeo-forge-recipes
pip install "pangeo-forge-runner[extras]"

pangeo-forge-runner has an optional dependency of apache-beam[gcp] which last exists in v2.42.0. pip will install that version over the latest version that pangeo-forge-recipes installs from conda-forge (v2.51.0 at time of writing). When you run a FlinkDeployment using the runner, it'll correctly build the FlinkDeployment with whatever container_image you specify, but the pipeline.run() call will fail because apache-beam will default to the old version v2.42.0 which isn't compatible with the latest Flink version.

I think it's important here to either pass down a specific jar in the pipeline options or validate that the apache-beam version installed is similar to what is being configured in the FlinkDeployment.

@cisaacstern
Copy link
Member

Sorry for the delayed response here.

I certainly agree that client/runner version mismatch is a huge headache and we should do something about that.

To understand, is this issue in a way a version of #27, in which @yuvipanda proposed having a separate fetcher/environment provisioner?

@ranchodeluxe
Copy link
Collaborator Author

ranchodeluxe commented Dec 5, 2023

To understand, is this issue in a way a version of #27, in which @yuvipanda proposed having a separate fetcher/environment provisioner?

No, this example diff isn't intended to be a solution in that direction. But I guess the question is that ticket (which seems to be over a year old and referring to the old "orchestrator" idea) still the vision? First time seeing it

@cisaacstern
Copy link
Member

cisaacstern commented Dec 5, 2023

But I guess the question is that ticket (which seems to be over a year old and referring to the old "orchestrator" idea) still the vision?

This is a good point for discussion. Regarding references to the "orchestrator", I think we can view #27 as if that does not refer to pangeo-forge/pangeo-forge-orchestrator, and rather means just "little o" orchestrator, as in the person or whom/whatever is calling pangeo-forge-runner to deploy jobs.

One idea mentioned in #27 which definitely feels worthwhile to me, is that if we are going to approach automatic environment modification in any form, that should definitely be happening in temporary venvs of some sort, as is (IIUC) done in pre-commit, tox, nox, etc.

Over in the GitHub Action, I do a version of what your example diff is proposing:

https://github.com/pangeo-forge/deploy-recipe-action/blob/868f7dd2bd02ec506c7c822f958813373ebc751b/action/deploy_recipe.py#L115-L118

Which feels appropriate to me because the Action is only ever called within the context of an ephemeral GitHub Action container, so we are not going to be modifying (human) user environments on-the-fly with that code.

In pangeo-forge-runner itself, though, modifying the same environment that the CLI is invoked in feels innappropriate, and could conceivably lead to even more confusion, insofar as users environments are being changed underneath them.

I would be curious to understand more about how pre-commit, tox, nox etc. manage ephemeral venvs, and if there is something we can borrow from any of those projects on this topic.

@ranchodeluxe
Copy link
Collaborator Author

In pangeo-forge-runner itself, though, modifying the same environment that the CLI is invoked in feels innappropriate, and could conceivably lead to even more confusion, insofar as users environments are being changed underneath them.

I would be curious to understand more about how pre-commit, tox, nox etc. manage ephemeral venvs, and if there is something we can borrow from any of those projects on this topic.

Agreed, let me start looking into options down this pathway

@yuvipanda
Copy link
Collaborator

YESSSS I THINK STEALING WHAT pre-commit DOES IS EXACTLY THE RIGHT WAY TO GO HERE

@yuvipanda
Copy link
Collaborator

fucking amazing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

4 participants