Drop explicit dependency on pangeo-forge-recipes #130

Merged 5 commits on Nov 16, 2023

cisaacstern
Member

Working on Pangeo Forge with @jbusecke this week.

We're hitting up against the fact that in various contexts the explicit dependency on pangeo-forge-recipes here is at best inefficient and at worst problematic.

In production workflows, we dynamically install a specific version of pangeo-forge-recipes on the client anyway, so the version installed here is extraneous (and might pull in unnecessary dependencies of its own, etc.).

@ranchodeluxe @yuvipanda any objections?

@yuvipanda
Collaborator

I think the test failures are all real and need to be dealt with.

Dropping this dependency was a goal for me too! The primary question is what happens <0.9.2 or whatever, where we have code that actually imports things directly from pangeo_forge_recipes. How do you want to handle that?

@ranchodeluxe
Collaborator

No objections. Outside of tests it looks like this is the most important section of use so maybe we have those classes in both repos?

codecov bot commented Nov 15, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (b8571e1) 96.06% compared to head (4301048) 96.09%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #130      +/-   ##
==========================================
+ Coverage   96.06%   96.09%   +0.02%     
==========================================
  Files          14       14              
  Lines         458      461       +3     
==========================================
+ Hits          440      443       +3     
  Misses         18       18              


@cisaacstern cisaacstern added test-dataflow Add this label to PRs to trigger Dataflow integration test. test-flink Add this label to PRs to trigger Dataflow integration test. labels Nov 15, 2023
@cisaacstern
Member Author

> I think the test failures are all real and need to be dealt with.

Yes! That was because we needed to give one of the bakery-specific optional installs in order to get beam in the test environment. I chose .[flink] in the last commit, because that's lighter-weight (doesn't bring in all the GCP stuff).

> The primary question is what happens <0.9.2 or whatever, where we have code that actually imports things directly from pangeo_forge_recipes. How do you want to handle that?

My assumption is that pangeo-forge-runner does require -recipes to be installed to work. This should be true regardless of whether we are in the 0.9.x series or the 0.10.x series, as in both cases, calling bake imports objects from recipe modules that rely on -recipes. So in either case, to call bake, the user just has to install their desired version of -recipes first. But to call non-bake commands (e.g. expand-meta), -recipes should not be required.

> Outside of tests it looks like this is the most important section of use so maybe we have those classes in both repos?

Yes! This is already the case!

@cisaacstern
Member Author

cisaacstern commented Nov 15, 2023

AFAICT, the only tests failing now are the Flink tests which use recipes version 0.9.4, but a bit of digging reveals that these tests actually never passed, with the reason for the prior (silent) failures being the exact issue that this PR solves 😅. See:

In each of these test runs from #114, the job is supposedly parametrized with recipes 0.9.4, but the above logs show that recipes 0.10.3 was actually installed, allowing the test to pass. This accidental upgrade happened because recipes was specified as a dependency of runner with a >= bound, so the latest version of recipes at the time of the test run was pulled in when runner was installed in the job.

The current PR fixes that issue by dropping recipes as a dependency, and in so doing reveals that Flink does not work for recipes 0.9.4.
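
For illustration, a guard along these lines could make this kind of silent version mismatch visible in a test job. This is only a sketch and not code from this PR: `check_version` and `assert_installed_version` are hypothetical names, and it assumes the job knows which recipes version it was parametrized with.

```python
from importlib.metadata import PackageNotFoundError, version


def check_version(dist_name: str, installed: str, expected: str) -> None:
    """Raise if the installed version string differs from the expected pin."""
    if installed != expected:
        raise RuntimeError(
            f"{dist_name}: test is parametrized with {expected}, "
            f"but {installed} is installed"
        )


def assert_installed_version(dist_name: str, expected: str) -> None:
    """Look up the installed version of `dist_name` and compare it to `expected`."""
    try:
        # importlib.metadata.version reads the installed distribution's metadata.
        installed = version(dist_name)
    except PackageNotFoundError as exc:
        raise RuntimeError(
            f"{dist_name} is not installed (expected {expected})"
        ) from exc
    check_version(dist_name, installed, expected)
```

Calling `assert_installed_version("pangeo-forge-recipes", "0.9.4")` at the start of a job would raise in the scenario described above, where 0.10.3 ended up installed instead.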

I am totally comfortable with incomplete support for 0.9.4 and to prevent this blocking the current PR, I'll bail out of those test cases with pytest.xfail.

@cisaacstern
Member Author

@yuvipanda @ranchodeluxe thank you both for the reviews! I've now xfailed the Flink + recipes 0.9.4 tests, as those time out (and, as noted above, apparently never passed in the first place).

Also, I've added a new check in the bake command to ensure pangeo-forge-recipes is installed before the body of that command is invoked, as AFAICT that is the only command which requires pangeo-forge-recipes to be installed. I've also added a test for that feature.
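
As a rough sketch of what such a pre-flight check can look like (the actual check added in this PR may be implemented differently; `require_module` and its error message are hypothetical):

```python
import importlib.util


def require_module(module_name: str, command: str) -> None:
    """Raise a descriptive error if `module_name` cannot be found."""
    # For a top-level module, find_spec returns None when it is not installed,
    # and it does not import the module itself.
    if importlib.util.find_spec(module_name) is None:
        raise ModuleNotFoundError(
            f"The '{command}' command requires '{module_name}' to be installed, "
            "but it was not found in this environment."
        )
```

A command like bake could call `require_module("pangeo_forge_recipes", "bake")` before doing any other work, while commands that do not need -recipes skip the check.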

@jbusecke and I are on a major in-person hack this week, and having this PR in main will be very helpful for us to move forward, so I am going to merge now, on the understanding that I have addressed your comments above. If I have not, or there are other things you notice, please tag me in a new issue and we can discuss there!

TYSM!

@cisaacstern cisaacstern merged commit 3aa4284 into main Nov 16, 2023
38 checks passed
@ranchodeluxe
Collaborator

ranchodeluxe commented Nov 16, 2023

> I am totally comfortable with incomplete support for 0.9.4 and to prevent this blocking the current PR, I'll bail out of those test cases with pytest.xfail.

thanks for xfailing, I will look at this at some point b/c they should work on 0.9.4 👍

@ranchodeluxe
Collaborator

@cisaacstern: somehow I didn't realize this was already merged 😆

anyhow, I see plenty of examples of the 0.9.4 tests passing over the last couple weeks:

https://github.com/pangeo-forge/pangeo-forge-runner/actions/runs/6669919443/job/18128713768
https://github.com/pangeo-forge/pangeo-forge-runner/actions/runs/6845625406/job/18611111514

I'll just open another PR and remove pytest.xfail and see where we're at

weiji14 added a commit to regro-cf-autotick-bot/pangeo-forge-runner-feedstock that referenced this pull request Nov 21, 2023
weiji14 added a commit to conda-forge/pangeo-forge-runner-feedstock that referenced this pull request Jan 23, 2024
* updated v0.9.2

* MNT: Re-rendered with conda-build 3.27.0, conda-smithy 3.29.0, and conda-forge-pinning 2023.11.21.15.03.38

* Drop runtime dependency on pangeo-forge-recipes

Xref pangeo-forge/pangeo-forge-runner#130

* Add fsspec to test.requires

To fix `ModuleNotFoundError: No module named 'fsspec'` when running `pangeo-forge-runner --help`.

* Remove apache-beam from runtime dependencies

Following pangeo-forge/pangeo-forge-runner#90. Also sort dependency list alphabetically.

* Add apache-beam to test.requires

Fixes `ModuleNotFoundError: No module named 'apache_beam'` when running `pangeo-forge-runner --help`.

* MNT: Re-rendered with conda-build 3.27.0, conda-smithy 3.30.4, and conda-forge-pinning 2024.01.22.14.29.27

---------

Co-authored-by: Wei Ji <[email protected]>