Describe the bug

I've been working in the background to have pytest collect code coverage on Windows too, and since the time impact is a bit steep, I worked on having the tests run with multiple workers (pytest-xdist), like on our other platforms.

I found some unexpected behaviour once I got this working (except for the failures), which seems to have gone unnoticed. A test fixture that creates a temporary project and passes env variables as a dict doesn't seem to work correctly.

The pattern I observed is that the first call/test to a compiled module, per xdist worker (each has a separate process and temporary directory), will fail with `ERROR: No active GRASS session: GISRC environment variable not set`, but subsequent tests on the same worker will be fine.
Originally, I found that on every run, `general/g.version/tests/g_version_test.py::test_c_flag` would fail with the error above, while `general/g.version/tests/g_version_test.py::test_e_flag` and the others wouldn't. To be clear, there's no reason why printing out static copyright text should ever fail because of missing env vars or config files, and there's no reason why only the c flag of g.version should fail while the others pass at the same time. So it's a good example of the issue here: it highlights an isolation problem, where one test impacts subsequent ones. In these first runs, I never saw a problem with the first worker (gw0), as the first tests to run (alphabetically) are expected to fail on non-Linux.
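For context, the failing test does little more than read the output of g.version. Here is a minimal sketch of what such a test looks like; the `session` fixture name and the assertion are assumptions for illustration, not copied from the actual test file:

```python
import grass.script as gs


def test_c_flag(session):
    """g.version -c only prints static copyright text, so it shouldn't
    depend on anything beyond a working session (sketch, assumed fixture)."""
    text = gs.read_command("g.version", flags="c", env=session.env)
    assert text  # hypothetical assertion; the real test may check more
```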
I first tried initializing the fixture for each test instead of once per module. Then I looked inside `gs.create_project(tmp_path, project)` and `with gs.setup.init(tmp_path / project, env=os.environ.copy()) as session:` to see if an env copy was missing or if something was permanently changing an env var on the first run. I also tried using monkeypatch to delete and then restore the GISBASE env var to see if I'd get a different message, as that variable is also used early on. It didn't seem to change anything.
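For reference, the fixture pattern in question looks roughly like this. It's a minimal sketch built around the two calls quoted above; the fixture name, scope, and project name are assumptions:

```python
import os

import pytest
import grass.script as gs


@pytest.fixture(scope="module")
def session(tmp_path_factory):
    """Create a temporary project and start a session for it (sketch).

    The env passed to init() is a copy of os.environ, so in theory nothing
    should leak between tests or between workers.
    """
    tmp_path = tmp_path_factory.mktemp("project_base")  # fixture-specific temporary directory
    project = "test_project"  # hypothetical project name
    gs.create_project(tmp_path, project)
    with gs.setup.init(tmp_path / project, env=os.environ.copy()) as session:
        yield session
```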
Following that intuition, I added a plugin (pytest-randomly) to shuffle the order of the tests. If it's the first call per worker that "fixes" the setup for subsequent tests, even when that first call fails, then I should see failures at the beginning of each worker's run and passes after that. That is exactly what happens. I also get different tests failing between reruns of the same workflow.
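One way to check this kind of hypothesis locally is to log, per xdist worker, the order in which tests start. This is only a diagnostic sketch (not something from the GRASS test suite), relying on the standard pytest_runtest_setup hook and the PYTEST_XDIST_WORKER variable that pytest-xdist sets in each worker process:

```python
# conftest.py (diagnostic sketch, not part of the test suite)
import os
from collections import defaultdict

_started = defaultdict(int)  # number of tests started in this worker process


def pytest_runtest_setup(item):
    """Log which test runs first (second, ...) on each xdist worker."""
    worker = os.environ.get("PYTEST_XDIST_WORKER", "main")  # e.g. "gw0"
    _started[worker] += 1
    print(f"[{worker}] test #{_started[worker]}: {item.nodeid}")
```

With this in place, the expectation is that the failures line up with test #1 on each worker, regardless of which test that happens to be after shuffling.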
For now, I haven't tried adding the random-order plugin on the other OSes to see if there's an impact.

I also didn't explore whether the differences between our Windows CI builds and installing from the OSGeo4W grass-dev packages play a role, as the initial goal was to make our CI work. (I'm waiting for #4121 to cross the finish line before sending in my work, which independently did the same thing and changed the CI to use the same "official" script that is used for releases. The work is done, it can use the ccache setup already scripted, and it uploads the compiled package as an artifact in case we want to test it locally or split the test execution into a separate job.)
To reproduce
1. Change the Windows CI workflow to install pytest-xdist and pytest-randomly with pip in the OSGeo4W shell.
2. Similar to our other workflows, run pytest with multiple workers and deselect the tests that have the `needs_solo_run` marker, using `--numprocesses auto` and `-m "not needs_solo_run"` (note the double quotes for cmd.exe).
3. See failures.
4. Try without the pytest-randomly plugin to see the c_flag (copyright) test fail, the original problem.
Expected behavior
Screenshots
Original run:
Three reruns with the random plugin installed; the third one is the most interesting as it has 16 failures instead of one or two:
System description
Pytest version and plugins shown in the screenshots and action logs
Additional context

Of course it's pytest-related, but I'm not sure how to find out what you're asking for. Do you have an idea of how to trigger something similar, and see that the cleanup isn't done?