V13 backports #9862

Merged
merged 48 commits into from
Jan 28, 2025

Conversation

xjules
Contributor

@xjules xjules commented Jan 24, 2025

Issue
Backports to version-13

Approach
Short description of the approach

(Screenshot of new behavior in GUI if applicable)

  • PR title captures the intent of the changes, and is fitting for release notes.
  • Added appropriate release note label
  • Commit history is consistent and clean, in line with the contribution guidelines.
  • Make sure unit tests pass locally after every commit (git rebase -i main --exec 'pytest tests/ert/unit_tests tests/everest -n auto --hypothesis-profile=fast -m "not integration_test"')

When applicable

  • When there are user facing changes: Updated documentation
  • New behavior or changes to existing untested code: Ensured that unit tests are added (See Ground Rules).
  • Large PR: Prepare changes in small commits for more convenient review
  • Bug fix: Add regression test for the bug
  • Bug fix: Create Backport PR to latest release

eivindjahren and others added 30 commits January 24, 2025 09:22
Checking whether the parent process is PID 1 does not work on,
e.g., Ubuntu, where the adopting process is systemd --user, which
is unlikely to have PID 1.
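
One way to avoid relying on PID 1 is to remember the parent PID at start-up and compare against it later. The following is a minimal sketch of that idea, not the change made in this commit; `make_orphan_check` and its parameters are hypothetical names introduced for illustration.

```python
import os


def make_orphan_check(original_ppid=None):
    """Return a function that reports whether the parent has changed since start-up.

    Hypothetical helper: the explicit PID arguments exist only so the logic
    can be exercised without actually killing a parent process.
    """
    if original_ppid is None:
        original_ppid = os.getppid()

    def is_orphaned(current_ppid=None):
        if current_ppid is None:
            current_ppid = os.getppid()
        # Comparing against 1 would miss adoption by a process such as
        # `systemd --user`; comparing against the remembered parent does not.
        return current_ppid != original_ppid

    return is_orphaned
```

With this approach, adoption by any process, whether PID 1 or a per-user service manager, is detected the same way.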
This avoids always spending 120 billable cpu-seconds on every test.
We now trust the queue system when it claims a job is finished,
instead of waiting it out.
This commit fixes the issue where the rerun button was enabled for ES-MDA and ensemble smoother. The issue was due to us setting support_restart before calling the super class's constructor which overwrote it to False.
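
The ordering pitfall can be sketched as follows. The class names and the base-class default of `True` are assumptions made for illustration; the commit message only states that the base constructor overwrote the value the subclass had set beforehand.

```python
class BaseRunModel:
    def __init__(self):
        # Assumed default: the base constructor unconditionally sets the flag.
        self.support_restart = True


class BuggyModel(BaseRunModel):
    def __init__(self):
        self.support_restart = False  # set too early ...
        super().__init__()            # ... and overwritten here


class FixedModel(BaseRunModel):
    def __init__(self):
        super().__init__()
        self.support_restart = False  # set after the base constructor has run
```

In this sketch `BuggyModel().support_restart` ends up as the base default, while `FixedModel` keeps the value it intended.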
In a very special case the zmq server might fail during initialization, and all occurrences of
server_started.wait() would then wait indefinitely. It is therefore replaced
with an asyncio.Future, which can additionally propagate an exception to the waiters.
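
The difference from an event is that a future can carry a failure to whoever awaits it. A minimal sketch under stated assumptions: `start_server` and `run` are hypothetical stand-ins, and the `OSError` simulates a start-up failure rather than reproducing the actual zmq code.

```python
import asyncio


async def start_server(started: asyncio.Future, fail: bool) -> None:
    """Hypothetical start-up routine reporting success or failure via a Future."""
    try:
        if fail:
            raise OSError("address already in use")  # simulated failure
        started.set_result(None)
    except OSError as err:
        # An Event that is never set would leave waiters hanging forever;
        # a Future delivers the exception to them instead.
        started.set_exception(err)


async def run(fail: bool) -> str:
    started = asyncio.get_running_loop().create_future()
    task = asyncio.create_task(start_server(started, fail))
    try:
        await started  # returns on success, raises on start-up failure
        return "started"
    except OSError as err:
        return f"failed: {err}"
    finally:
        await task
```

Calling `asyncio.run(run(True))` returns instead of blocking, because the awaiting side receives the exception set on the future.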
This serves as documentation and verification of the
current interplay between ert, the plugin configuration of
ert and the flowrun wrapper, and the flow binary itself.
Also use multiple workers since many tests do a lot
of sleeping.
This commit makes enkf_main's _value_export_json keep types when dumping the json. Before this commit, all values (also numerical) were turned to strings if used alongside categorical values.
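
The effect on the exported JSON can be illustrated with the standard library alone. This is only a sketch of the before and after behaviour as described; `export_values` and the parameter names are hypothetical and do not mirror `_value_export_json` itself.

```python
import json


def export_values(values: dict, keep_types: bool = True) -> str:
    """Dump nested parameter values to JSON (hypothetical helper)."""
    if not keep_types:
        # Behaviour described as the old one: every value becomes a string.
        values = {
            group: {name: str(value) for name, value in params.items()}
            for group, params in values.items()
        }
    return json.dumps(values)


# Numerical and categorical values side by side (made-up example data).
example = {"COEFFS": {"a": 1.5, "b": 3, "kind": "linear"}}
```

With `keep_types=True`, reading the dump back yields a float, an int and a string rather than three strings.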
The heartbeat task sends HEARTBEAT to all clients (i.e. Monitor) at client.HEARTBEAT_TIMEOUT intervals.
Clients do not reply; they just process the message. If a client detects a longer delay between two heartbeats,
it additionally sends CONNECT to the evaluator, i.e. the connection is re-established after a break.
This simulates re-connection. Each CONNECT_MSG then triggers sending a FullSnapshot from the ensemble evaluator.
Initially HEARTBEAT_TIMEOUT is set to 5 seconds, while Monitor accepts a delay of at most 10 seconds.
Additionally, the initial connection now undergoes the same number of retries as standard messages.
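
The client-side decision described above can be sketched as a small state holder. `HeartbeatWatcher` and `MAX_HEARTBEAT_DELAY` are hypothetical names; only the 5 and 10 second values come from the commit message, and time is passed in explicitly so the logic can be checked without sleeping.

```python
HEARTBEAT_TIMEOUT = 5.0     # seconds between heartbeats sent by the evaluator
MAX_HEARTBEAT_DELAY = 10.0  # longest gap the client accepts (hypothetical name)


class HeartbeatWatcher:
    """Tracks heartbeat arrival times on the client side (illustrative only)."""

    def __init__(self, now: float) -> None:
        self._last = now

    def on_heartbeat(self, now: float) -> bool:
        """Record a heartbeat; return True if the client should send CONNECT."""
        gap = now - self._last
        self._last = now
        # The client never replies to a heartbeat; it only reconnects
        # when the gap since the previous one was too long.
        return gap > MAX_HEARTBEAT_DELAY
```

A `True` return would correspond to sending CONNECT, which in turn makes the evaluator send a full snapshot.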
The batching interval of 2 seconds is a legacy from the time when Ert was
a mixture of Python and C, with a lot of threading issues attached. The
underlying message structure and message processing infrastructure now
handle a lot more messages, and the GUI can thus appear more responsive
to the incoming messages from compute nodes.
This test depends on a separate implementation of a flowrun executable,
and an example of this executable just changed its implementation to always
overwrite the environment variable OMP_NUM_THREADS in order to successfully
control the behaviour of flow.
Due to integration-test not being a test type any more
Dramatic speedup in test time (~10-fold)

codspeed-hq bot commented Jan 24, 2025

CodSpeed Performance Report

Merging #9862 will not alter performance

Comparing xjules:v13_backports (ad04429) with main (6a0336a)

Summary

✅ 24 untouched benchmarks
⁉️ 1 dropped benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark BASE HEAD Change
⁉️ test_plotter_on_all_snake_oil_responses_time[0] 11 s N/A N/A

@xjules xjules self-assigned this Jan 28, 2025
eivindjahren and others added 16 commits January 28, 2025 15:22
The dictionary introduced in the parameter configuration of the design matrix config seems to be unnecessary
and is therefore removed. The type thus becomes GenKwConfig directly.
A simple test that provides either a wrong or a good server key.
The summation of cpu_seconds for a process and all its descendants can
never work properly during teardown of a process tree, as the root
process typically outlives its children. Thus, the maximum observed
cpu_seconds for a process tree is always the best estimate of the
correct sum.
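
The estimate described above amounts to keeping a running maximum over successive samples of the process tree. A minimal sketch, assuming each sample is a list of per-process cpu_seconds; `CpuSecondsTracker` is a hypothetical name, not the class used in ert.

```python
class CpuSecondsTracker:
    """Keeps the largest cpu_seconds total seen for a process tree (illustrative)."""

    def __init__(self) -> None:
        self.max_seen = 0.0

    def update(self, per_process_cpu_seconds: list[float]) -> float:
        # During teardown children disappear from the sample, so the current
        # sum can drop; the maximum seen so far is the better estimate.
        total = sum(per_process_cpu_seconds)
        self.max_seen = max(self.max_seen, total)
        return self.max_seen
```

Once the children have exited and only the root process is sampled, `update` still reports the earlier, larger total.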
This also makes sure that there are no lingering events when closing zmq
server socket.
There is an issue where dark storage would try to use the same port as
ensemble evaluator. Switching the port range for dark storage should
fix this problem until we can fix the bug.
Additionally, _server_started can be removed from the tasks. The commit also
adds a test for zmq start-up failure when the port is already in use.
@xjules xjules merged commit 21bfc9a into equinor:version-13.0 Jan 28, 2025
26 of 27 checks passed
9 participants