Improve handling of invalid experiments in GUI #9589

jonathan-eq · 2024-12-18T13:40:03Z

Issue
Resolves #9493

Approach
The commit in this PR:

Makes storage re-validate when changing between ert modes (manage experiment, run experiment, plot experiment), and emit storage_changed if validity has changed.
Disables ensembles with invalid experiments in ensemble_selectors (the error is shown as a tooltip on hover).
Filters out invalid experiments from dark storage, so the plotter won't attempt to plot them.
Reloads storage and re-validates at the end of an experiment, so ert won't crash if responses.json is deleted mid-run.

(Screenshot of new behavior in GUI if applicable)

(This is when the responses.json file is deleted mid run. Prior to this PR, ert would crash with ValueError: responses.json does not exist)

(This is when responses.json is deleted, and we open the plotter. The ensemble exists, but is not offered as an option to plot because its experiment is invalid)

(This is when responses.json is deleted and we open manage-experiment. The experiment is shown, but cannot be selected. It is greyed out, and shows the error as a tooltip when hovered)

PR title captures the intent of the changes, and is fitting for release notes.
Added appropriate release note label
Commit history is consistent and clean, in line with the contribution guidelines.
Make sure unit tests pass locally after every commit (git rebase -i main --exec 'pytest tests/ert/unit_tests -n logical -m "not integration_test"')

When applicable

When there are user facing changes: Updated documentation
New behavior or changes to existing untested code: Ensured that unit tests are added (See Ground Rules).
Large PR: Prepare changes in small commits for more convenient review
Bug fix: Add regression test for the bug
Bug fix: Create Backport PR to latest release

codspeed-hq · 2024-12-18T14:06:15Z

CodSpeed Performance Report

Merging #9589 will not alter performance

Comparing jonathan-eq:fix-missing_responses_error (8df6b4e) with main (1199e58)

Summary

✅ 25 untouched benchmarks

codecov-commenter · 2024-12-18T15:02:45Z

Codecov Report

Attention: Patch coverage is 80.00000% with 15 lines in your changes missing coverage. Please review.

Project coverage is 91.83%. Comparing base (01156bb) to head (869f845).
Report is 20 commits behind head on main.

Files with missing lines	Patch %	Lines
.../ert/gui/tools/manage_experiments/storage_model.py	77.77%	6 Missing ⚠️
src/ert/gui/ertwidgets/ensembleselector.py	66.66%	4 Missing ⚠️
src/ert/storage/local_experiment.py	82.35%	3 Missing ⚠️
src/ert/storage/local_ensemble.py	50.00%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #9589      +/-   ##
==========================================
- Coverage   91.85%   91.83%   -0.03%     
==========================================
  Files         433      433              
  Lines       26768    26879     +111     
==========================================
+ Hits        24587    24683      +96     
- Misses       2181     2196      +15

Flag	Coverage Δ
cli-tests	`39.69% <22.66%> (-0.07%)`	⬇️
everest-models-test	`34.55% <18.66%> (-0.04%)`	⬇️
gui-tests	`72.13% <77.33%> (+0.02%)`	⬆️
integration-test	`38.71% <22.66%> (+1.51%)`	⬆️
performance-tests	`51.88% <36.00%> (-0.06%)`	⬇️
test	`40.02% <22.66%> (-0.43%)`	⬇️
unit-tests	`74.06% <48.00%> (-0.11%)`	⬇️

Flags with carried forward coverage won't be shown.

src/ert/gui/tools/manage_experiments/storage_model.py

src/ert/gui/ertnotifier.py

andreas-el · 2025-01-08T11:03:54Z

src/ert/gui/ertwidgets/ensembleselector.py

+                and not ensemble.experiment.is_valid()
+            ):
+                index = self.count() - 1
+                model_item = model.item(index)


Does this work? Not itemData(index) ?
https://doc.qt.io/qtforpython-5/PySide2/QtCore/QAbstractItemModel.html#PySide2.QtCore.PySide2.QtCore.QAbstractItemModel.itemData
https://doc.qt.io/qt-6/qabstractitemmodel.html#itemData

Yes, it works well. Should I use itemData instead?

Not sure. I looked into this due to style/linting errors. If the error was resolved, this is probably fine.

src/ert/gui/tools/manage_experiments/storage_model.py

src/ert/gui/tools/manage_experiments/storage_widget.py

jonathan-eq · 2025-01-14T12:51:46Z

src/ert/gui/main_window.py

@@ -154,6 +154,7 @@ def right_clicked(self) -> None:
    def select_central_widget(self) -> None:
        actor = self.sender()
        if actor:
+            self.notifier.refresh()


This refreshes storage to make sure the experiment files (responses, index, metadata, and parameters) are still valid.

We could consider just re-running validation, and refresh if that fails.

I updated it so that it will only actually refresh when the validity has changed, when toggling between the panels. This also means that the storage won't be updated if new experiments are added and we toggle between the panels, but that is the same behavior as on main.

jonathan-eq · 2025-01-14T13:36:02Z

There is a bug where it crashes if you are already in the manage experiment window, tab out to delete the responses.json file, and then put focus back on the ert window. I am trying to have ert refresh storage when gaining focus, but reimplementing focusInEvent on the main window does not work (it is never called). Is there a way I can make this work so that I can force refresh/revalidate storage before the child widgets start selecting invalid experiments @andreas-el?

jonathan-eq · 2025-01-21T09:02:44Z

This PR contains the initial work towards making this part of the storage more robust. There is still an issue where ert crashes if you have selected an experiment in the manage-experiment panel, tab out and delete responses.json, and tab back in. It seems like the SelectionModel for the QListView tries reselecting the experiment (which is now invalid) before it has been revalidated.

xjules · 2025-01-21T15:01:00Z

src/ert/gui/tools/manage_experiments/storage_info_widget.py

@@ -94,6 +94,8 @@ def __init__(self) -> None:

    @Slot(Experiment)
    def setExperiment(self, experiment: Experiment) -> None:
+        if not experiment.is_valid():


Can this even happen?

Yes, when the selected experiment in manage-experiment becomes invalid while still selected. Apparently, when you tab out of ert while an experiment is selected and tab in again, the same experiment is re-selected, and this signal is emitted. Very fun to debug 🔮

xjules · 2025-01-22T08:37:10Z

src/ert/gui/tools/manage_experiments/storage_model.py

@@ -128,7 +129,9 @@ def data(
                qapp = QApplication.instance()
                assert isinstance(qapp, QApplication)
                return qapp.palette().mid()
-
+        elif role == Qt.ItemDataRole.ToolTipRole:
+            if self._error:


Is the same error shown twice?

I am removing the one not on experimentmodel.

xjules · 2025-01-22T08:40:46Z

src/ert/gui/tools/manage_experiments/storage_model.py

+    @override
+    def hasChildren(self, parent: QModelIndex | None = None) -> bool:
+        if parent is None or not parent.isValid():
+            return True


Shouldn't this be False? At least it sounds like it, but it might be my lack of understanding.

A person/item who has no parents, or no valid parents, is guaranteed to have children?
I'm so confused here.

It does not work without that line 😿
Is it that this is the topmost model (will never have a parent), and will always have children; but the number of children is dependent on the number of experiments?

What if you have no experiments? Is this still correct? I think you need to provide some comment or explanation here, since we don't understand why this is so.

I think hasChildren is there only to check if the tree can be expanded. In our case, the root node (StorageModel) will always be expanded, but the number of rows (ExperimentModels) will determine how much it will be expanded. I can change it to return len(self._children) > 0, but hardcoding it to true would probably be fine too.
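
For readers following this thread: in Qt's model/view convention, an invalid parent index refers to the invisible root of the model, so hasChildren on it asks whether the root has any rows. Below is a minimal Qt-free sketch of the `len(self._children) > 0` option mentioned above. `StorageModelSketch` and `ExperimentNode` are hypothetical stand-ins (not the classes in this PR), and `parent=None` plays the role of an invalid QModelIndex.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ExperimentNode:
    """Hypothetical stand-in for a child row (an experiment)."""

    name: str
    children: list[ExperimentNode] = field(default_factory=list)


class StorageModelSketch:
    """Hypothetical root model; parent=None stands for an invalid index."""

    def __init__(self, experiments: list[ExperimentNode]) -> None:
        self._children = experiments

    def row_count(self, parent: ExperimentNode | None = None) -> int:
        # No parent means "the root", so count the top-level experiments.
        if parent is None:
            return len(self._children)
        return len(parent.children)

    def has_children(self, parent: ExperimentNode | None = None) -> bool:
        # Derive the answer from the row count instead of hardcoding True,
        # so an empty storage reports no children at the root.
        return self.row_count(parent) > 0
```

Under this reading, a storage with no experiments answers False at the root, which addresses the "what if you have no experiments" question without changing behavior when experiments exist.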

andreas-el · 2025-01-22T08:51:06Z

I will argue that you should alter the title of this PR. The PR does not specifically target the missing file, but rather handling invalid experiments.

jonathan-eq · 2025-01-22T09:37:42Z

tests/ert/ui_tests/gui/test_main_window.py

+
+    def check_plot_tool(expected_number_of_cases: int) -> None:
+        find_and_click_button("button_Create_plot")
+        # Due to the fact that we create new instances of PlotWindow on tab change, QtBot is defaulting to the first child


I have to test the plotter as the missing responses.json bug also occurred there

jonathan-eq · 2025-01-22T09:43:15Z

I will argue that you should alter the title of this PR. The PR does not specifically target the missing file, but rather handling invalid experiments.

Yes, I was hit by scope creep...

andreas-el · 2025-01-22T09:46:03Z

src/ert/storage/local_experiment.py

+    def _validate_files(self) -> None:
+        self.valid_parameters = (self._path / self._parameter_file).exists()
+        self.valid_responses = (self._path / self._responses_file).exists()
+        self.valid_metadata = (self._path / self._metadata_file).exists()
+
+    def is_valid(self) -> bool:
+        return self.valid_parameters and self.valid_responses and self.valid_metadata


Does it make sense to combine these, and check for existence of files in is_valid ?

Then it becomes more difficult to see if the validity of a specific experiment has changed. I think that is why I chose to have it in two separate steps.
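
The reasoning for keeping the two steps separate can be sketched as follows. This is an illustration under assumptions, not the code in this PR: `ExperimentSketch`, `REQUIRED`, and `validity_changed` are hypothetical names, and the file names are taken from the commit message in this thread.

```python
from __future__ import annotations

from pathlib import Path


class ExperimentSketch:
    """Hypothetical stand-in for a stored experiment."""

    # Assumed file names, for illustration only.
    REQUIRED = ("parameters.json", "responses.json", "metadata.json")

    def __init__(self, path: Path) -> None:
        self._path = path
        self._valid: dict[str, bool] = {}
        self.validate_files()

    def validate_files(self) -> None:
        # The only place that touches the filesystem; the result is cached.
        self._valid = {
            name: (self._path / name).exists() for name in self.REQUIRED
        }

    def is_valid(self) -> bool:
        # Reads the cached flags only, so it reflects the last validation.
        return all(self._valid.values())


def validity_changed(experiment: ExperimentSketch) -> bool:
    """Re-validate and report whether the overall validity flipped."""
    before = experiment.is_valid()
    experiment.validate_files()
    return before != experiment.is_valid()
```

Because `is_valid` reports the cached state, the caller can compare it before and after a re-validation. If `is_valid` checked the files directly, there would be no "before" value to compare against.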

andreas-el · 2025-01-22T13:20:05Z

src/ert/gui/ertnotifier.py

+        self._storage.refresh()
+        self.storage_changed.emit(self._storage)


Should this emit regardless of outcome?

We only call this method when we know storage will change on refresh: either the validity of an experiment has changed, or an experiment has finished/started running.

But if you call the refresh function, this will emit regardless. I see that your statement is true when using revalidate_storage, due to the check there that looks at state changes.

Yes, but we only call it when we already know something has changed, this check is done upstream in both places where it's used 🤔
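
The distinction discussed in this thread, an unconditional refresh versus a guarded revalidate_storage, can be sketched without Qt as below. `StorageSketch` and `NotifierSketch` are hypothetical; a plain list of callbacks stands in for the storage_changed signal, and the behavior shown is an assumption based on the comments above rather than the actual implementation.

```python
from __future__ import annotations

from typing import Callable


class StorageSketch:
    """Hypothetical storage whose validity can be re-checked."""

    def __init__(self, probe: Callable[[], bool]) -> None:
        self._probe = probe  # stands in for checking files on disk
        self.valid = probe()

    def revalidate(self) -> None:
        self.valid = self._probe()

    def refresh(self) -> None:
        # A real reload would re-read storage; here it only re-validates.
        self.revalidate()


class NotifierSketch:
    def __init__(self, storage: StorageSketch) -> None:
        self._storage = storage
        self.listeners: list[Callable[[StorageSketch], None]] = []

    def _emit(self) -> None:
        for listener in self.listeners:
            listener(self._storage)

    def refresh(self) -> None:
        # Unconditional: always reloads and always notifies listeners.
        self._storage.refresh()
        self._emit()

    def revalidate_storage(self) -> bool:
        # Guarded: refresh (and thereby notify) only if validity flipped.
        before = self._storage.valid
        self._storage.revalidate()
        if self._storage.valid == before:
            return False
        self.refresh()
        return True
```

In this sketch the reviewer's point still holds: any direct call to `refresh` notifies listeners regardless of outcome, so correctness relies on callers checking for a change first.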

andreas-el · 2025-01-22T13:24:13Z

src/ert/gui/ertwidgets/ensembleselector.py

        show_only_undefined: bool = False,
        show_only_no_children: bool = False,
+        show_only_with_valid_experiment: bool = False,


Are these flags mutually exclusive? I.e. only one is ever set to true?
This could have been converted to an enum if so.

No, they are two completely separate filters

I think we should spend a couple of minutes looking at this to see if we can alter it. I think this filtering has grown beyond the reasonable scope of multiple booleans.

I think we need all those filters, because this component is re-used in a lot of different contexts.
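
Since the filters are independent rather than mutually exclusive, a plain enum would not fit, but a combinable `enum.Flag` is one possible way to replace several booleans. The sketch below is only an illustration of that idea. `EnsembleFilter`, `EnsembleSketch`, and `passes` are hypothetical, and the filter semantics are assumed from the parameter names, not taken from the PR.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Flag, auto


class EnsembleFilter(Flag):
    """Hypothetical combinable filters, one bit per former boolean."""

    NONE = 0
    UNDEFINED = auto()
    NO_CHILDREN = auto()
    VALID_EXPERIMENT = auto()


@dataclass
class EnsembleSketch:
    """Hypothetical stand-in for an ensemble and the state being filtered on."""

    name: str
    undefined: bool
    has_children: bool
    experiment_valid: bool


def passes(ensemble: EnsembleSketch, filters: EnsembleFilter) -> bool:
    # Each active filter is checked independently; all must be satisfied.
    if EnsembleFilter.UNDEFINED in filters and not ensemble.undefined:
        return False
    if EnsembleFilter.NO_CHILDREN in filters and ensemble.has_children:
        return False
    if (
        EnsembleFilter.VALID_EXPERIMENT in filters
        and not ensemble.experiment_valid
    ):
        return False
    return True
```

A caller would combine filters with `|`, for example `EnsembleFilter.UNDEFINED | EnsembleFilter.VALID_EXPERIMENT`, which keeps a single parameter while preserving the independence of the filters.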

andreas-el · 2025-01-22T13:28:31Z

src/ert/gui/tools/manage_experiments/manage_experiments_panel.py

+            show_only_undefined=True,
+            show_only_with_valid_experiment=True,


Show only valid and undefined ?

They are separate, so I think they should have their own flags.

show_only_undefined is for the ensembles in the experiment; whilst the other flag is for the experiment itself.

xjules · 2025-01-24T11:45:27Z

Not sure if it is related but this one hangs: https://github.com/equinor/ert/actions/runs/12924655010/job/36044056095?pr=9589

andreas-el · 2025-01-28T13:45:35Z

src/ert/gui/ertwidgets/ensembleselector.py

            self.addItem(
                f"{ensemble.experiment.name} : {ensemble.name}", userData=ensemble
            )
+            if (
+                self._show_only_with_valid_experiment


Do you really need to check for _show_only_with_valid_experiment here?
Does it not make sense to just do this for all experiments that are not valid?

This commit improves the handling of invalid experiments in the GUI in the case of a missing responses.json file. The handling of the missing file should in the future be extended to also handle the other experiment files in a similar manner (index.json, metadata.json, and parameters.json).

This commit:
* Makes storage reload and re-validate when changing between ert modes (manage experiment, run experiment, and plot tool)
* Disables ensembles with invalid experiments in ensemble_selectors (error is shown as tooltip on hover)
* Filters out invalid experiments from dark storage, so the plotter won't attempt to plot them.
* Reloads and re-validates storage on end of experiment, so ert won't crash if responses.json is deleted mid-run.

jonathan-eq force-pushed the fix-missing_responses_error branch from d644aee to 2fe1d74 Compare December 18, 2024 14:30

andreas-el reviewed Jan 8, 2025

andreas-el reviewed Jan 8, 2025

andreas-el reviewed Jan 8, 2025

andreas-el reviewed Jan 8, 2025

jonathan-eq commented Jan 13, 2025

jonathan-eq force-pushed the fix-missing_responses_error branch 2 times, most recently from abb616c to d6e1bdd Compare January 14, 2025 08:55

jonathan-eq commented Jan 14, 2025

jonathan-eq force-pushed the fix-missing_responses_error branch 3 times, most recently from 61e01c7 to 9203cd3 Compare January 21, 2025 08:55

jonathan-eq marked this pull request as ready for review January 21, 2025 08:56

jonathan-eq requested a review from andreas-el January 21, 2025 09:14

jonathan-eq added the release-notes:bug-fix and release-notes:improvement labels and removed the release-notes:bug-fix label Jan 21, 2025

xjules reviewed Jan 21, 2025

xjules reviewed Jan 22, 2025

jonathan-eq commented Jan 22, 2025

andreas-el reviewed Jan 22, 2025

jonathan-eq force-pushed the fix-missing_responses_error branch from 9203cd3 to 53e9bac Compare January 22, 2025 14:00

jonathan-eq changed the title from "Improve handling of missing responses.json" to "Improve handling of invalid experiments in GUI" Jan 23, 2025

jonathan-eq force-pushed the fix-missing_responses_error branch from 53e9bac to 033426e Compare January 23, 2025 07:54

jonathan-eq force-pushed the fix-missing_responses_error branch 2 times, most recently from 32addbb to 4cbc09d Compare January 27, 2025 11:46

jonathan-eq requested a review from andreas-el January 27, 2025 12:47

andreas-el reviewed Jan 28, 2025

jonathan-eq force-pushed the fix-missing_responses_error branch from 92032e8 to 8df6b4e Compare January 28, 2025 14:14
