Add ability to resume execution of DQD checks #411

ganisimov · 2022-12-19T11:23:26Z

It happens that due to connection issues or other conditions executeDqChecks fails and progress is lost.

When new resume argument is set to TRUE, file in outputFile path is used as source of check results instead of actual processing. Missing check results or results reporting error are re-processed.

Note that outputFile must be set explicitely to make resuming to take effect. New results overwrite file in outputFile path.

Addresses #109

It happens that due to connection issues or other conditions executeDqChecks fails and progress is lost. When new resume argument is set to TRUE, file in outputFile path is used as source of check results instead of actual processing. Missing check results or results reporting error are re-processed. Note that outputFile must be set explicitely to make resuming to take effect. New results overwrite file in outputFile path. Addresses OHDSI#109

MaximMoinat

This is a great addition, thanks. See my comment in-line.

Two general remarks:

To be consistent with other OHDSI packages, I propose to either call this incremental mode like in CohortDiagnostics or add a new function runMissingChecks like in Achilles. @katy-sadowski Maybe something to discuss in an upcoming WG meet?
Does this also work if a fatal error occurs? (e.g. connection lost, or R session timing out). I think the json results object only gets written at the end, so we might need to write out intermediate results to make this work for all cases where a DQD execution might be interrupted.

MaximMoinat · 2023-12-28T12:23:51Z

R/runCheck.R

+            conceptId = check["conceptId"],
+            unitConceptId = check["unitConceptId"]
+          )
+          checkResultCandidates <- checkResultsSaved %>% dplyr::filter(checkId == currentCheckId & is.na(ERROR))


This assumes the ERROR field is present in the results object, which is not the case when none of the checks resulted in an error.

katy-sadowski · 2023-12-30T02:15:06Z

I 100% want to add this feature, thanks @ganisimov for the contribution and @MaximMoinat for reviving the thread 😃

I've been thinking, however, that we might want to wait to add it in as part of the larger check-running workflow overhaul we've been discussing implementing in 2024. Part of the vision for that workflow includes moving away from the current "all-or-nothing" execution towards something more incremental that's aware of the relationships/dependencies between checks. I think that implementation will lend itself better to figuring out how/where to store incremental results, how to handle various failure modes, etc.

Would definitely love to discuss this in a WG meeting early next year!

MaximMoinat requested changes Dec 28, 2023

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add ability to resume execution of DQD checks #411

Add ability to resume execution of DQD checks #411

ganisimov commented Dec 19, 2022 •

edited

Loading

MaximMoinat left a comment •

edited

Loading

MaximMoinat Dec 28, 2023

katy-sadowski commented Dec 30, 2023

Add ability to resume execution of DQD checks #411

Are you sure you want to change the base?

Add ability to resume execution of DQD checks #411

Conversation

ganisimov commented Dec 19, 2022 • edited Loading

MaximMoinat left a comment • edited Loading

Choose a reason for hiding this comment

MaximMoinat Dec 28, 2023

Choose a reason for hiding this comment

katy-sadowski commented Dec 30, 2023

ganisimov commented Dec 19, 2022 •

edited

Loading

MaximMoinat left a comment •

edited

Loading