How to properly handle (multiple) restarts? #1587
Replies: 2 comments 7 replies
-
Hi @tomdemeyere. Thanks for your question. This is definitely something I would like to have in quacc directly, but I have not had time to sort out a mechanism for it. You're right that various workflow engines (e.g. Covalent, Prefect) have options for restarts, including with parameter updates. Even in those cases, though, as you alluded to, you'll still have file-shuttling issues if you want to do a "continuation" rather than a full restart, which is normally the case. I would be more than happy to consider any ideas you have for a mechanism, ideally one that is generic (i.e. not tied to a specific recipe).
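To make "generic" a bit more concrete, here is a rough sketch of a recipe-agnostic restart loop with parameter updates. It is only an illustration under assumptions: `run_with_restarts`, `job`, and `update` are hypothetical names, not part of quacc or any workflow engine, and the file shuttling needed for a true continuation is deliberately left out.

```python
def run_with_restarts(job, params, max_restarts=3, update=None):
    """Call job(**params), retrying on TimeoutError up to max_restarts times.

    `job` is any callable (hypothetical, not a quacc recipe).
    `update(params, attempt)` may return modified parameters for the next try,
    e.g. to switch the calculation into a "restart" mode.
    """
    for attempt in range(max_restarts + 1):
        try:
            return job(**params)
        except TimeoutError:
            if attempt == max_restarts:
                raise  # out of retries, let the caller see the timeout
            if update is not None:
                params = update(params, attempt)
```

A caller could pass something like `update=lambda p, n: {**p, "restart": True}` so that every retry runs as a continuation rather than from scratch.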
-
Alright, whether or not Quacc takes care of it, we will probably need to do it for our project anyway, so we will do that on a separate fork. We will probably also add error handling to read_output and to other parts of the code. If at some point you are interested in one or more specific parts, ping me and I can open a PR against master. Also: do you think the ASE people would be interested in an optional timeout arg for Profile, or should we just create our own custom Profile?
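For context, this is roughly the behaviour we have in mind for the timeout, sketched with the standard library only. It is not ASE's Profile API: `run_with_timeout` is a hypothetical helper, and how such an argument would be wired into a Profile is an open question.

```python
import subprocess
import sys


def run_with_timeout(command, timeout=None, cwd=None):
    """Run `command` (a list of strings) in `cwd`.

    Returns True if the process finished within `timeout` seconds and
    False if it was killed for exceeding it. A non-zero exit code still
    raises subprocess.CalledProcessError because of check=True.
    """
    try:
        subprocess.run(command, cwd=cwd, timeout=timeout, check=True)
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the child process at this point
        return False
    return True


if __name__ == "__main__":
    # Stand-in for a long calculation: sleeps longer than the allowed time
    finished = run_with_timeout(
        [sys.executable, "-c", "import time; time.sleep(5)"], timeout=1
    )
    print("finished" if finished else "timed out, files left in place for a restart")
```

Returning a flag instead of raising lets the caller decide whether to restart from the files already written in the working directory.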
-
Jobs very often have to be restarted because of timeout mechanisms in HPC environments. This causes two issues when using Quacc:
Currently this is blocking us from moving forward on a project where we decided to switch to Quacc/Parsl to perform the calculations (I preferred to open something here before asking on parsl-help). Is this something Quacc should do, or is it solely the responsibility of the workflow engines?