Force nodes to down and power_save when stopping the cluster
If a cluster is stopped while a node is powering up (alloc#-idle#), the node is kept in the powering-up state when the cluster is started again. This makes the node unavailable for the entire ResumeTimeout, which is 60 minutes. Slurm ignores the transition to power_down unless the node is put into the down state first.

From @demartinofra

## Manual test
* Created a cluster and submitted a job on it
* While the node was powering up, stopped the cluster and verified that the node was correctly marked as power down
* Restarted the cluster and verified that the node was back in the power saving state (after about 2 minutes)
* The job ran correctly on the new node.

Signed-off-by: Enrico Usai <[email protected]>
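
The ordering described in the commit message (mark the node down first, then request power-down) can be illustrated with a minimal sketch. This is not the code changed by this commit: the function name `build_stop_commands` and the default reason string are hypothetical, and the sketch only builds `scontrol update` argument lists rather than running them.

```python
from typing import List


def build_stop_commands(node_names: List[str], reason: str = "cluster stopped") -> List[List[str]]:
    """Build scontrol invocations that force nodes to down, then to power_down.

    Hypothetical helper for illustration only. The order matters: per the
    commit message, Slurm ignores the power_down transition for a node that
    is still powering up unless the node is set to down first.
    """
    if not node_names:
        return []
    nodelist = ",".join(node_names)
    return [
        # Step 1: force the nodes to the down state (a reason is supplied for down).
        ["scontrol", "update", f"nodename={nodelist}", "state=down", f"reason={reason}"],
        # Step 2: only now ask Slurm to power the nodes down.
        ["scontrol", "update", f"nodename={nodelist}", "state=power_down"],
    ]


if __name__ == "__main__":
    for cmd in build_stop_commands(["queue1-dy-c5xlarge-1", "queue1-dy-c5xlarge-2"]):
        print(" ".join(cmd))
```

On a host where `scontrol` is available, each returned list could be passed to `subprocess.run(cmd, check=True)`; the node names in the usage lines are placeholders.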
1 parent 3489e16, commit fe0c2a0
Showing 3 changed files with 15 additions and 7 deletions.