For the preempt queue, maximum per user is 200.
For normal/serial there's a maximum of 2000 total, 400 per user.
I can keep track of how many jobs brlife has submitted, but not the total across all users that counts toward the absolute max (2000).
I believe the abcd hook installed on each resource should do this check, but amaretti currently doesn't retry starting a job when the start hook fails (the job is set to failed). Maybe I should reconsider this and make it keep retrying? If we do, we won't need any queue size checking; it will just keep retrying qsub until it succeeds.
I need to think through the side effects of retrying the start hook, however. I feel it could create more problems than it solves.
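One way to limit the side effects of retrying is to bound the number of attempts and back off between them, so a full queue doesn't get hammered with qsub calls. The following is only a rough sketch of that idea in Python, not amaretti's actual code: `start_with_retry`, `start_hook`, and all the default values are hypothetical names and numbers chosen for illustration.

```python
import time


def start_with_retry(start_hook, max_attempts=5, base_delay=60.0,
                     max_delay=1800.0, sleep=time.sleep):
    """Call start_hook() until it succeeds or attempts run out.

    start_hook is a hypothetical callable that returns True when the
    job was accepted (e.g. qsub succeeded) and False when the start
    hook failed (e.g. the queue was full).
    Returns the number of attempts used, or None if every attempt failed.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        if start_hook():
            return attempt
        if attempt < max_attempts:
            # Wait before the next attempt, doubling the delay up to a cap.
            sleep(delay)
            delay = min(delay * 2, max_delay)
    # Give up; the caller can mark the job as failed at this point.
    return None
```

With a cap on attempts, a job still ends up failed if the queue stays full for too long, but a temporary "queue full" error no longer fails the job immediately.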
PBS has a ridiculously small job queue.
From Jeff Gronek