Make an option to run task groups sequentially instead of in parallel #37

Open
Kipok opened this issue Sep 10, 2024 · 1 comment
Collaborator

Kipok commented Sep 10, 2024

It would be great to add a "sequential" option to task groups so that they can run one after another instead of in parallel. The use case is that we sometimes need to run a couple of conversion steps, e.g. nemo -> hf and then hf -> trtllm, and these steps need different containers. We also sometimes need to run inference and then evaluation in different containers, where the evaluation takes only a short time. Currently we run those steps by launching separate tasks, each of which requests its own nodes and thus might wait in the Slurm queue longer. It would be great to have an option to put multiple sequential sruns in the same sbatch script so that we can use different containers but reuse the same node.
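
To make the request concrete, here is a rough sketch of the kind of script such an option might generate: one allocation, with one srun per step, each in its own container. This is only an illustration, not this project's API. The `Step` type, the `render_sequential_sbatch` helper, and the image names are hypothetical, and the `--container-image` flag assumes the cluster has a container plugin for Slurm such as pyxis installed.

```python
import shlex
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """One sequential step (hypothetical type, not part of any existing API)."""
    name: str
    container_image: str
    command: List[str]


def render_sequential_sbatch(job_name: str, nodes: int, steps: List[Step]) -> str:
    """Render an sbatch script that runs the steps one after another in one allocation."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        "",
        "set -e  # abort at the first failing step so later steps are skipped",
        "",
    ]
    for step in steps:
        lines.append(f"# step: {step.name}")
        # Assumption: --container-image is provided by a Slurm container plugin (e.g. pyxis).
        image = shlex.quote(step.container_image)
        lines.append(f"srun --container-image={image} {shlex.join(step.command)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder images and commands, purely for illustration.
    print(render_sequential_sbatch(
        "convert",
        1,
        [
            Step("nemo-to-hf", "example/nemo:latest", ["python", "nemo_to_hf.py"]),
            Step("hf-to-trtllm", "example/trtllm:latest", ["python", "hf_to_trtllm.py"]),
        ],
    ))
```

Because both srun lines live in the same sbatch script, the second step starts on the already allocated node as soon as the first one finishes, instead of going back through the queue.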

@hemildesai hemildesai self-assigned this Nov 5, 2024
@hemildesai
Collaborator

I will pick this up next.
