WIP: Attempts to adjust for Singularity #195
Open
jakirkham wants to merge 10 commits into nanshe-org:master from jakirkham:adj_for_singularity
jakirkham force-pushed the adj_for_singularity branch 4 times, most recently from c66e9f4 to 01510bc (February 17, 2018 00:04)
jakirkham force-pushed the adj_for_singularity branch 5 times, most recently from 8d0c6cf to c47df22 (April 5, 2018 19:29)
A rough, working attempt at getting the workflow to start with Singularity. Needs a bit of cleanup to streamline the process. Leaving it here for now so it is stored and not forgotten.
Customize Dask Distributed workers to write to node-local scratch space. This avoids having to deal with the slow NFS system for writing. Since workers don't need to read each other's scratch space (they will just request the data over the wire) and the data is unneeded after the Dask session ends, this is a nice option to speed things up and avoid data loss due to NFS's slow performance.
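A minimal sketch of the idea, not the code from this PR: it builds a `dask-worker` command line whose `--local-directory` option points at a per-worker directory on node-local disk. The helper name `worker_command`, the scheduler address, and the use of the system temp directory as the scratch root are assumptions made for illustration; the cluster's real scratch path is not given here.

```python
import os
import tempfile


def worker_command(scheduler_address, scratch_root=None):
    """Build a dask-worker command that keeps worker files on node-local disk.

    `scratch_root` is a hypothetical stand-in for the node-local scratch path;
    when it is omitted, the system temp directory is used instead.
    """
    root = scratch_root or tempfile.gettempdir()
    # A unique directory per worker avoids collisions between workers
    # that happen to land on the same node.
    local_dir = tempfile.mkdtemp(prefix="dask-worker-", dir=root)
    return ["dask-worker", scheduler_address, "--local-directory", local_dir]


# Illustrative scheduler address only.
cmd = worker_command("tcp://scheduler-host:8786")
print(" ".join(cmd))
```

Because the directory is created on whichever node runs the helper, it has to be called on the worker node itself (for example from the job script), not on the submitting host.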
Based on some experiments on the cluster, it appears to take anywhere from 9s to 15s to start up a single worker, so set the startup cost to the average value of 12s. Given that workers take a while to start up, bump the interval for rechecking whether to adjust workers to 6s (leaving the number of checks at the default of 3). A job should thus have started up and begun processing some data at or shortly after the 2nd check, which avoids rapid adjustment of workers while keeping the checks frequent enough to notice if things need a slight tweak. This should slow down rapid downscaling and upscaling a bit, hopefully avoiding fluctuations.
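The timing argument above can be checked with a little arithmetic. This sketch uses only the numbers from the commit message; the constant and function names are illustrative and are not tied to any particular Dask API.

```python
import math

STARTUP_COST = 12.0  # seconds; midpoint of the observed 9 s to 15 s range
INTERVAL = 6.0       # seconds between checks on whether to adjust workers
WAIT_COUNT = 3       # number of checks, left at the default


def first_check_after(startup_seconds, interval):
    """Return the 1-based index of the first check at or after startup finishes."""
    return math.ceil(startup_seconds / interval)


for startup in (9.0, STARTUP_COST, 15.0):
    check = first_check_after(startup, INTERVAL)
    print(f"startup {startup:>4.1f}s -> ready by check {check}")

# Total time spanned by all checks before an adjustment is acted on.
print(f"window: {INTERVAL * WAIT_COUNT:.0f}s")
```

With a 6s interval, a worker taking the average 12s is ready exactly at the 2nd check, and even the slowest observed startup (15s) is ready by the 3rd, so the whole startup fits inside the 18s window.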
jakirkham force-pushed the adj_for_singularity branch from 2242252 to b22954a (May 9, 2018 19:06)
Add the missing Singularity bits. The number of cores also needs to be specified differently for dask-drmaa. Add in a few other things, like scratch space, a specified queue, and a request for multiple cores.
Tries to adjust the workflow for use with Singularity on the cluster. However, it needs a fair bit of cleanup before it can be merged into the workflow. Leaving it here for now so it is stored and not forgotten.