Fix memory issue on Compute Canada #43
@joe-from-mtl The problem was absolutely not what I thought it was. It is not a problem with the scripts or multiprocessing: the OUT OF MEMORY error is caused by the nextflow process setup, where the input data is copied to the nextflow working directory. Because the input work directory is very heavy, copying it to the working directory fills up the RAM. This behaviour can be configured in the |
@joe-from-mtl However, it's a little bit weird that it still takes around 70GB from start to finish.
|
Did you have a look at the nextflow report? You can see the resources required by each step of the pipeline.
|
Closes #22.
The bug came from the process initialization in nextflow, which first copies the input files to the working directory. Because the input work directory is very heavy, copying it to the working directory fills up the RAM. This behaviour can be configured in the `nextflow.config` file with `process.stageInMode`. I changed it from mode `copy` to mode `symlink` and everything works perfectly now. Works with nextflow and apptainer both locally and on a cluster.
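For reference, the change described above might look like the fragment below. This is a minimal sketch, not the actual diff from this PR: it assumes the setting is applied globally through the `process` scope, and the rest of the repository's `nextflow.config` is not shown.

```groovy
// nextflow.config (sketch; only the setting discussed above is shown)
process {
    // Stage input files as symbolic links instead of copying them
    // into each task's working directory.
    stageInMode = 'symlink'
}
```

With `symlink`, each task directory holds links to the input files rather than duplicates of them, so a very heavy input directory is not copied during process setup.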
Bonus: Refactored multiprocessing in the scripts: when `n_cpus` is 1, the implementation won't use a multiprocessing pool.
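The pattern behind that refactor can be sketched roughly as follows. This is an illustration only, not the code from this PR: `run_tasks` and `square` are hypothetical names, and only the `n_cpus` parameter comes from the description above.

```python
from multiprocessing import Pool


def square(x):
    """Hypothetical stand-in for the per-item work done by a script."""
    return x * x


def run_tasks(func, items, n_cpus=1):
    """Apply func to every item, creating a pool only when n_cpus > 1."""
    if n_cpus <= 1:
        # Serial path: no worker processes are started.
        return [func(item) for item in items]
    # Parallel path: func must be picklable (defined at module level).
    with Pool(processes=n_cpus) as pool:
        return pool.map(func, items)


if __name__ == "__main__":
    print(run_tasks(square, [1, 2, 3], n_cpus=1))
    print(run_tasks(square, [1, 2, 3], n_cpus=2))
```

Skipping the pool when `n_cpus` is 1 avoids starting worker processes and pickling arguments for a job that runs serially anyway, and it keeps tracebacks easier to read when debugging.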