At the moment, processes request resources (CPUs, memory and time) via the Nextflow `label` system. This works well for processes whose resource requirements are fixed largely regardless of inputs (e.g. `PARSE_INPUTS` will always require minimal resources), but poorly for most others, whose requirements scale with the size of the input files, particularly read files and database files. For this pipeline, this would mostly affect the time and memory requirements of each process, as CPU requirements are probably fixed for most processes.
The benefit of this would be much more efficient use of HPC resources, and therefore potentially faster SLURM execution times.
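For context, label-based allocation is static: every process carrying a given label gets the same request regardless of input. A minimal sketch of what this typically looks like in `nextflow.config` (the label name and values here are illustrative, not this pipeline's actual config):

```nextflow
process {
    withLabel: 'small' {
        cpus   = 1
        memory = 4.GB
        time   = 1.h
    }
}
```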
See here for one particular (if-else) implementation, but multiplying resources by read count, file size, database size etc. is also possible using Groovy code.
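A rough sketch of both styles using Nextflow's dynamic directives, which accept closures that are evaluated per task and can reference the process inputs. The process name, command, thresholds and coefficients are all guesses to be tuned empirically, not measured values:

```nextflow
process READ_FILTER {
    input:
    path reads

    // If-else style: step the memory request up in bands based on
    // the input file size in bytes (thresholds/values are guesses)
    memory {
        if( reads.size() < 1_000_000_000 ) return 4.GB
        if( reads.size() < 5_000_000_000 ) return 8.GB
        return 16.GB
    }

    // Multiplicative style: one hour per (started) gigabyte of input
    time { 1.h * (1 + reads.size().intdiv(1_000_000_000)) }

    script:
    """
    some_filter_command ${reads}
    """
}
```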
Note that different processes might scale differently -- e.g. `READ_FILTER` time might scale linearly with the number of input reads, but a hypothetical all-vs-all alignment step might scale quadratically with the number of input sequences. Finding the right relationship could require some experimentation in some cases, but assuming linearity to start with is probably safe.
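For the quadratic case, something like the following could work. Everything here is hypothetical: `n_seqs` would need to be counted in an upstream process and passed through as a `val` input, and the coefficients are placeholders:

```nextflow
process ALL_VS_ALL_ALIGN {
    input:
    tuple val(n_seqs), path(seqs)

    // Quadratic scaling: runtime grows with the square of the sequence
    // count. For a linearly scaling process like READ_FILTER, drop the
    // squaring, e.g. time { 1.h * (1 + n.intdiv(1_000_000)) }
    time {
        def n = n_seqs as long
        1.h * (1 + (n * n).intdiv(1_000_000_000_000))
    }

    script:
    """
    some_aligner --all-vs-all ${seqs}
    """
}
```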
This is becoming a noticeable issue for multi-flowcell runs, where the `PHYLOSEQ_*` processes can run out of memory.
One option would be memory-saving code clean-up (e.g. removing unneeded objects, although this makes debugging more challenging in some instances), but proportional resource allocation would also help.
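In the meantime, a common stopgap is to retry out-of-memory failures with an escalating request, so an OOM kill doesn't take down the whole run. A sketch, using a made-up `PHYLOSEQ_MERGE` as a stand-in for whichever `PHYLOSEQ_*` process is failing, with placeholder values and script:

```nextflow
process PHYLOSEQ_MERGE {
    input:
    path phyloseq_objects

    // Double the memory request on each retry; base value is a guess
    memory { 8.GB * task.attempt }

    // Exit status 137 usually means the task was killed (SIGKILL) for
    // exceeding its memory request; retry those, fail fast otherwise
    errorStrategy { task.exitStatus == 137 ? 'retry' : 'finish' }
    maxRetries 3

    script:
    """
    merge_phyloseq.R ${phyloseq_objects}
    """
}
```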