Skip to content

Commit

Permalink
feat(config): provide function to dynamically assign resources based …
Browse files Browse the repository at this point in the history
…on number of attempts (#380)

* feat(config):  provide function to dynamically assign resources based on number of attempts

* fix(quickstart): seperate lines between git clone and cd

* fix(config): fix typos in cpus and memory assignment function
  • Loading branch information
pbelmann authored Nov 21, 2024
1 parent a133ee0 commit 876319a
Show file tree
Hide file tree
Showing 5 changed files with 99 additions and 3 deletions.
19 changes: 19 additions & 0 deletions docs/developer_guidelines.md
Original file line number Diff line number Diff line change
Expand Up @@ -152,6 +152,25 @@ tuple env(FILE_ID), val("${output}"), val(params.LOG_LEVELS.INFO), file(".comman
file(".command.out"), file(".command.err"), file(".command.log"), emit: logs
```


### Resources

Usually all processes get a label that is linked to the flavor provided in the resources section of the configuration file (e.g. `small`, `medium`, `large`).
In cases where we can foresee that a processes might need more RAM depending on the size of the input data, it is also possible to use
the getMemoryResources and getCPUsResources methods provided by the Utils class:

Example:

```
memory { Utils.getMemoryResources(params.resources.small, "${sample}", task.attempt, params.resources) }
cpus { Utils.getCPUsResources(params.resources.tiny.small, "${sample}", task.attempt, params.resources) }
```

One example where it might make sense to use this option is when a process uses `csvtk join` and uses per default a label with
little RAM assigned.


### Time Limit

Every process must define a time limit which will never be reached on "normal" execution. This limit is only useful for errors in the execution environment
Expand Down
73 changes: 73 additions & 0 deletions lib/Utils.groovy
Original file line number Diff line number Diff line change
Expand Up @@ -147,4 +147,77 @@ class Utils {
return 90
}
}

/*
*
* This method returns the number of vCPUs according to the output of the getResources method.
*
*/
static Integer getCPUsResources(defaults, sample, attempt, resources){
return getResources(defaults, sample, attempt, resources)["cpus"]
}


/*
*
* This method returns RAM value according to the output of the getResources method.
*
*/
static String getMemoryResources(defaults, sample, attempt, resources){
return getResources(defaults, sample, attempt, resources)["memory"] + ' GB'
}

/*
*
* This method returns the flavor with more RAM provided in the resources section
* of the configuration file.
*
* defaults: This variable contains the resources label object provided to the process (e.g. small, medium, ...).
* sample: This variable contains the sample name.
* attempt: This variable contains the number of attempts to run the process.
* resources: This variable represents all resources provied in the configuration file.
*
*/
static Map<String, Integer> getResources(defaults, sample, attempt, resources){

// get map of memory values and flavor names (e.g. key: 256, value: large)
def memoryLabelMap = resources.findAll().collectEntries( { [it.value.memory, it.key] });

// Get map of resources with label as key
def labelMap = resources.findAll()\
.collectEntries( { [it.key, ["memory" : it.value.memory, "cpus" : it.value.cpus ] ] })

// get memory values as list
def memorySet = memoryLabelMap.keySet().sort()

// If the memory value is not predicted do the following
def defaultCpus = defaults.cpus

def defaultMemory = defaults.memory

def defaultMemoryIndex = memorySet.findIndexOf({ it == defaultMemory })

def nextHigherMemoryIndex = defaultMemoryIndex + attempt - 1


if(nextHigherMemoryIndex >= memorySet.size()){
// In case it is already the highest possible memory setting
// then try the label with the highest memory
println("Warning: Highest possible memory setting " \
+ " of the dataset " + sample + " is already reached." \
+ " Retry will be done using a flavor with the highest possible RAM configuration")

def label = memoryLabelMap[memorySet[memorySet.size()-1]]
def cpus = labelMap[label]["cpus"]
def ram = labelMap[label]["memory"]

return ["cpus": cpus, "memory": ram];
} else {
def label = memoryLabelMap[memorySet[nextHigherMemoryIndex]];
def cpus = labelMap[label]["cpus"];
def memory = labelMap[label]["memory"];

return ["cpus": cpus, "memory": memory];
}
}
}
4 changes: 3 additions & 1 deletion modules/annotation/module.nf
Original file line number Diff line number Diff line change
Expand Up @@ -647,7 +647,9 @@ process pCount {
**/
process pCollectFile {

label 'highmemMedium'
memory { Utils.getMemoryResources(params.resources.small, "${sample}", task.attempt, params.resources) }

cpus { Utils.getCPUsResources(params.resources.small, "${sample}", task.attempt, params.resources) }

tag "Sample: $sample, Database: $dbType"

Expand Down
4 changes: 3 additions & 1 deletion modules/binning/processes.nf
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,9 @@ process pGetBinStatistics {

publishDir params.output, mode: "${params.publishDirMode}", saveAs: { filename -> getOutput("${sample}", params.runid, "${module}", "${binner}", filename) }

label 'tiny'
memory { Utils.getMemoryResources(params.resources.tiny, "${sample}", task.attempt, params.resources) }

cpus { Utils.getCPUsResources(params.resources.tiny, "${sample}", task.attempt, params.resources) }

input:
val(module)
Expand Down
2 changes: 1 addition & 1 deletion scripts/test_quickstart.sh
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# This file will be referenced on the online wiki

git clone [email protected]:metagenomics/metagenomics-tk.git \
git clone [email protected]:metagenomics/metagenomics-tk.git
cd metagenomics-tk

make nextflow
Expand Down

0 comments on commit 876319a

Please sign in to comment.