docs: synchronization and parallelism improvements
Split from #13358

This is a rewrite of the docs for synchronization and parallelism,
split from the multiple mutexes and semaphores code. This PR is
suitable for backporting to 3.5 and 3.4.

* New markdown document on parallelism:
  - locks and parallelism will generally have different use cases. I
    think it is better not to have them on the same page so you don't
    think you could use one instead of the other
  - As far as I know more pages doesn't cost us anything.
  - priority for parallelism documented
  - parallelism in workflow and template spec better explained

* Synchronisation adds:
  - information on queuing
  - better explain what a mutex is
  - explain behaviour in case of specifying both semaphore and mutex

Conform to style guide and 1 sentence per line

Signed-off-by: Alan Clucas <[email protected]>
Joibel committed Aug 28, 2024
1 parent fed83ca commit 211cb33
Showing 4 changed files with 77 additions and 18 deletions.
1 change: 1 addition & 0 deletions .spelling
@@ -185,6 +185,7 @@ memoizing
metadata
minikube
mutex
mutexes
namespace
namespaces
natively
44 changes: 44 additions & 0 deletions docs/parallelism.md
@@ -0,0 +1,44 @@
# Limiting parallelism

You can restrict the number of workflows being executed at any time using these mechanisms.

## Controller level

You can limit the total number of workflows that can execute at any one time in the [workflow controller ConfigMap](./workflow-controller-configmap.yaml).

```yaml
data:
  parallelism: "10"
```
You can also limit the number of workflows that can execute in a single namespace.
```yaml
data:
  namespaceParallelism: "4"
```
Workflows that are executing but restricted from running more nodes due to the other mechanisms will still count towards the parallelism limits.
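
As a minimal sketch, both limits can sit side by side in the same ConfigMap.
The ConfigMap name `workflow-controller-configmap` and the `argo` namespace are assumptions based on a default installation; adjust them to match your cluster.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-controller-configmap  # assumed default name
  namespace: argo                      # assumed installation namespace
data:
  # At most 10 workflows execute at once across the whole controller
  parallelism: "10"
  # At most 4 workflows execute at once in any single namespace
  namespaceParallelism: "4"
```
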
### Priority
Workflows can have a `priority` set in their specification.

Workflows with a higher priority number that have not started due to controller level parallelism will be started before lower priority workflows.
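
As an illustrative sketch, a workflow can set `priority` in its `spec`.
The name, image, and priority value here are assumptions chosen for the example.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: high-priority-  # hypothetical name
spec:
  # Assumed value: a higher number is started before a lower number
  # when workflows are waiting on controller level parallelism
  priority: 10
  entrypoint: main
  templates:
    - name: main
      container:
        image: busybox  # assumed image
        command: [echo]
        args: ["hello world"]
```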

## Workflow level

You can restrict parallelism within a workflow using `parallelism` within a workflow or template.
This only restricts total concurrent executions of steps or tasks within the same workflow.
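
A minimal sketch of the workflow level setting, assuming a hypothetical loop of four steps.
With `parallelism: 2` in the workflow `spec`, at most two of the looped steps should run at the same time.
The names and image are assumptions for illustration.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: parallelism-sketch-  # hypothetical name
spec:
  entrypoint: main
  # Limit concurrently running steps or tasks within this workflow
  parallelism: 2
  templates:
    - name: main
      steps:
        - - name: sleep
            template: sleep
            # Expands into four steps, of which two run at a time
            withItems: [1, 2, 3, 4]
    - name: sleep
      container:
        image: busybox  # assumed image
        command: [sh, -c]
        args: ["sleep 10"]
```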

Examples

1. [`parallelism-limit.yaml`](https://github.com/argoproj/argo-workflows/blob/main/examples/parallelism-limit.yaml) restricts the parallelism of a [loop](./walk-through/loops.md)
1. [`parallelism-nested.yaml`](https://github.com/argoproj/argo-workflows/blob/main/examples/parallelism-nested.yaml) restricts the parallelism of a nested loop
1. [`parallelism-nested-dag.yaml`](https://github.com/argoproj/argo-workflows/blob/main/examples/parallelism-nested-dag.yaml) restricts the number of dag tasks that can be run at any one time
1. [`parallelism-nested-workflow.yaml`](https://github.com/argoproj/argo-workflows/blob/main/examples/parallelism-nested-workflow.yaml) shows how parallelism is inherited by children
1. [`parallelism-template-limit.yaml`](https://github.com/argoproj/argo-workflows/blob/main/examples/parallelism-template-limit.yaml) shows how parallelism of looped templates is also restricted

## Synchronization

You can use [mutexes and semaphores](./synchronization.md) to control the parallel execution of sections of a workflow.
49 changes: 31 additions & 18 deletions docs/synchronization.md
@@ -4,13 +4,12 @@
## Introduction

Synchronization enables users to limit the parallel execution of certain workflows or
templates within a workflow without having to restrict others.
You can use synchronization to limit the parallel execution of workflows or templates.
You can use mutexes to restrict workflows or templates to only having a single concurrent section.
You can use semaphores to restrict workflows or templates to a configured number of parallel runs.
This documentation uses "locks" to mean mutexes and semaphores.

Users can create multiple synchronization configurations in the `ConfigMap` that can be referred to
from a workflow or template within a workflow. Alternatively, users can
configure a mutex to prevent concurrent execution of templates or
workflows using the same mutex.
You can create multiple synchronization configurations in the `ConfigMap` that can be referred to from a workflow or template.

For example:

@@ -24,11 +23,15 @@ data:
template: "2" # Two instances of template can run at a given time in particular namespace
```
Each synchronization block may only refer to either a semaphore or a mutex.
If you specify both, only the semaphore will be locked.
### Workflow-level Synchronization
Workflow-level synchronization limits parallel execution of the workflow if workflows have the same synchronization reference.
In this example, the Workflow refers to the `workflow` synchronization key, which is configured as limit 1,
so only one workflow instance will be executed at a given time even if multiple workflows are created.
You can limit parallel execution of a workflow by using Workflow-level synchronization.
If multiple workflows have the same synchronization reference, they will be limited by that synchronization reference.
In this example, the Workflow refers to the `workflow` synchronization key, which is configured as limit `"1"`, so only one workflow instance will be executed at a given time even if multiple workflows are created.

Using a semaphore configured by a `ConfigMap`:

@@ -52,7 +55,7 @@ spec:
args: ["hello world"]
```

Using a mutex:
Using a mutex achieves the same thing as a count `"1"` semaphore:

```yaml
apiVersion: argoproj.io/v1alpha1
@@ -74,9 +77,11 @@ spec:

### Template-level Synchronization

Template-level synchronization limits parallel execution of the template across workflows, if templates have the same synchronization reference.
In this example, the `acquire-lock` template has a synchronization reference to the `template` key, which is configured as limit 2,
so two instances of the template will be executed at a given time, even if multiple steps/tasks within a workflow or different workflows refer to the same template.
You can limit parallel execution of a template by using Template-level synchronization.
If templates have the same synchronization reference, they will be limited by that synchronization reference across all workflows.

In this example, the `acquire-lock` template has a synchronization reference to the `template` key, which is configured as limit `"2"`, so a maximum of two instances of the `acquire-lock` template will be executed at a given time.
This applies even when multiple steps or tasks within a workflow, or different workflows, refer to the same template.

Using a semaphore configured by a `ConfigMap`:

Expand Down Expand Up @@ -110,7 +115,7 @@ spec:
args: ["sleep 10; echo acquired lock"]
```

Using a mutex:
Using a mutex will limit to a single execution of the template at any one time:

```yaml
apiVersion: argoproj.io/v1alpha1
Expand Down Expand Up @@ -147,8 +152,16 @@ Examples:
1. [Step level semaphore](https://github.com/argoproj/argo-workflows/blob/main/examples/synchronization-tmpl-level.yaml)
1. [Step level mutex](https://github.com/argoproj/argo-workflows/blob/main/examples/synchronization-mutex-tmpl-level.yaml)

### Other Parallelism support
### Queuing

When a Workflow cannot take a lock, it will be placed into an ordered queue.

Workflows can have a `priority` set in their specification.
The queue is first ordered by priority, with a higher priority number being placed before a lower priority number.
The queue is then ordered by `CreationTimestamp` of the Workflow; older Workflows will be ordered before newer workflows.

Workflows are only allowed to take a lock if they are at the front of the queue for that lock.

## Parallelism

In addition to this synchronization, the workflow controller supports a parallelism setting that applies to all workflows
in the system (it is not granular to a class of workflows, or tasks within them). Furthermore, there is a parallelism setting
at the workflow and template level, but this only restricts total concurrent executions of tasks within the same workflow.
See also [how you can restrict parallelism](./parallelism.md) in other ways.
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -122,6 +122,7 @@ nav:
- retries.md
- lifecyclehook.md
- synchronization.md
- parallelism.md
- memoization.md
- template-defaults.md
- enhanced-depends-logic.md
