Refactor default actions
* Support actions with duplicate names.
* Add: [default.action] table. Set the default for any action key (or sub-key).
* Add: [action.from] string. Copy any omitted keys from the given action.
* Add: Include operators ">=" and "<="
* Breaking: Rename "greater_than" to ">"
* Breaking: Rename "less_than" to "<"
* Breaking: Rename "equal_to" to "=="
* Breaking: [submit_options] is now [default.action.submit_options]
joaander committed May 14, 2024
1 parent f2114a8 commit c21b459
Showing 62 changed files with 1,604 additions and 592 deletions.
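
Two of the changes in the commit message can be made concrete with a short sketch: how omitted keys might be filled in from `action.from` and `[default.action]`, and how the renamed include operators compare values. The Python below is an illustration only, not **row**'s implementation (which is written in Rust). It assumes that a key is taken from the action itself first, then from the action named by `from`, then from `default.action`, and that sub-keys in tables are merged the same way. The names `matches`, `merge`, and `resolve_action` are hypothetical.

```python
import operator

# The five operator spellings listed in the commit message.
OPERATORS = {
    ">": operator.gt,
    "<": operator.lt,
    "==": operator.eq,
    ">=": operator.ge,
    "<=": operator.le,
}


def matches(value, op, operand):
    """Return True when `value <op> operand` holds."""
    return OPERATORS[op](value, operand)


def merge(primary, fallback):
    """Copy keys omitted from `primary` out of `fallback`, recursing into tables."""
    result = dict(fallback)
    for key, value in primary.items():
        if isinstance(value, dict) and isinstance(fallback.get(key), dict):
            result[key] = merge(value, fallback[key])
        else:
            result[key] = value
    return result


def resolve_action(action, from_action=None, default_action=None):
    """Fill omitted keys. Assumed order: action, then `from` action, then default."""
    resolved = dict(action)
    if from_action is not None:
        resolved = merge(resolved, from_action)
    if default_action is not None:
        resolved = merge(resolved, default_action)
    # `from` only names the source action; it is not part of the resolved result.
    resolved.pop("from", None)
    return resolved
```

Passing the `from` action as a dictionary, rather than looking it up by name, keeps the sketch independent of how actions with duplicate names are disambiguated.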
2 changes: 1 addition & 1 deletion .github/CODEOWNERS
@@ -1 +1 @@
* joaander
* @joaander
1 change: 1 addition & 0 deletions DESIGN.md
@@ -299,3 +299,4 @@ status may take a long time, so it should display a progress bar.
# TODO: Dependabot configuration
# TODO: readthedocs builds
# TODO: logo
# TODO: release CI
10 changes: 7 additions & 3 deletions doc/src/SUMMARY.md
@@ -16,12 +16,17 @@
- [Working with signac projects](guide/python/signac.md)
- [Writing action commands in Python](guide/python/actions.md)
- [Concepts](guide/concepts/index.md)
- [Best practices](guide/concepts/best-practices.md)
- [Process parallelism](guide/concepts/process-parallelism.md)
- [Thread parallelism](guide/concepts/thread-parallelism.md)
- [Directory status](guide/concepts/status.md)
- [JSON pointers](guide/concepts/json-pointers.md)
- [Cache files](guide/concepts/cache.md)
- [How-to](guide/howto/index.md)
- [Best practices](guide/howto/best-practices.md)
- [Set the cluster account](guide/howto/account.md)
- [Submit the same action to different groups/resources](guide/howto/same.md)
- [Summarize directory groups with an action](guide/howto/summarize.md)

# Reference

- [row](row/index.md)
@@ -34,14 +39,13 @@
- [show launchers](row/show/launchers.md)
- [scan](row/scan.md)
- [clean](row/clean.md)

- [`workflow.toml`](workflow/index.md)
- [workspace](workflow/workspace.md)
- [submit_options](workflow/submit-options.md)
- [action](workflow/action/index.md)
- [group](workflow/action/group.md)
- [resources](workflow/action/resources.md)
- [submit_options](workflow/action/submit-options.md)
- [default](workflow/default.md)
- [`clusters.toml`](clusters/index.md)
- [cluster](clusters/cluster.md)
- [Built-in clusters](clusters/built-in.md)
31 changes: 17 additions & 14 deletions doc/src/clusters/built-in.md
@@ -4,47 +4,50 @@

## Anvil (Purdue)

[Anvil documentation](https://www.rcac.purdue.edu/knowledge/anvil).

**Row** automatically selects from the following partitions:
**Row** automatically selects from the following partitions on [Anvil]:
* `shared`
* `wholenode`
* `gpu`

Other partitions may be selected manually.

There is no need to set `--mem-per-*` options on Anvil as the cluster automatically
There is no need to set `--mem-per-*` options on [Anvil] as the cluster automatically
chooses the largest amount of memory available per core by default.

## Delta (NCSA)
> Note: The whole node partitions **require** that each job submitted request an
> integer multiple of 128 CPU cores.
[Delta documentation](https://docs.ncsa.illinois.edu/systems/delta).
[Anvil]: https://www.rcac.purdue.edu/knowledge/anvil

**Row** automatically selects from the following partitions:
## Delta (NCSA)

**Row** automatically selects from the following partitions on [Delta]:
* `cpu`
* `gpuA100x4`

Other partitions may be selected manually.

Delta jobs default to a small amount of memory per core. **Row** inserts `--mem-per-cpu`
or `--mem-per-gpu` to select the maximum amount of memory possible that allows full-node
jobs and does not incur extra charges.
[Delta] jobs default to a small amount of memory per core. **Row** inserts
`--mem-per-cpu` or `--mem-per-gpu` to select the maximum amount of memory possible that
allows full-node jobs and does not incur extra charges.

## Great Lakes (University of Michigan)
[Delta]: https://docs.ncsa.illinois.edu/systems/delta

[Great Lakes documentation](https://arc.umich.edu/greatlakes/).
## Great Lakes (University of Michigan)

**Row** automatically selects from the following partitions:
**Row** automatically selects from the following partitions on [Great Lakes]:
* `standard`
* `gpu_mig40,gpu`
* `gpu`

Other partitions may be selected manually.

Great Lakes jobs default to a small amount of memory per core. **Row** inserts
[Great Lakes] jobs default to a small amount of memory per core. **Row** inserts
`--mem-per-cpu` or `--mem-per-gpu` to select the maximum amount of memory possible that
allows full-node jobs and does not incur extra charges.

> Note: The `gpu_mig40,gpu` partition is selected only when there is one GPU per job.
> This is a combination of 2 partitions which decreases queue wait time due to the
> larger number of nodes that can run your job.
[Great Lakes]: https://arc.umich.edu/greatlakes/
30 changes: 16 additions & 14 deletions doc/src/clusters/cluster.md
@@ -38,10 +38,10 @@ on this cluster. The table **must** have one of the following keys:
* `by_environment`: **array** of two strings - Identify the cluster when the environment
variable `by_environment[0]` is set and equal to `by_environment[1]`.
* `always`: **bool** - Set to `true` to always identify this cluster. When `false`,
this cluster can only be chosen by an explicit `--cluster` option.
this cluster may only be chosen by an explicit `--cluster` option.

> Note: The *first* cluster in the list that sets `identify.always = true` will prevent
> any later cluster from being identified.
> any later cluster from being identified (except by explicit `--cluster=name`).
## scheduler

@@ -87,16 +87,18 @@ will pass this option to the scheduler. For example SLURM schedulers will set

### cpus_per_node

`cluster.partition.cpus_per_node`: **string** - Number of CPUs per node. When
`cpus_per_node` is not set, **row** will ask the scheduler to schedule only a given
number of tasks. In this case, some schedulers are free to spread tasks among any
number of nodes (for example, shared partitions on Slurm schedulers).
`cluster.partition.cpus_per_node`: **string** - Number of CPUs per node.

When `cpus_per_node` is set, **row** will request the minimal number of nodes needed
to satisfy `n_nodes * cpus_per_node >= total_cpus`. This may result in longer queue
times, but will lead to more stable performance for users.
When `cpus_per_node` is not set, **row** will request `n_processes` tasks. In this case,
some schedulers are free to spread tasks among any number of nodes (for example, shared
partitions on Slurm schedulers).

Set `cpus_per_node` only when all nodes in the partition have the same number of CPUs.
When `cpus_per_node` is set, **row** will **also** request the minimal number of nodes
needed to satisfy `n_nodes * cpus_per_node >= total_cpus`. This may result in longer
queue times, but will lead to more stable performance for users.

> Note: Set `cpus_per_node` only when all nodes in the partition have the same number
> of CPUs.
### minimum_gpus_per_job

@@ -131,7 +133,7 @@ will pass this option to the scheduler. For example SLURM schedulers will set
### gpus_per_node

`cluster.partition.gpus_per_node`: **string** - Number of GPUs per node. Like
`cpus_per_node` but used on jobs that request GPUs.
`cpus_per_node` but used when jobs request GPUs.

### prevent_auto_select

@@ -140,6 +142,6 @@ automatically selecting this partition.

### account_suffix

`cluster.partition.account_suffix`: **string** - Set to provide an account suffix
when submitting jobs to this partition. Useful when clusters define separate
`aacount-cpu` and `account-gpu` accounts.
`cluster.partition.account_suffix`: **string** - An account suffix when submitting jobs
to this partition. Useful when clusters define separate `account-cpu` and `account-gpu`
accounts.
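
The `cpus_per_node` text in this file states that **row** requests the minimal number of nodes satisfying `n_nodes * cpus_per_node >= total_cpus`. That condition is a ceiling division, sketched below in Python. The function name `minimal_nodes` is hypothetical and this is not **row**'s own code.

```python
def minimal_nodes(total_cpus: int, cpus_per_node: int) -> int:
    """Smallest n_nodes such that n_nodes * cpus_per_node >= total_cpus."""
    if total_cpus < 0:
        raise ValueError("total_cpus must not be negative")
    if cpus_per_node <= 0:
        raise ValueError("cpus_per_node must be positive")
    # Integer ceiling division avoids floating point rounding.
    return (total_cpus + cpus_per_node - 1) // cpus_per_node
```

On a hypothetical partition with 128 CPUs per node, a job needing 129 CPUs requires 2 nodes, which is why setting `cpus_per_node` can lengthen queue times for jobs that only slightly exceed one node.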
10 changes: 6 additions & 4 deletions doc/src/clusters/index.md
@@ -18,13 +18,15 @@ name = "cluster2"
```

User-provided clusters in `$HOME/.config/row/clusters.toml` are placed first in the
array.
array. Execute [`row show cluster --all`](../row/show/cluster.md) to see the complete
cluster configuration.

## Cluster identification

On startup, **row** iterates over the array of clusters in order. If `--cluster` is not
set, **row** checks the `identify` condition in the configuration. If `--cluster` is
set, **row** checks to see if the name matches.
set, **row** checks to see if the name matches. **Row** selects the *first* cluster
that matches.

> Note: **Row** uses the *first* such match. To override a built-in, your configuration
> should include a cluster by the same name and `identify` condition.
> To override a built-in, your configuration should include a cluster by the same name
> and `identify` condition.
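
The identification rule described in this file (iterate in order, match by name when `--cluster` is set, otherwise check `identify`, and take the first match) can be sketched as follows. This is a simplified Python illustration, not **row**'s implementation. It assumes clusters are plain dictionaries and handles only the `by_environment` and `always` keys that appear in this diff; the function name `identify_cluster` is hypothetical.

```python
def identify_cluster(clusters, environ, cluster_option=None):
    """Return the first cluster that matches, or None when nothing matches."""
    for cluster in clusters:
        if cluster_option is not None:
            # With --cluster set, only the name is compared.
            if cluster["name"] == cluster_option:
                return cluster
            continue

        identify = cluster["identify"]
        if "by_environment" in identify:
            variable, expected = identify["by_environment"]
            if environ.get(variable) == expected:
                return cluster
        elif identify.get("always", False):
            return cluster
    return None
```

Because the search stops at the first match, a user-provided cluster placed earlier in the array shadows a built-in cluster with the same name and `identify` condition.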
37 changes: 20 additions & 17 deletions doc/src/developers/contributing.md
@@ -2,7 +2,7 @@

Contributions are welcomed via [pull requests on GitHub][github]. Contact the **row**
developers before starting work to ensure it meshes well with the planned development
direction and standards set for the project.
direction and follows standards set for the project.

[github]: https://github.com/glotzerlab/gsd/row

@@ -17,40 +17,44 @@ assist you in designing flexible interfaces.

Expensive code paths should only execute when requested.

### Maintain compatibility

New features should be opt-in and *preserve the behavior* of all existing user scripts.

## Version control

### Base your work off the correct branch

- Base all new work on `trunk`.
Base all bug fixes and new features on `trunk`.

### Propose a minimal set of related changes

All changes in a pull request should be closely related. Multiple change sets that are
loosely coupled should be proposed in separate pull requests.
All changes in a pull request should be *closely related*. Multiple change sets that are
loosely coupled should be proposed in *separate pull requests*.

### Agree to the Contributor Agreement

All contributors must agree to the Contributor Agreement before their pull request can
be merged.
All contributors must agree to the **Contributor Agreement** before their pull request
can be merged.

### Set your git identity

Git identifies every commit you make with your name and e-mail. [Set your identity][id]
to correctly identify your work and set it identically on all systems and accounts where
you make commits.
to correctly identify your work and set it *identically on all systems* and accounts
where you make commits.

[id]: http://www.git-scm.com/book/en/v2/Getting-Started-First-Time-Git-Setup

## Source code

### Use a consistent style

The **Code style** section of the documentation sets the style guidelines for **row**
code.
Follow all guidelines outlined in the [Code style](style.md) section of the
documentation.

### Document code with comments

Use **Rust** documentation comments for classes, functions, etc. Also comment complex
Write Rust documentation comments for traits, functions, etc. Also comment complex
sections of code so that other developers can understand them.

### Compile without warnings
@@ -61,12 +65,12 @@ Your changes should compile without warnings.

### Write unit tests

Add unit tests for all new functionality.
Add unit tests for all new functionality and bug fixes.

### Validity tests
### Test validity

The developer should run research-scale simulations using the new functionality and
ensure that it behaves as intended. When appropriate, add a new test to `validate.py`.
Run research-scale simulations using new functionality and ensure that it behaves as
intended.

## User documentation

@@ -77,8 +81,7 @@ and any important user-facing change in the mdBook documentation.

### Tutorial

When applicable, update or write a new tutorial.

When applicable, update or write a new tutorial or how-to guide.

### Add developer to the credits

5 changes: 3 additions & 2 deletions doc/src/developers/style.md
@@ -3,7 +3,8 @@
## Rust

**Row's** rust code follows the [Rust style guide][1]. **Row's** [pre-commit][2]
configuration applies style fixes with `rustfmt` checks for common errors with `clippy`.
configuration applies style fixes with `rustfmt` and checks for common errors with
`clippy`.

[1]: https://doc.rust-lang.org/style-guide/index.html
[2]: https://pre-commit.com/
@@ -16,7 +17,7 @@ configuration applies style fixes with `rustfmt` checks for common errors with `

Wrap **Markdown** files at 88 characters wide, except when not possible (e.g. when
formatting a table). Follow layout and design patterns established in existing markdown
files.
files. Use reference-style links for long URLs.

## Spelling/grammar

9 changes: 6 additions & 3 deletions doc/src/developers/testing.md
@@ -8,9 +8,12 @@ cargo test
```
in the source directory to execute the unit and integration tests.

All tests must be marked either `#[serial]` or `#[parallel]` explicitly. Some serial
tests set environment variables and/or the current working directory, which may conflict
with any test that is automatically run concurrently. Check for this with:
## Writing unit tests

Write tests using standard Rust conventions. All tests must be marked either `#[serial]`
or `#[parallel]` explicitly. Some serial tests set environment variables and/or the
current working directory, which may conflict with any test that is automatically run
concurrently. Check for this with:
```bash
rg --multiline "#\[test\]\n *fn"
```
18 changes: 16 additions & 2 deletions doc/src/env.md
@@ -1,7 +1,6 @@
# Environment variables

> Note: Environment variables that influence the execution of **row** are documented in
> [the command line options](row/index.md).
## In job scripts

**Row** sets the following environment variables in generated job scripts:

@@ -14,3 +13,18 @@
| `ACTION_PROCESSES_PER_DIRECTORY` | Set to the value of `action.resources.processes_per_directory`. Unset when `processes_per_submission`.|
| `ACTION_THREADS_PER_PROCESS` | Set to the value of `action.resources.threads_per_process`. Unset when `threads_per_process` is omitted. |
| `ACTION_GPUS_PER_PROCESS` | Set to the value of `action.resources.gpus_per_process`. Unset when `gpus_per_process` is omitted. |

# Set row options

Set any of these environment variables to provide default values for
[command line options].

| Environment variable | Option |
|----------------------|-------------|
| `ROW_CLEAR_PROGRESS`| --clear-progress |
| `ROW_CLUSTER` | --cluster |
| `ROW_COLOR` | --color |
| `ROW_IO_THREADS` | --io-threads |
| `ROW_NO_PROGRESS` | --no-progress |

[command line options]: row/index.md
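
The job-script table earlier in this file notes that several `ACTION_*` variables are unset when the matching resource key is omitted, so an action that reads them needs to handle the missing case. The Python sketch below shows one way to do that. It assumes the values are integers, and the function name `read_action_resources` is hypothetical; only the three variable names shown in the table rows of this hunk are used.

```python
import os

# Resource key -> environment variable set by row in generated job scripts.
_VARIABLES = {
    "processes_per_directory": "ACTION_PROCESSES_PER_DIRECTORY",
    "threads_per_process": "ACTION_THREADS_PER_PROCESS",
    "gpus_per_process": "ACTION_GPUS_PER_PROCESS",
}


def read_action_resources(environ=None):
    """Map each resource key to an int, or to None when its variable is unset."""
    if environ is None:
        environ = os.environ
    resources = {}
    for key, variable in _VARIABLES.items():
        value = environ.get(variable)
        # Assumption: row writes these values as integers.
        resources[key] = int(value) if value is not None else None
    return resources
```

Accepting the environment as an argument makes the function easy to test without modifying the real process environment.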
47 changes: 0 additions & 47 deletions doc/src/guide/concepts/best-practices.md

This file was deleted.
