or if you wanted to do dueling C51, you could do:

```bash
python stoix/systems/q_learning/ff_c51.py network=mlp_dueling_c51
```

### SLURM Launcher for Distributed Experiments

For users with access to SLURM clusters, Stoix includes a lightweight, flexible SLURM launcher that leverages Hydra and [submitit](https://github.com/facebookincubator/submitit) to simplify running large-scale experiments in parallel. The launcher lets you specify different algorithms (by their execution file), environments (by their configs), and seeds, and it automatically submits a separate SLURM job for each combination.

**Key features include:**

- **Hydra Integration:** Configure your experiments and SLURM parameters (time, partition, memory, etc.) in a single Hydra config file, or override them on the fly from the command line.
- **Parallel Job Submission:** Automatically submits an individual SLURM job for each experiment combination, letting you distribute your workload across multiple nodes and GPUs.
- **Resource Flexibility:** Easily customize your SLURM resource requirements (e.g., nodes, GPUs per node, CPUs per task) to best suit your hardware and research needs.

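To make the workflow concrete, here is a minimal, illustrative sketch of how a Hydra + submitit launcher of this kind can be structured. This is **not** Stoix's actual `launcher.py`: the config keys (`systems`, `envs`, `seeds`, `slurm.*`) and the `arch.seed` override name are assumptions for illustration only.

```python
# Illustrative sketch only -- not Stoix's actual launcher.py.
# Config keys (systems, envs, seeds, slurm.*) and override names are assumptions.
import itertools
import subprocess

import hydra
import submitit
from omegaconf import DictConfig


def run_experiment(system_file: str, env_override: str, seed: int) -> None:
    """Run one training script inside the allocated SLURM job."""
    subprocess.run(
        ["python", system_file, env_override, f"arch.seed={seed}"],  # override name is illustrative
        check=True,
    )


@hydra.main(config_path=".", config_name="launcher", version_base=None)
def main(cfg: DictConfig) -> None:
    # One executor configured from the Hydra config; submitit writes the sbatch scripts.
    executor = submitit.AutoExecutor(folder="slurm_logs")
    executor.update_parameters(
        slurm_partition=cfg.slurm.partition,
        slurm_additional_parameters={"time": cfg.slurm.time},  # e.g. "00:30:00"
        gpus_per_node=cfg.slurm.get("gpus_per_node", 1),
        cpus_per_task=cfg.slurm.get("cpus_per_task", 4),
    )
    # Submit a separate job for every (system, environment, seed) combination.
    for system, env, seed in itertools.product(cfg.systems, cfg.envs, cfg.seeds):
        job = executor.submit(run_experiment, system, f"env={env}", seed)
        print(f"Submitted job {job.job_id}: {system} | env={env} | seed={seed}")


if __name__ == "__main__":
    main()
```

Each experiment becomes its own SLURM job, so failures stay isolated and the cluster scheduler handles the parallelism.
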
**Usage Example:**

To launch your experiments on a SLURM cluster with a custom time limit and partition, run:

```bash
python launcher.py slurm.time=00:30:00 slurm.partition=gpu
```

In this example, the launcher uses the provided overrides to set the job timeout to 30 minutes and submit jobs to the GPU partition. For more advanced configurations, simply adjust or add more overrides as needed.
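
For example, assuming the launcher's config exposes the resource fields listed above under the `slurm` key (the exact key names may differ in practice), a more heavily customised submission could look like:

```bash
python launcher.py slurm.time=02:00:00 slurm.partition=gpu slurm.gpus_per_node=1 slurm.cpus_per_task=8
```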

This SLURM launcher is designed to support research workflows, making it simple to scale up your experiments without manually writing large batch scripts.

## Important Considerations

1. If your environment does not have a timestep limit or is not guaranteed to end through some game mechanic, the evaluation can appear to hang forever and stall training, when in fact your agent is just so good _or bad_ that the episode never finishes. Keep this in mind if you see this behaviour. One solution is simply to add a timestep limit (for example, via a step-limit wrapper like the sketch below) or potentially action masking.
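
   As a rough illustration of the first mitigation, below is a generic step-limit wrapper in the familiar `reset`/`step` style. This is not Stoix's wrapper API, just the idea of forcing truncation after a fixed number of steps.

   ```python
   # Generic sketch of a step-limit wrapper -- not Stoix's actual wrapper API.
   class StepLimitWrapper:
       """Forces episodes to terminate after `max_steps` environment steps."""

       def __init__(self, env, max_steps: int = 1000):
           self._env = env
           self._max_steps = max_steps
           self._elapsed = 0

       def reset(self):
           self._elapsed = 0
           return self._env.reset()

       def step(self, action):
           obs, reward, done, info = self._env.step(action)
           self._elapsed += 1
           if self._elapsed >= self._max_steps:
               done = True  # truncate so evaluation can never hang on an endless episode
           return obs, reward, done, info
   ```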