or if you wanted to do dueling C51, you could do:

```bash
python stoix/systems/q_learning/ff_c51.py network=mlp_dueling_c51
```

### SLURM Launcher for Distributed Experiments

For users with access to SLURM clusters, Stoix includes a lightweight, flexible SLURM launcher that leverages Hydra and [submitit](https://github.com/facebookincubator/submitit) to simplify running large-scale experiments in parallel. The launcher lets you specify different algorithms (by their execution file), environments (by their configs), and seeds, and it automatically submits a separate SLURM job for each combination.

**Key features include:**

- **Hydra Integration:** Configure your experiments and SLURM parameters (time, partition, memory, etc.) in a single Hydra config file, or override them on the fly from the command line.
- **Parallel Job Submission:** Automatically submits an individual SLURM job for each experiment combination, letting you distribute your workload across multiple nodes and GPUs.
- **Resource Flexibility:** Easily customize your SLURM resource requirements (e.g., nodes, GPUs per node, CPUs per task) to best suit your hardware and research needs.

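To make the workflow concrete, here is a minimal, illustrative sketch of how a Hydra + submitit launcher of this kind can be structured. This is **not** Stoix's actual `launcher.py`: the config keys (`systems`, `envs`, `seeds`, `slurm.*`) and the `arch.seed` override name are assumptions for illustration only.

```python
# Illustrative sketch only -- not Stoix's actual launcher.py.
# Config keys (systems, envs, seeds, slurm.*) and override names are assumptions.
import itertools
import subprocess

import hydra
import submitit
from omegaconf import DictConfig


def run_experiment(system_file: str, env_override: str, seed: int) -> None:
    """Run one training script inside the allocated SLURM job."""
    subprocess.run(
        ["python", system_file, env_override, f"arch.seed={seed}"],  # override name is illustrative
        check=True,
    )


@hydra.main(config_path=".", config_name="launcher", version_base=None)
def main(cfg: DictConfig) -> None:
    # One executor configured from the Hydra config; submitit writes the sbatch scripts.
    executor = submitit.AutoExecutor(folder="slurm_logs")
    executor.update_parameters(
        slurm_partition=cfg.slurm.partition,
        slurm_additional_parameters={"time": cfg.slurm.time},  # e.g. "00:30:00"
        gpus_per_node=cfg.slurm.get("gpus_per_node", 1),
        cpus_per_task=cfg.slurm.get("cpus_per_task", 4),
    )
    # Submit a separate job for every (system, environment, seed) combination.
    for system, env, seed in itertools.product(cfg.systems, cfg.envs, cfg.seeds):
        job = executor.submit(run_experiment, system, f"env={env}", seed)
        print(f"Submitted job {job.job_id}: {system} | env={env} | seed={seed}")


if __name__ == "__main__":
    main()
```

Each experiment becomes its own SLURM job, so failures stay isolated and the cluster scheduler handles the parallelism.
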
**Usage Example:**

To launch your experiments on a SLURM cluster with a custom time limit and partition, run:

```bash
python launcher.py slurm.time=00:30:00 slurm.partition=gpu
```

In this example, the launcher uses the provided overrides to set the job timeout to 30 minutes and submit jobs to the GPU partition. For more advanced configurations, simply adjust or add more overrides as needed.
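
For example, assuming the launcher's config exposes the resource fields listed above under the `slurm` key (the exact key names may differ in practice), a more heavily customised submission could look like:

```bash
python launcher.py slurm.time=02:00:00 slurm.partition=gpu slurm.gpus_per_node=1 slurm.cpus_per_task=8
```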

This SLURM launcher is designed to support research workflows, making it simple to scale up your experiments without manually writing large batch scripts.

## Important Considerations

1. If your environment does not have a timestep limit or is not guaranteed to end through some game mechanic, the evaluation can appear to hang forever and stall training, when in fact your agent is just so good _or bad_ that the episode never finishes. Keep this in mind if you see this behaviour. One solution is simply to add a timestep limit (for example, via a step-limit wrapper like the sketch below) or potentially action masking.
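
   As a rough illustration of the first mitigation, below is a generic step-limit wrapper in the familiar `reset`/`step` style. This is not Stoix's wrapper API, just the idea of forcing truncation after a fixed number of steps.

   ```python
   # Generic sketch of a step-limit wrapper -- not Stoix's actual wrapper API.
   class StepLimitWrapper:
       """Forces episodes to terminate after `max_steps` environment steps."""

       def __init__(self, env, max_steps: int = 1000):
           self._env = env
           self._max_steps = max_steps
           self._elapsed = 0

       def reset(self):
           self._elapsed = 0
           return self._env.reset()

       def step(self, action):
           obs, reward, done, info = self._env.step(action)
           self._elapsed += 1
           if self._elapsed >= self._max_steps:
               done = True  # truncate so evaluation can never hang on an endless episode
           return obs, reward, done, info
   ```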