Fix: running jobs as a regular user when SLURM is started by a non-root user
karasevb committed Jun 17, 2016
1 parent ec4138b commit 3b6d4d8
Showing 3 changed files with 40 additions and 11 deletions.
28 changes: 17 additions & 11 deletions README.md
@@ -17,29 +17,35 @@ Prerequisites: `screen` tool.
- Change `/etc/lxc/dnsmasq.conf` adding following line:
* `conf-file=$SLXC_PATH/build/dnsmasq.conf`
3. Install **Munge** in `MUNGE_PATH` (under `someuser`). **NOTE:** *munge-0.5.11* has problems with user-defined prefix installation (see https://code.google.com/p/munge/issues/detail?id=34 for details). The issue report includes a patch that temporarily fixes this problem. Alternatively, use a more recent version in which the problem is fixed.
4. Install **SLURM** in `SLURM_PATH` (under `someuser`). Create additional directories in SLURM's prefix:
4. [Optional] If **SLURM_USER** is not root and you plan to submit jobs as a user **USER1** != **SLURM_USER**:
- Apply the patch from the SLURM source directory:
* `patch -p1 < <slxc_path>/patch/start_from_user.patch`
5. Install **SLURM** in `SLURM_PATH` (under `someuser`). Create additional directories in SLURM's prefix:
- `mkdir $SLURM_PATH/var $SLURM_PATH/etc`
5. Configure SLURM and put its configuration in `$SLURM_PATH/etc/slurm.conf`. While configuring, choose host names for the frontend and compute nodes. Here we will use `frontend` and `cnX`.
6. Put the SLURM and Munge installation paths into `$SLXC_PATH/slxc.conf`.
7. Set `SLURM_USER` to `someuser` in `$SLXC_PATH/slxc.conf`.
8. Create cluster machines with `slxc-new-node.sh`. The only argument of `slxc-new-node.sh` is the machine hostname. NOTE: you must use the same frontend and compute node names as in `$SLURM_PATH/etc/slurm.conf`.
6. Configure SLURM and put its configuration in `$SLURM_PATH/etc/slurm.conf`. While configuring, choose host names for the frontend and compute nodes. Here we will use `frontend` and `cnX`.
7. Put the SLURM and Munge installation paths into `$SLXC_PATH/slxc.conf`.
8. Set `SLURM_USER` to `someuser` in `$SLXC_PATH/slxc.conf`.
9. Create cluster machines with `slxc-new-node.sh`. The only argument of `slxc-new-node.sh` is the machine hostname. NOTE: you must use the same frontend and compute node names as in `$SLURM_PATH/etc/slurm.conf`.
- Create the frontend first (let's call it "frontend", for example):
* `$SLXC_PATH/slxc-new-node.sh frontend`
- Create node machines (cn1, cn2, ..., cnN):
* `$ for i in $(seq 1 N); do $SLXC_PATH/slxc-new-node.sh cn$i; done`
9. [Optional] Add the Munge and SLURM installation paths to your PATH environment variable.
10. [Optional] Add the Munge and SLURM installation paths to your PATH environment variable.
Also `export SLURM_CONF=$SLURM_PATH/etc/slurm.conf` to let `sinfo`, `sbatch`
and other tools know how to reach `slurmctld`.
10. Restart lxc-net service (for Ubuntu/Mint):
11. Restart lxc-net service (for Ubuntu/Mint):
- `$ sudo service lxc-net restart`
11. Start your cluster:
12. [Optional] If **SLURM_USER** is not root and you plan to submit jobs as a user **USER1** != **SLURM_USER**:
- Set up SLURM capabilities:
* `$ sudo ./slurm-set-capabilities.sh`
13. Start your cluster:
- `$ sudo ./slxc-run-cluster.sh`
12. Verify that everything is OK (both tools should show all your virtual "machines" running):
14. Verify that everything is OK (both tools should show all your virtual "machines" running):
- `$ sudo screen -ls`
- `$ sudo lxc-ls --active`
13. Now you can attach to any machine with
15. Now you can attach to any machine with
- `$ sudo lxc-attach -n $nodename`
14. To shut down your cluster, use
16. To shut down your cluster, use
- `$ ./slxc-stop-cluster.sh`
- NOTE: this may take a while. You can speed up the process by setting
`LXC_SHUTDOWN_TIMEOUT` in `/etc/default/lxc` (for Ubuntu and Mint)
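
The optional environment step of the README (PATH and `SLURM_CONF`) could be sketched as the shell fragment below. This is a minimal sketch, not part of the commit: the default prefixes `$HOME/munge` and `$HOME/slurm` are hypothetical placeholders, and the `bin`/`sbin` subdirectories assume a standard prefix layout for both packages.

```shell
#!/bin/bash
# Hypothetical install prefixes (assumption); substitute the values
# you put into slxc.conf.
MUNGE_PATH="${MUNGE_PATH:-$HOME/munge}"
SLURM_PATH="${SLURM_PATH:-$HOME/slurm}"

# Prepend the SLURM and Munge tool directories to PATH.
export PATH="$SLURM_PATH/bin:$SLURM_PATH/sbin:$MUNGE_PATH/bin:$MUNGE_PATH/sbin:$PATH"

# Tell sinfo, sbatch and the other client tools where the configuration lives.
export SLURM_CONF="$SLURM_PATH/etc/slurm.conf"
```

Sourcing such a fragment from a shell startup file keeps the settings available in every new session.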
15 changes: 15 additions & 0 deletions patch/start_from_user.patch
@@ -0,0 +1,15 @@
diff --git a/src/slurmd/slurmstepd/mgr.c b/src/slurmd/slurmstepd/mgr.c
index c7eeb29..cd2a528 100644
--- a/src/slurmd/slurmstepd/mgr.c
+++ b/src/slurmd/slurmstepd/mgr.c
@@ -2411,8 +2411,8 @@ _drop_privileges(stepd_step_rec_t *job, bool do_setuid,
/*
* No need to drop privileges if we're not running as root
*/
- if (getuid() != (uid_t) 0)
- return SLURM_SUCCESS;
+ //if (getuid() != (uid_t) 0)
+ // return SLURM_SUCCESS;

if (setegid(job->gid) < 0) {
error("setegid: %m");
8 changes: 8 additions & 0 deletions slurm-set-capabilities.sh
@@ -0,0 +1,8 @@
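#!/bin/bash

# Load the SLXC settings (including SLURM_PATH); run this script from $SLXC_PATH.
. ./slxc.conf

# Capabilities that let slurmstepd change uid/gid and file ownership
# even though it is not started by root.
CAPLIST="cap_chown,cap_setgid,cap_setuid,cap_fowner+ep"

setcap "$CAPLIST" "$SLURM_PATH/sbin/slurmstepd"
getcap "$SLURM_PATH/sbin/slurmstepd"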
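
The script ends by printing the capabilities with `getcap`, which leaves the check to the reader. One way to automate it is sketched below. The `check_caps` helper is hypothetical (not part of this commit), and the sample line with its `/opt/slurm` path is made up for illustration. The exact layout of `getcap` output differs between libcap versions, so the helper only looks for each capability name as a substring.

```shell
#!/bin/bash
# check_caps OUTPUT CAP...
# Hypothetical helper: succeeds only if every CAP name occurs in OUTPUT,
# where OUTPUT is a line printed by getcap.
check_caps() {
    local output="$1"; shift
    local cap
    for cap in "$@"; do
        case "$output" in
            *"$cap"*) ;;  # capability name found, check the next one
            *) echo "missing capability: $cap" >&2; return 1 ;;
        esac
    done
    return 0
}

# Illustrative sample line (assumption); in practice use:
#   sample="$(getcap "$SLURM_PATH/sbin/slurmstepd")"
sample="/opt/slurm/sbin/slurmstepd = cap_chown,cap_fowner,cap_setgid,cap_setuid+ep"
check_caps "$sample" cap_chown cap_setgid cap_setuid cap_fowner \
    && echo "all capabilities present"
```

Matching on names rather than on the full string keeps the check independent of the order in which `getcap` lists the capabilities.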
