[WIP] Add gpus example #68

Status: Open. Wants to merge 5 commits into base branch `master`.
1 change: 1 addition & 0 deletions gpus/.dockerignore
@@ -0,0 +1 @@
workspace/
1 change: 1 addition & 0 deletions gpus/.gitignore
@@ -0,0 +1 @@
/workspace
8 changes: 8 additions & 0 deletions gpus/Dockerfile
@@ -0,0 +1,8 @@
FROM nvidia/cuda:11.6.1-base-ubuntu20.04

# Copy code
COPY . /workspace
RUN chmod +x /workspace/*.sh

# Set working directory
WORKDIR /workspace
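The Dockerfile above only copies the code in and marks the shell scripts executable. The `RUN chmod` step can be sketched outside Docker like this (the `/tmp` path and the tiny stand-in script are illustrative, not part of the example):

```shell
# Stand-in for "COPY . /workspace" (paths are illustrative)
mkdir -p /tmp/workspace_demo
printf '#!/bin/bash\necho ok\n' > /tmp/workspace_demo/check_gpus.sh
# Equivalent of the Dockerfile's "RUN chmod +x /workspace/*.sh"
chmod +x /tmp/workspace_demo/*.sh
/tmp/workspace_demo/check_gpus.sh   # prints "ok"
```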
38 changes: 38 additions & 0 deletions gpus/README.md
@@ -0,0 +1,38 @@
# GPUs example

## Project setup

This example requires Docker and/or Singularity to be installed.

```bash
# Create Python environment and install MLCube with runners
virtualenv -p python3 ./env && source ./env/bin/activate && pip install mlcube-docker mlcube-singularity
# Fetch the gpus example from GitHub
git clone https://github.com/mlcommons/mlcube_examples && cd ./mlcube_examples
git fetch origin pull/68/head:feature/gpu_example && git checkout feature/gpu_example
cd ./gpus/
```

## MLCube tasks

There is only one task; it prints the `CUDA_VISIBLE_DEVICES` variable along with the output of the `nvidia-smi` command:

```shell
mlcube run --task=check_gpus
```

You can modify the number of GPUs by editing the `accelerator_count` value inside the **mlcube.yaml** file.
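As a hedged sketch (assuming GNU `sed` and the `accelerator_count: 1` value shown in this example's mlcube.yaml), the edit can also be scripted; the throwaway copy under `/tmp` is only for illustration:

```shell
# Illustrative only: work on a throwaway copy rather than the real mlcube.yaml
printf 'platform:\n  accelerator_count: 1\n' > /tmp/mlcube_demo.yaml
# Bump the GPU count from 1 to 2 (GNU sed in-place edit)
sed -i 's/accelerator_count: 1/accelerator_count: 2/' /tmp/mlcube_demo.yaml
grep accelerator_count /tmp/mlcube_demo.yaml   # prints "  accelerator_count: 2"
```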

You can also override the number of GPUs by passing the `--gpus` flag when running the command, for example:

```shell
mlcube run --task=check_gpus --gpus=2
```

### Singularity

To run on Singularity, specify the platform when invoking the command:

```shell
mlcube run --task=check_gpus --platform=singularity
```
19 changes: 19 additions & 0 deletions gpus/check_gpus.sh
@@ -0,0 +1,19 @@
#!/bin/bash

LOG_DIR=${LOG_DIR:-"/"}

# Handle MLCube parameters
while [ $# -gt 0 ]; do
    case "$1" in
        --log_dir=*)
            LOG_DIR="${1#*=}"
            ;;
        *) ;;
    esac
    shift
done

echo "CUDA_VISIBLE_DEVICES $CUDA_VISIBLE_DEVICES" |& tee "$LOG_DIR/gpus.log"
echo "NVIDIA_VISIBLE_DEVICES $NVIDIA_VISIBLE_DEVICES" |& tee -a "$LOG_DIR/gpus.log"
nvidia-smi |& tee -a "$LOG_DIR/gpus.log"
nvidia-smi --query-gpu=gpu_name,uuid --format=csv |& tee -a "$LOG_DIR/gpus.log"
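The `--key=value` parsing pattern used by check_gpus.sh is generic; a minimal standalone sketch of just that skeleton (the demo default path is illustrative):

```shell
#!/bin/bash
# Same --key=value parsing pattern as check_gpus.sh, reduced to its skeleton
LOG_DIR=${LOG_DIR:-"/tmp"}
while [ $# -gt 0 ]; do
    case "$1" in
        --log_dir=*) LOG_DIR="${1#*=}" ;;
        *) ;;   # ignore unrecognized arguments
    esac
    shift
done
echo "$LOG_DIR"
```

Invoked as `./demo.sh --log_dir=/var/log` this prints `/var/log`; with no arguments it falls back to the `LOG_DIR` environment variable, or `/tmp` if that is unset.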
24 changes: 24 additions & 0 deletions gpus/mlcube.yaml
@@ -0,0 +1,24 @@
name: check_gpus
description: Check gpus example
authors:
  - { name: "MLCommons Best Practices Working Group" }

platform:
  accelerator_count: 1

docker:
  # Image name.
  image: dfjbtest/gpus_example:0.0.1
  # Docker build context relative to $MLCUBE_ROOT. Default is `build`.
  build_context: "./"
  # Docker file name within docker build context, default is `Dockerfile`.
  build_file: "Dockerfile"
  # GPU arguments
  gpu_args: "--gpus=1"

tasks:
  check_gpus:
    entrypoint: ./check_gpus.sh
    parameters:
      outputs:
        log_dir: logs/