Updated GPU instructions for 535 driver release. #482

Open
wants to merge 1 commit into
base: master
38 changes: 20 additions & 18 deletions source/compute/gpu-support.rst
@@ -19,25 +19,26 @@ GPUs. The slice size provided is "GRID A100D-20C", which provides
 Minimum Requirements
 ====================

-For "c2-gpu", the absolute minimum requirements are as follows:
+For "c2-gpu" the requirements are as follows:

-* A boot/OS disk of at least 30GB (when installing CUDA support)
-* NVIDIA vGPU driver from the v15.0 series. This is currently version
-525.60.12.
+* A boot/OS disk of at least 30GB (when installing CUDA support).
+* NVIDIA vGPU driver release 525 or 535.

 The version of the driver loaded into your virtual server **must** be
-exactly this version, and not any other. From time to time we will
-update the version needed, and inform you when this updated will be
-required on your virtual servers.
+a supported version; vGPU release 535 is recommended for full
+functionality. The older 525 driver will still work but customers
+using this version are recommended to upgrade.
+
+Driver release 535 supports CUDA toolkit v12.1.

 .. note::

-Drivers provided by OS or distribution vendors should not be
-installed. Only the drivers specified here will function with
+Drivers provided by OS or distribution vendors should **not** be
+installed. Only the vGPU drivers specified here will function with
 the vGPUs available.

 In addition, NVIDIA support only the following server operating
-systems for your vGPU virtual server while running in Catalyst Cloud:
+systems for vGPU virtual servers while running in Catalyst Cloud:

 * Ubuntu 22.04, 20.04
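
The driver release requirement in this hunk could be checked on a running server with a small helper. The following is a minimal sketch and not part of the documentation being changed: the function name `is_supported_release` is hypothetical, and it assumes the loaded driver version is available as a string such as `535.154.05`.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not from the documentation):
# succeed if a driver version string belongs to one of the vGPU driver
# releases listed as supported (525 or 535).
is_supported_release() {
    # Strip everything from the first dot onwards: 535.154.05 -> 535.
    major="${1%%.*}"
    case "$major" in
        525|535) return 0 ;;
        *)       return 1 ;;
    esac
}
```

On a server with the driver loaded, the version string could be read with `nvidia-smi --query-gpu=driver_version --format=csv,noheader` and passed to the helper as its first argument.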

@@ -59,7 +60,7 @@ so you will need to install supporting drivers to enable GPU support in
 GPU-enabled virtual servers as per the instructions below.

 To help with streamlining GPU server builds we've :ref:`provided examples on
-using Packer to build custom images that include GPU drivers and software<packer-tutorial-gpu>`.
+using Packer to build custom images that include GPU drivers and software <packer-tutorial-gpu>`.
 This process is recommended for bulk GPU compute deployments.

 Ubuntu
@@ -82,7 +83,7 @@ Then download and install the GRID driver package.
 .. code-block:: bash

 sudo apt install -y dkms
-curl -O https://object-storage.nz-por-1.catalystcloud.io/v1/AUTH_483553c6e156487eaeefd63a5669151d/gpu-guest-drivers/nvidia/grid/15.0/linux/nvidia-linux-grid-525_525.60.13_amd64.deb
+curl -O https://object-storage.nz-por-1.catalystcloud.io/v1/AUTH_483553c6e156487eaeefd63a5669151d/gpu-guest-drivers/nvidia/grid/16.3/linux/nvidia-linux-grid-535_535.154.05_amd64.deb
 sudo dpkg -i nvidia-linux-grid-525_525.60.13_amd64.deb

 .. note::
@@ -123,8 +124,8 @@ the future.

 .. code-block:: bash

-curl -O https://developer.download.nvidia.com/compute/cuda/12.0.0/local_installers/cuda_12.0.0_525.60.13_linux.run
-sudo sh cuda_12.0.0_525.60.13_linux.run --silent --toolkit
+curl -O https://developer.download.nvidia.com/compute/cuda/12.1.0/local_installers/cuda_12.1.0_530.30.02_linux.run
+sudo sh cuda_12.1.0_530.30.02_linux.run --silent --toolkit

 This will run without any visible output for a while, before returning
 to a command prompt.
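
Because the silent install prints nothing, it may help to confirm afterwards that the toolkit is actually present. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the default prefix `/usr/local/cuda` is assumed to be where the runfile installer placed the toolkit.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a CUDA toolkit appears to be installed
# under the given prefix. The default prefix /usr/local/cuda is an
# assumption about where the runfile installer placed the toolkit.
cuda_toolkit_present() {
    prefix="${1:-/usr/local/cuda}"
    # nvcc is the CUDA compiler driver shipped with the toolkit.
    [ -x "$prefix/bin/nvcc" ]
}

# Print a one-line result rather than relying on installer output.
report_cuda_toolkit() {
    if cuda_toolkit_present "$1"; then
        echo "CUDA toolkit found"
    else
        echo "CUDA toolkit not found"
    fi
}
```

Calling `report_cuda_toolkit` with no argument checks the assumed default prefix; a different install location can be passed as the first argument.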
@@ -181,8 +182,9 @@ Then install the GRID driver package:

 .. code-block:: bash

-curl -O https://object-storage.nz-por-1.catalystcloud.io/v1/AUTH_483553c6e156487eaeefd63a5669151d/gpu-guest-drivers/nvidia/grid/15.0/linux/NVIDIA-Linux-x86_64-525.60.13-grid.run
-sudo sh NVIDIA-Linux-x86_64-525.60.13-grid.run -s -Z
+curl -O https://object-storage.nz-por-1.catalystcloud.io/v1/AUTH_483553c6e156487eaeefd63a5669151d/gpu-guest-drivers/nvidia/grid/16.3/linux/NVIDIA-Linux-x86_64-535.154.05-grid.run
+
+sudo sh NVIDIA-Linux-x86_64-535.154.05-grid.run -s -Z

 .. note::

@@ -229,8 +231,8 @@ the future.

 .. code-block:: bash

-curl -O https://developer.download.nvidia.com/compute/cuda/12.0.0/local_installers/cuda_12.0.0_525.60.13_linux.run
-sudo sh cuda_12.0.0_525.60.13_linux.run --silent --toolkit
+curl -O https://developer.download.nvidia.com/compute/cuda/12.1.0/local_installers/cuda_12.1.0_530.30.02_linux.run
+sudo sh cuda_12.1.0_530.30.02_linux.run --silent --toolkit

 This will run without any visible output for a while, before returning
 to a command prompt.
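
The installer file name in this hunk carries two version numbers: the toolkit version (`12.1.0`) and what appears to be the version of the driver bundled in the runfile (`530.30.02`), which is not the vGPU driver version. With `--toolkit`, only the toolkit component is selected for installation. The following is a minimal sketch of how the URL is put together; the function name `cuda_runfile_url` is hypothetical, and the URL pattern is inferred only from the two URLs shown in this diff.

```shell
#!/bin/sh
# Hypothetical helper: build the download URL for a CUDA runfile installer.
# The URL layout is an assumption inferred from the URLs in this diff.
cuda_runfile_url() {
    toolkit="$1"   # toolkit version, e.g. 12.1.0
    bundled="$2"   # driver version embedded in the file name, e.g. 530.30.02
    echo "https://developer.download.nvidia.com/compute/cuda/${toolkit}/local_installers/cuda_${toolkit}_${bundled}_linux.run"
}
```

Calling `cuda_runfile_url 12.1.0 530.30.02` reproduces the URL added by this change, which keeps the toolkit version and file name from drifting apart when they are next updated.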
14 changes: 7 additions & 7 deletions source/tutorials/images/using-packer-to-build-cuda-images.rst
@@ -31,7 +31,7 @@ To complete this tutorial you will need the following:
 inbound SSH access.
 * :ref:`Openrc file<source-rc-file>` for your Catalyst Cloud project sourced
 in your shell.
-* `Packer must be installed <https://developer.hashicorp.com/packer/downloads>`_
+* `Packer must be installed <https://developer.hashicorp.com/packer/install>`_
 on your system.
 * Sufficient quota capacity in your cloud project for Packer to create the
 temporary resources required to build the image.
@@ -124,17 +124,17 @@ Image build process
 "set -e",
 "sudo apt update",
 "sudo apt install -y dkms",
-"curl -O https://object-storage.nz-por-1.catalystcloud.io/v1/AUTH_483553c6e156487eaeefd63a5669151d/gpu-guest-drivers/nvidia/grid/15.0/linux/nvidia-linux-grid-525_525.60.13_amd64.deb",
-"sudo dpkg -i nvidia-linux-grid-525_525.60.13_amd64.deb",
-"rm -f nvidia-linux-grid-525_525.60.13_amd64.deb",
+"curl -O https://object-storage.nz-por-1.catalystcloud.io/v1/AUTH_483553c6e156487eaeefd63a5669151d/gpu-guest-drivers/nvidia/grid/16.3/linux/nvidia-linux-grid-535_535.154.05_amd64.deb",
+"sudo dpkg -i nvidia-linux-grid-535_535.154.05_amd64.deb",
+"rm -f nvidia-linux-grid-535_535.154.05_amd64.deb",
 "sudo mkdir -p /etc/nvidia/ClientConfigToken",
 "(cd /etc/nvidia/ClientConfigToken && sudo curl -O https://object-storage.nz-por-1.catalystcloud.io/v1/AUTH_483553c6e156487eaeefd63a5669151d/gpu-guest-drivers/nvidia/grid/licenses/client_configuration_token_12-29-2022-15-20-23.tok)",
 "sudo sed -i -e '/^\\\(FeatureType=\\\).*/{s//\\\11/;:a;n;ba;q}' -e '\$aFeatureType=1' /etc/nvidia/gridd.conf",
 "sudo systemctl restart nvidia-gridd",
-"curl -O https://developer.download.nvidia.com/compute/cuda/12.0.0/local_installers/cuda_12.0.0_525.60.13_linux.run",
+"curl -O https://developer.download.nvidia.com/compute/cuda/12.1.0/local_installers/cuda_12.1.0_530.30.02_linux.run",
 "echo 'Installing CUDA. This may take a few minutes...'",
-"sudo sh cuda_12.0.0_525.60.13_linux.run --silent --toolkit",
-"rm -f cuda_12.0.0_525.60.13_linux.run",
+"sudo sh cuda_12.1.0_530.30.02_linux.run --silent --toolkit",
+"rm -f cuda_12.1.0_530.30.02_linux.run",
 "echo /usr/local/cuda/lib64 | sudo tee /etc/ld.so.conf.d/cuda.conf",
 "sudo ldconfig",
 "sudo systemctl stop cloud-init",