add warnings about preemption
asaiacai committed Sep 18, 2024
1 parent 8ba52d3 commit 98bed97
Showing 2 changed files with 5 additions and 1 deletion.
2 changes: 1 addition & 1 deletion docs/source/admin/installation.rst
@@ -67,7 +67,7 @@ Installing the DCGM exporter is best handled using NVIDIA's `gpu-operator <https
nvidia-driver-daemonset-fvx9z 1/1 Running 0 9d
nvidia-operator-validator-62dhx 1/1 Running 0 14d
- .. tip::
+ .. warning::

This guide currently works for on-prem bare metal deployments.
We are still validating how to deploy :code:`nvidia-dcgm-exporter`
4 changes: 4 additions & 0 deletions docs/source/cloud/getting_started.rst
@@ -58,4 +58,8 @@ can be requested as.
kueue.x-k8s.io/queue-name: user-queue # this is assigned by your admin
kueue.x-k8s.io/priority-class: low-priority
.. warning::

Trainy instances are ephemeral and are autoscaled down after 10 minutes of idling. If you are
running stateful applications such as model training, instrument your application to regularly
back up its state to, and retrieve it from, object storage (S3, GCS, Azure Blob, Cloudflare R2, etc.).
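
One way to follow the warning above is a periodic checkpoint-and-resume loop. The sketch below is illustrative only and is not part of Trainy: the names ``save_checkpoint``, ``latest_checkpoint``, ``train``, and ``backup_dir`` are hypothetical, and a plain directory stands in for object storage (for example a mounted bucket). In a real job the copy step would be replaced by your storage client's upload call.

```python
import json
import os
import shutil
import tempfile
from pathlib import Path


def save_checkpoint(state: dict, step: int, local_dir: Path, backup_dir: Path) -> Path:
    """Write `state` atomically to local_dir, then copy it to backup_dir.

    backup_dir is a stand-in for object storage (an assumption of this sketch).
    """
    local_dir.mkdir(parents=True, exist_ok=True)
    backup_dir.mkdir(parents=True, exist_ok=True)
    final = local_dir / f"step_{step:08d}.json"
    # Write to a temp file first so a preemption mid-write never leaves a
    # truncated checkpoint under the final name.
    fd, tmp = tempfile.mkstemp(dir=local_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, final)
    shutil.copy2(final, backup_dir / final.name)
    return final


def latest_checkpoint(backup_dir: Path):
    """Return (step, state) from the newest backup, or (0, None) if none exists."""
    files = sorted(backup_dir.glob("step_*.json"))
    if not files:
        return 0, None
    newest = files[-1]
    step = int(newest.stem.split("_")[1])
    return step, json.loads(newest.read_text())


def train(total_steps: int, every: int, local_dir: Path, backup_dir: Path) -> dict:
    """Toy loop: resume from the latest backup, checkpoint every `every` steps."""
    step, state = latest_checkpoint(backup_dir)
    state = state or {"loss": 1.0}
    while step < total_steps:
        step += 1
        state["loss"] *= 0.9  # placeholder for a real training step
        if step % every == 0:
            save_checkpoint(state, step, local_dir, backup_dir)
    return state
```

Because ``train`` always starts from ``latest_checkpoint``, a replacement instance picks up at the last backed-up step even if the local disk of the previous instance is gone. The checkpoint interval trades lost work against upload cost; the interval used here is arbitrary.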
