Updated job resources and added mount on /dev/shm
/dev/shm being too small was causing a crash in the DataLoader.
knikolla committed Oct 8, 2024
1 parent a6c81c2 commit 3fad3f1
Showing 1 changed file with 12 additions and 0 deletions.
12 changes: 12 additions & 0 deletions k8s/base/job.yaml
@@ -12,9 +12,21 @@ spec:
           volumeMounts:
             - mountPath: /storage/unet3d_data
               name: unet3d-data
+            - mountPath: /dev/shm
+              name: shm
+          resources:
+            requests:
+              memory: "16Gi"
+              cpu: "500m"
+            limits:
+              memory: "64Gi"
+              cpu: "4"
       restartPolicy: Never
       volumes:
         - name: unet3d-data
           persistentVolumeClaim:
             claimName: mlperf-storage-data
+        - name: shm
+          emptyDir:
+            medium: Memory
   backoffLimit: 4
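
One way to confirm that the new `/dev/shm` mount is large enough before the DataLoader starts is to check it from inside the container. The following is only a minimal sketch, not part of this commit: the helper name `shm_report` and the `min_gib` threshold are hypothetical, and it relies solely on the standard-library call `shutil.disk_usage`.

```python
import shutil


def shm_report(path="/dev/shm", min_gib=1.0):
    """Report total and free space on a mount, in GiB.

    `path` and `min_gib` are illustrative defaults (assumptions),
    not values taken from the job definition.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    total_gib = usage.total / 2**30
    free_gib = usage.free / 2**30
    return {
        "path": path,
        "total_gib": total_gib,
        "free_gib": free_gib,
        # True when the free space meets the requested threshold.
        "ok": free_gib >= min_gib,
    }


if __name__ == "__main__":
    print(shm_report())
```

Because the `shm` volume uses `medium: Memory`, anything written to it is held in RAM and counts toward the container's memory limit, so any threshold chosen here should leave headroom below the 64Gi limit.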
