Add a Dockerfile for AMD ROCm #3750
base: dev
Conversation
Oops, I guess I should have opened the PR against the dev branch...
Dockerfile.rocm
Outdated
LABEL org.opencontainers.image.licenses="AGPL-3.0"
LABEL org.opencontainers.image.title="SD.Next"
LABEL org.opencontainers.image.description="SD.Next: Advanced Implementation of Stable Diffusion and other Diffusion-based generative image models"
LABEL org.opencontainers.image.base.name="https://hub.docker.com/pytorch/pytorch:2.5.1-cuda12.4-cudnn9-runtime"
This doesn't seem correct here, does it? (given that this uses ROCm)
Right!
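For reference, the corrected label would point at whichever ROCm image the Dockerfile actually builds from; the image name below is only a placeholder, not the real base:

```dockerfile
# Placeholder: substitute the actual ROCm base image this Dockerfile uses.
LABEL org.opencontainers.image.base.name="docker.io/rocm/<base-image>:<tag>"
```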
You may want to add a comment at the top of the Dockerfile mentioning the
Good point, but that's up to @vladmandic, I guess. A comment in the Dockerfile or a note in the Wiki?
a) yes, rocm overrides should be exposed as it's quite a common thing.
ok, i've pretty much rewritten https://github.com/vladmandic/sdnext/wiki/Docker so it's not cuda specific
Added Dockerfile.rocm: https://github.com/vladmandic/sdnext/blob/dev/configs/Dockerfile.rocm Went with a different approach than CUDA because of flash attention. We can save 30 GB of disk space by installing flash attention in the rocm-complete image and sharing the venv with the smaller rocm runtime image. Also using Ubuntu 24 with python3.12 because onnxruntime-rocm needs python3.12. If you want to make changes, please target the new file.
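The shared-venv idea described above can be sketched as a multi-stage build. The image tags and paths below are illustrative assumptions, not the actual contents of Dockerfile.rocm:

```dockerfile
# Sketch only: build flash attention inside the large "complete" ROCm image,
# then copy just the resulting venv into the much smaller runtime image,
# so the ~30 GB build-stage layers never reach the final image.
# Image names and paths are placeholders.

FROM rocm/<complete-image>:<tag> AS builder
RUN python3.12 -m venv /opt/venv \
 && /opt/venv/bin/pip install torch flash-attn   # heavy compile happens here

FROM rocm/<runtime-image>:<tag>
COPY --from=builder /opt/venv /opt/venv           # reuse the prebuilt venv
ENV PATH=/opt/venv/bin:$PATH
```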
Description
Provide a Dockerfile for AMD ROCm. Finding a good base image is not trivial because, unlike the Torch image for CUDA, the Torch image for ROCm is 71 GB for whatever reason.
Additionally, having a Dockerfile that "works" is a great reference for when you are trying to install something on bare metal.
Notes
Build with:
docker build -t sdnext -f Dockerfile.rocm .
Run with (example):
docker run -it --rm --device /dev/dri --group-add video -v /sdnext:/mnt -p 7860:7860 sdnext
--device /dev/dri - that's the way to "mount" the graphics card devices into the container (instead of the NVIDIA Container Toolkit)
--group-add video - the user inside the container needs access to that device
-v /sdnext:/mnt - mount a volume or a directory to keep persistent data
-p 7860:7860 - publish the port

The Dockerfile is made with minimal changes from the "official" NVIDIA Dockerfile to minimize the difference.
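The flags above can be composed into the full run command; this small sketch just assembles the PR's example invocation from named pieces so the role of each flag stays visible (paths and port are the PR's examples):

```shell
# Assemble the example docker run command from named flag groups.
GPU_FLAGS="--device /dev/dri --group-add video"  # expose AMD GPU device nodes, grant the container user access
DATA_VOLUME="-v /sdnext:/mnt"                    # keep persistent data outside the container
PORT="-p 7860:7860"                              # publish the web UI port

echo "docker run -it --rm $GPU_FLAGS $DATA_VOLUME $PORT sdnext"
```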
The Torch image for ROCm is 71 GB for some reason, so one difference I had to make is to use a smaller image with only the essentials of ROCm installed (3 GB). Torch is installed at build time (~2 GB download size). The total size of the built image is 23 GB (apparently Torch is packed really well).
Environment and Testing
Tested on Debian 12 Bookworm (I had to remove the --skip-all option from the CMD while testing since it's currently broken in master).