
Pytorch2.0.1 Rocm5.5 support #31

Open
aseok opened this issue Jun 4, 2023 · 12 comments

Comments

@aseok

aseok commented Jun 4, 2023

Hi
Will you also release this version?

@Tokoshie

My new build: https://github.com/Tokoshie/pytorch-gfx803/releases/tag/v2.1.0a0

@brsh1

brsh1 commented Sep 12, 2023

should it work for gfx900?

@xuhuisheng
Owner

xuhuisheng commented Sep 12, 2023

Hi guys,
I just ran into a PCIe atomics problem with gfx906.
According to AMD, if we use a gfx9-series GPU, the CPU and motherboard have to support the PCIe atomics feature, or PyTorch 2.x will return wrong results.

The good news is that if PCIe atomics are supported, PyTorch 2.x can run properly on gfx9.

So gfx900 is fine: you can use the officially released PyTorch 2.x.

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm5.6/
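
If you are not sure whether your board supports PCIe atomics, a quick sanity check (my own sketch, not part of xuhuisheng's instructions) is to compare a GPU matmul against the CPU result, since the failure mode described above is silently wrong results rather than a crash:

import torch

assert torch.cuda.is_available(), "ROCm PyTorch does not see the GPU"

a = torch.randn(512, 512)
b = torch.randn(512, 512)
cpu_out = a @ b
gpu_out = (a.cuda() @ b.cuda()).cpu()

# Without PCIe atomics the symptom reported here is wrong numbers, not a crash,
# so a mismatch well beyond normal float noise is the thing to look for.
print("max abs diff:", (cpu_out - gpu_out).abs().max().item())
print("allclose:", torch.allclose(cpu_out, gpu_out, atol=1e-3))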

@brsh1

brsh1 commented Sep 13, 2023

Now I wonder: if I use the GPU via passthrough on my Xen server, should I expect PCIe atomics issues because of the virtualization layer?

@xuhuisheng
Owner

@brsh1
I haven't used GPU passthrough yet, so I can only suggest that you give it a try.

@brsh1

brsh1 commented Sep 13, 2023

The reason I am asking is that I get this error, "can't initialize NVML", when trying the version you suggested.

@brsh1

brsh1 commented Sep 13, 2023

Just to make sure, which ROCm version should I be using? 5.6?

@xuhuisheng
Owner

xuhuisheng commented Sep 13, 2023

The latest ROCm-5.6 is just fine.
I also tested ROCm-5.5 with PyTorch 2.x: without PCIe atomics, gfx906 always returns invalid results.

If you want to work around the PCIe atomics problem, my suggestion is to roll back to pytorch-1.13.1. It can run SD properly.

https://download.pytorch.org/whl/rocm5.2/torch-1.13.1%2Brocm5.2-cp310-cp310-linux_x86_64.whl
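
For reference, a fully pinned rollback install would look something like this (the version pairings are my reading of the PyTorch release matrix for 1.13.1, so please verify them against the rocm5.2 index before relying on them):

pip3 install torch==1.13.1+rocm5.2 torchvision==0.14.1+rocm5.2 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/rocm5.2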

@brsh1

brsh1 commented Sep 15, 2023

Should I be using HSA_OVERRIDE_GFX_VERSION=10.3.0 for gfx900? torch.cuda.is_available() reports true, but the MNIST example and Stable Diffusion both fail to run; they get stuck at 100% CPU until I kill the process. Any ideas?

@viebrix

viebrix commented Sep 20, 2023

@xuhuisheng many thanks for your work and description. It helped me a lot in getting the RX 580 working.

With gfx803 and ROCm 5.6 I got a segmentation fault in the web UI, which seems to show that this combination of torch (v2.0.1-rc2) / vision (v0.15.2-rc2) and ROCm 5.6 does not work together. 5.5.0 worked like a charm.
Which specific pytorch and vision versions did you use?
See also #27 (comment)
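
In case it helps narrow this down, a quick way (my own sketch) to print exactly which builds and which ROCm/HIP version a given environment is running:

import torch, torchvision

print("torch      :", torch.__version__)        # e.g. 2.0.1-rc2 from the gfx803 build
print("torchvision:", torchvision.__version__)  # e.g. 0.15.2-rc2
print("HIP/ROCm   :", torch.version.hip)        # HIP version the wheel was built against; None on CUDA builds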

@SLi-Man

SLi-Man commented Oct 9, 2023

My new build: https://github.com/Tokoshie/pytorch-gfx803/releases/tag/v2.1.0a0

Hello, I am using your PyTorch 2.1.0a0 build, but running Stable-Diffusion-webui also needs a torchaudio that matches the torch version. How should I choose a torchaudio version that fits this PyTorch?

I tested torchaudio-2.1.0 and it reports an incompatibility. After I forced the PyTorch 2.1.0a0 version string to 2.1.0, the compatibility warning went away, but importing the package with import torchaudio still raises an error.

@viebrix

viebrix commented Jan 15, 2024

@SLi-Man
I didn't change the original torchaudio from the web UI. But here is a table with the matching versions:
https://github.com/pytorch/pytorch/wiki/PyTorch-Versions

I also did an update for the newest pytorch:
https://github.com/viebrix/pytorch-gfx803/tree/main
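
My reading of that table for the builds discussed here (worth double-checking against the wiki): torch 2.1.x pairs with torchvision 0.16.x and torchaudio 2.1.x. A small sketch for mapping a pre-release tag such as 2.1.0a0 onto those rows:

import torch
from packaging.version import Version  # usually installed alongside pip; otherwise pip install packaging

installed = Version(torch.__version__.split("+")[0])
print("installed torch :", torch.__version__)       # e.g. 2.1.0a0+git...
print("table row to use:", installed.base_version)  # 2.1.0a0 -> 2.1.0
print("series          :", installed.release[:2])   # (2, 1): match torchvision 0.16.x / torchaudio 2.1.x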
