[Request] Support DirectML for Windows AMD GPU users #79
I don't have access to an AMD GPU, but if you are willing to be the guinea pig and test for me, I could take a crack at it. Maybe late weekend, Sunday-ish.
Greeeattt !!! Love u !!! Yep, it's good for me, I have plenty of time! 😄 The weekend is perfect!
I hang out in the Bark official Discord all the time, same name JonathanFly, you can DM me there. Link is here: https://github.com/suno-ai/bark
I hope you are making progress with AMD support. I kinda need to dub my game :-)
I made enough progress to know it's pretty tricky. But it should be easier soon, since the Bark model is about to be ported to Huggingface Transformers. If you check here:
Thanks for your efforts. I am simply too poor for a new Nvidia graphics card and will stay with AMD ^^ But it is a great way to give a voice to cheap NPCs in the game.
I very crudely got it working in DirectML; I'll post an update soon. On my 3090 it's only a bit faster than CPU, so I'm not sure it's going to help much. But I do have a 16-core CPU, and the DirectML version is just as fast using only one core plus the GPU. I didn't really fix anything; I just made any torch functions that didn't work fall back to CPU numpy instead.
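The "make failing torch functions use CPU numpy instead" approach can be sketched as a generic fallback wrapper. This is illustrative only, not the actual patch: `dml_softmax` and `numpy_softmax` are made-up names, and numpy stands in here for the real DML-vs-CPU split.

```python
import numpy as np

def with_cpu_fallback(accel_fn, cpu_fn):
    """Try the accelerated implementation first; on failure,
    silently fall back to the CPU/numpy implementation."""
    def wrapper(*args, **kwargs):
        try:
            return accel_fn(*args, **kwargs)
        except (RuntimeError, NotImplementedError):
            return cpu_fn(*args, **kwargs)
    return wrapper

# Hypothetical op the DML backend rejects:
def dml_softmax(x):
    raise NotImplementedError("not supported by the DML backend")

# Plain numpy stand-in used as the CPU fallback:
def numpy_softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

softmax = with_cpu_fallback(dml_softmax, numpy_softmax)
```

The trade-off is exactly what the comment describes: correctness is preserved, but every fallback op runs on one CPU thread instead of the GPU.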
Can you try this? https://github.com/JonathanFly/bark/tree/bark_amd_directml_test#-bark-amd-install-test- I don't know if it works on AMD, or if it does, whether it's any faster than CPU. But it might be?
Brother, you are my hero ^^ It works :-) Yes, it is slow, but for me it's a lot faster than CPU alone. I must admit I have very little idea about Python, so thanks for the detailed tutorial. I am more of a Python power user :) Two things about the installation:
Test System:
Wow, you are the first confirmed success that it even works on AMD. And it's faster than CPU, which is all I was hoping for! What was the error for 1., the one that made you start it as admin? There is a bug in DirectML with a memory leak. I am not sure how to deal with it. Maybe just restart Bark from zero every single time.
I think it was missing write access rights right at the beginning. If more errors come up, I'll make notes. It is definitely faster than with CPU.
If you get a chance: I added the torch 2.0 install to the readme. I can't figure out whether it's supposed to WORK on AMD Windows or not. The Microsoft page says no, but it seems like some people are using it. When I tried it I got a decent 30 or 40 percent speed boost over 1.13 DirectML. But I don't have a real AMD card, so it may not work.
torch 2.0 works :-) It would be good if you could see the total time needed for the audio generation in the shell; then you could compare CPU vs. torch 1.13 vs. torch 2.0 more easily.
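The kind of per-generation timing asked for here is straightforward to bolt on. A minimal sketch, assuming a hypothetical `generate_audio(prompt)` call (the real Bark Infinity entry point may differ):

```python
import time
from contextlib import contextmanager

@contextmanager
def report_time(label):
    """Print wall-clock time for a block, e.g. one audio generation."""
    start = time.perf_counter()
    yield
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.2f}s")

# Usage (hypothetical generate_audio call):
# with report_time("generation"):
#     audio = generate_audio(prompt)
```

Wall-clock time via `time.perf_counter()` is the right comparison metric here, since the whole point is CPU-vs-DirectML throughput rather than CPU time.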
You can set this option in this hidden menu. But the easiest way is to type
Thank you, that's exactly what I meant. I haven't had time to deal with Bark + Infinity yet; I installed it yesterday only briefly to test AMD. I'll take a closer look at the whole construct in the evening. Now it makes sense to deal with it :)
I have noticed 2 things (torch 2.0):
1. `C:\Users\Testwiese\bark\bark_infinity\model.py:82: UserWarning: The operator 'aten::tril.out' is currently not supported by the DML backend and will fall back to the CPU. This may have performance implications. (Triggered internally in D:\a_work\1\s\pytorch-directml-plugin\torch_directml\csrc\dml\dml_cpu_fallback.cpp:17)`
2. `--show_generation_times True` does not seem to work, neither in the GUI nor when I set it to True in config.py.
The first thing is just a limitation. I could try rewriting the Bark code, or more likely somebody has already rewritten that function in a Stable Diffusion DirectML fork. But it's not a problem, it's just slower. I'll check the time display. But did you get a boost from 2.0 versus 1.13? On my NVIDIA with DirectML it was maybe 30 or 40 percent.
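For context, the fallback the warning describes amounts to moving the tensor to the CPU, running `tril` there, and moving the result back to the DML device. A rough sketch of the op itself, modeled with numpy since no DML device is available here (with torch + DirectML it would be roughly `x.cpu().tril(diagonal).to(dml_device)`):

```python
import numpy as np

def tril_fallback(x, diagonal=0):
    """CPU stand-in for aten::tril: zero out everything above
    the given diagonal of a 2-D array."""
    return np.tril(x, k=diagonal)
```

This is why the warning is only a performance concern, not a correctness one: the result is identical, but each call pays two device-to-host transfers.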
I will do a test tonight (CET). Since the iteration display in the GUI doesn't work, I need to work out with the prompt text how to always use the same text and the same speaker, and then I'll test it.
But a quick short test in the GUI (same speaker, same prompt), without specifying iterations, gives this:
Yes, it's not a huge performance gain. But I'm glad it works at all, and it's definitely better than nothing :-)
Hi guys!! I would like to use this on AMD. How can I use this project with DirectML so it uses my RX 6700 XT instead of the CPU?
Could you help me do the port? I could be a tester for your development branches. Thank you very much ❤️🔥