Solving Performance Issues #4457
Replies: 17 comments 25 replies
-
Hi, I ran several tests with a clean installation and a correctly configured env. Only with Flux did I notice a drop in performance. Out of curiosity I disabled xformers and used PyTorch cross attention, expecting a total collapse in performance, but the speed turned out to be the same. Has PyTorch cross attention improved to the point that xformers is no longer necessary? It could also be that the acceleration doesn't kick in here, whereas it does on Forge. To be clear: everything is correctly configured and there were no errors in the installation. Thank you. xformers 0.0.27, PyTorch 2.3.1
-
I found the workflow loading time is higher compared to the old version. And there are still issues with other nodes conflicting with the new UI.
-
What I can say is that I (RTX 2060 6 GB, 32 GB RAM, Windows 11) get vastly better performance on SD Forge with Flux Dev compared to Comfy (using the recommended standalone build). Around 11 s/it vs 7 s/it using the exact same settings (1024x1024, 20 steps, Euler) in fp16 with T5 XXL at fp8. This is strange because I was under the impression both were using similar engines under the hood.
-
Since the update yesterday, once in a while I get an out-of-memory error (I just need to press the Queue Prompt button again and it works), or I lose the connection while it's rendering (I need to restart the server). Before the update I didn't have those issues. I'm not using the --highvram setting and my PyTorch version is 2.3.1. The only suspect message in the command line is: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable Edit: My arguments are: set COMMANDLINE_ARGS= --cuda-malloc --no-half-vae --cuda-device 0 --use-pytorch-cross-attention
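The TensorFlow message quoted above is cut off; in current TensorFlow builds the variable it refers to is TF_ENABLE_ONEDNN_OPTS. A minimal sketch of setting it before launch (this only silences the oneDNN notice and makes numerics deterministic; it is unrelated to the OOM itself):

```shell
# Minimal sketch, POSIX shell (on Windows, add `set TF_ENABLE_ONEDNN_OPTS=0`
# to the launch .bat before the COMMANDLINE_ARGS line instead).
# Setting the variable to 0 turns off TensorFlow's oneDNN custom operations.
export TF_ENABLE_ONEDNN_OPTS=0
echo "TF_ENABLE_ONEDNN_OPTS=$TF_ENABLE_ONEDNN_OPTS"
```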
-
Should have left well alone.
-
GTX 1070 (8 GB). In my case, Flux fp8 got slower, but Flux fp16 got faster. This involves at least these 2 commits:
After the "fp16 support hack", both Flux fp8 and Flux fp16 got slower. But more recently, Flux fp16 got faster (even faster than before the "fp16 support hack"). EDIT: After the latest update, there was an improvement for fp16.
-
After the latest updates the situation has improved significantly. However, there is a problem after generation: the VRAM is sometimes not freed, which can lead to OOM, though it doesn't always happen. Otherwise, generation speed is much better. Keep optimizing and soon Forge will be behind you again :) FLUX fp8.
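For the VRAM-not-freed case, a hedged workaround sketch: ComfyUI has a `--disable-smart-memory` launch flag that aggressively offloads models from VRAM after use, and `--reserve-vram` (mentioned elsewhere in this thread) to keep headroom free. Treat these as workarounds, not fixes, and note the exact flag names in your build's `comfy/cli_args.py`:

```shell
# Workaround sketch: launch flags that can help when VRAM is not released
# between generations (slower, but more stable when VRAM is tight):
#   --disable-smart-memory  offload models from VRAM to RAM after each use
#   --reserve-vram 1.0      keep ~1 GB of VRAM free for the OS / other apps
ARGS="--disable-smart-memory --reserve-vram 1.0"
echo "python main.py $ARGS"
```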
-
Please fix the MPS Mac M2 Apple silicon issue!!! Requested to load AutoencoderKL
-
On Windows using ComfyUI: how do I either downgrade the embedded version of PyTorch to 2.3.1, or, with a newly installed copy of ComfyUI, stop it from upgrading to 2.4.0?
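A hedged sketch of one way to do this with the Windows standalone build. The folder and script names are assumptions based on the default ComfyUI_windows_portable layout, and the cu121 index URL is the one those builds shipped with at the time:

```shell
# Sketch: pin the embedded PyTorch to 2.3.1. On Windows, from the portable
# folder, the actual command would be:
#
#   python_embeded\python.exe -m pip install --force-reinstall torch==2.3.1 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
#
# To keep the pin afterwards, update with update_comfyui.bat rather than
# update_comfyui_and_python_dependencies.bat, which reinstalls a newer torch.
PIN="torch==2.3.1"
echo "pinning $PIN"
```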
-
2 days ago I sometimes had OOM; now, since today's update, my first render is good, my second is very noisy, and all the others are pure noise. I need to restart ComfyUI to be able to generate even one good render again. This happens when using a LoRA; without a LoRA it works.
-
I seldom, but do sometimes, have VRAM issues with my 4090, though with the right resolution it's never an issue. Some nodes or workflows are just really bad, ramping up VRAM usage for very little. Yes, 2.4 was a massive issue for running anything.
-
On Ubuntu 22.04 with CUDA 12.4, ComfyUI keeps crashing and failing to load IP-Adapter models. instead of: This part of the README wasn't updated in the last 3 months, so I wonder if you still recommend installing the --pre torch? @comfyanonymous
-
For NVIDIA 10xx series cards: I did several tests and ended up with the following flags to optimize performance: With --use-pytorch-cross-attention or xformers you can't use fp16; why?
-
I have exactly the same speed with ComfyUI and Forge.
-
Not long after my previous post I started getting perpetual VRAM issues. I think it does not unload VRAM and then starts another generation on top of it, making my computer lag and producing nothing because it jams up my whole system. It runs fine the first time, but if the queue stops and restarts, or after a couple of runs, it can end up in this state.
-
Hi, I dual-boot Windows 11 and Ubuntu 22.04, RTX 3090 and 32 GB RAM. Flux Dev 1 works with the text encoder at fp16 and fp8 in ComfyUI on Windows only; on Ubuntu only fp8 works, and fp16 causes the system to crash completely, forcing a hard reboot. I have tried --reserve-vram but it didn't help. Any ideas why fp16 only works on Windows? On WebUI Forge I can get it to work on Ubuntu by setting the GPU reserve to 12000 and Swap to CPU. I'd really like to use ComfyUI on Ubuntu :(
-
Hello, the latest version is super unstable on my config on a fresh install: Total VRAM 24563 MB, total RAM 32691 MB. Waiting for another update. Edit: GPU 3090 Ti OC + 32 GB RAM + NVMe + Ryzen 3950X (32 threads) works smoothly on: git checkout 14af129
-
There have been a number of big changes to the ComfyUI core recently which should improve performance across the board, but there might still be some bugs that slow things down for some people, and I want to find and fix them before the next stable release.
If you have performance issues:
For Windows Users:
If you still have performance issues, report them in this thread; make sure to post your full ComfyUI log and your workflow. The more information the better.
Some common sources of user errors: