Any methods to increase the efficiency?

#303
by logy2333 - opened

First of all, my heartfelt gratitude for Phr00t's masterpiece AIO projects. I'm enjoying not only Qwen AIO but also your other works!
Here's my question:
I am now running the model in ComfyUI with 32GB RAM, a 12th-gen i5 CPU, and an AMD 9600XT 16GB card. I find that the new AIO models generate images more slowly than the versions before v14. Before that, a single step of image generation took around 40 secs, but now it takes 80+ secs, and a complete image generation takes around 6 min. All scheduler and sampler settings follow the README card, and I'm running with PyTorch attention.
Is there any way to increase the generation speed, such as changing the sampler params or using a different attention implementation?

What resolution do you use? 1024x1024 on my 5090 is about 10 seconds.

much smaller, only 512*512

The normal thing would be to have ComfyUI running with xformers attention, rather than PyTorch attention, the latter being several percent slower.

However, taking 6 minutes for an image generation makes it seem like there is some major problem there.

I'm using an RTX3090 with 24GB VRAM and did not notice any slowdown, e.g. when going from the v11.3 AIO model, which I used extensively, to the new v23, which I am liking very much. It takes only seconds to generate an image at 1280x1280.

Are your AMD drivers up to date and installed correctly?

Are you sure your ComfyUI installation is using the AMD graphics card and not your CPU/iGPU?

When installing ComfyUI, did you definitely download/select the AMD graphics version/option?
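
One quick way to check the GPU-vs-CPU question is to ask PyTorch what it sees, from the same Python environment that ComfyUI runs in. This is only a minimal sketch: the helper name `describe_torch_device` is made up for illustration, and it assumes a ROCm or CUDA build of PyTorch. A DirectML or other backend would not show up through `torch.cuda` and would need a different check.

```python
def describe_torch_device():
    """Return a short report of the device PyTorch would use (illustrative helper)."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"

    # On ROCm builds, AMD GPUs are exposed through the torch.cuda API as well.
    if not torch.cuda.is_available():
        return "No GPU visible to PyTorch; work would fall back to the CPU"

    name = torch.cuda.get_device_name(0)
    # torch.version.hip is a version string on ROCm builds and None on CUDA builds.
    backend = "ROCm/HIP" if getattr(torch.version, "hip", None) else "CUDA"
    total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    return f"{name} via {backend}, {total_gb:.1f} GB VRAM"


print(describe_torch_device())
```

If this reports no visible GPU, or names an iGPU rather than the discrete card, the slowdown is probably an installation issue rather than a model issue.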

Thanks for your advice!
First, to answer your questions: yes, ComfyUI and the graphics drivers are correctly installed and in use.
Also, I think I have found one possible cause of the slowdown. I now manually reserve 1 GB of VRAM, and a 4-step image generation now takes about 1 minute. I suspect the latest version of either ComfyUI or the graphics driver uses a different automatic VRAM reservation scheme that suits my card less well.
Anyway, for anyone who has run into a similar problem, especially with an AMD card: try setting the VRAM reservation manually.
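
For anyone who wants to try the manual reservation, here is a minimal sketch of building the launch command. It assumes ComfyUI is started from its own folder via `main.py` and that the reservation is set with the `--reserve-vram` launch option (a value in GB). The post above does not say which setting was actually used, so treat that as an assumption and confirm with `python main.py --help`. The helper name `build_comfyui_command` and the `ATTENTION_FLAGS` table are illustrative only.

```python
import sys

# Attention-related launch options, as I understand ComfyUI's command line.
# Check `python main.py --help` to confirm they exist in your version.
ATTENTION_FLAGS = {
    "pytorch": "--use-pytorch-cross-attention",
    "sage": "--use-sage-attention",
    "split": "--use-split-cross-attention",
    "quad": "--use-quad-cross-attention",
}


def build_comfyui_command(reserve_vram_gb=1.0, attention=None, main_script="main.py"):
    """Build an argv list for launching ComfyUI with a manual VRAM reservation."""
    cmd = [sys.executable, main_script, "--reserve-vram", str(reserve_vram_gb)]
    if attention is not None:
        # Raises KeyError for names that are not in the table above.
        cmd.append(ATTENTION_FLAGS[attention])
    return cmd


# Print the command instead of running it, so it can be reviewed first.
print(" ".join(build_comfyui_command(reserve_vram_gb=1.0)))
```

The resulting list can be passed to `subprocess.run(...)` from the ComfyUI folder. The value of 1 GB is the one reported above; other cards may need a different amount.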

Bro, this is definitely an AMD issue or an issue with your ComfyUI setup, because I have the exact same specs except that instead of a 9600 XT 16GB I have an RTX 3060 12GB. One step at 768 x 1152 takes 10.0 - 10.4 secs for me. I usually do 4-step editing with 2 reference images, so it usually takes ~42 seconds in total.

If I only use 1 reference image at 768 x 1152, it takes ~6.2 secs per step for me.

Though I use Sage Attention 2, which doesn't work on AMD cards; only Sage Attention v1 works on AMD.

Indeed. ComfyUI is still not so powerful on the AMD platform. But still, after changing the attention setting and the VRAM reservation, the current speed is satisfying enough for me.
