Bad results????? Worse than SDXL finetunes?

#91
by UberMao - opened

As the title says, I'm getting worse results than SDXL when trying to use it. It is also slower than SDXL. Am I doing something wrong, or prompting it badly?

I used the same Danbooru tags for comparison. In my opinion, when dealing with a single character, there isn't much difference between the two, but when it comes to multiple characters and complex scenes, the gap is significant, with Anima clearly having the upper hand.
Regarding speed, Anima is a DiT model. Although it has 1.1B fewer parameters than SDXL's 3.5B, the ViT architecture used by DiT involves more computational steps than the U-Net used by SDXL. This gives DiT greater expressive power, which is why Anima runs slightly slower. You can use the "TorchCompileModel" node in ComfyUI to improve speed. On my B580, at a resolution of 1216x832, this boosted performance from 1.3 it/s to 2.0 it/s.
Finally, regarding your comment that the images generated by Anima are inferior to those from SDXL, I suggest you check your prompts. Where possible, use standard Danbooru tags, as non-standard tags may not be recognised.
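
To put the quoted it/s figures in perspective, here is a minimal sketch of the arithmetic. The 1.3 and 2.0 it/s values come from the post above; the step count of 30 and the helper names are assumptions chosen only for illustration.

```python
def speedup(before_its: float, after_its: float) -> float:
    """Ratio of iteration rates; values above 1.0 mean faster sampling."""
    return after_its / before_its


def seconds_for(steps: int, its: float) -> float:
    """Sampling time in seconds for `steps` steps at `its` iterations per second."""
    return steps / its


before, after = 1.3, 2.0  # it/s figures quoted in the post (B580, 1216x832)
steps = 30                # assumed step count, not taken from the thread

ratio = speedup(before, after)
print(f"speedup: {ratio:.2f}x")
print(f"{steps} steps: {seconds_for(steps, before):.1f}s -> {seconds_for(steps, after):.1f}s")
```

Under these assumptions the compiled model is roughly 1.5x faster per image. The first run after enabling compilation usually includes a one-off compile delay, so compare it/s only once that run has finished.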

Models like Illust-based mixes are going to have better aesthetics and a lot less responsiveness to your prompts. If you just want a sexy 1girl portrait and that's all you care about, then Anima might not be your top choice. But if you're trying to prompt a complex scene or depict characters you made up or really anything that isn't just the same tiresome 1girl portrait, then Anima can be rewarding because it's simply way smarter.

score_9 also seems to destroy the image unless you want really generic, AI-looking images. I've found with Preview 2 that score_9 also really screws with the character knowledge. Try leaving it off or putting it in the negative.

My standard template is:

Prompt: masterpiece, best quality, score_10, score_9, score_8, very awa, detailed,
Negative prompt: worst quality, low quality, score_1, score_2, score_3, blurry, jpeg artifacts, sepia,

I never noticed any decrease in quality after version 2, but maybe I'll experiment with removing score_9.

EDIT: I just tried several images with and without score_9, using all the same settings. The impact on the output was minimal, and frankly the results were slightly worse with score_9 removed.
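
For anyone who wants to repeat that comparison, here is a rough sketch of building the three prompt variants discussed in this thread (score_9 kept, dropped, or moved to the negative) from the template above. The helper names are hypothetical, and it only prepares strings; seed, sampler, steps, and CFG still have to be held constant in your own workflow.

```python
def split_tags(prompt: str) -> list[str]:
    """Split a comma-separated prompt into stripped, non-empty tags."""
    return [t.strip() for t in prompt.split(",") if t.strip()]


def without_tag(prompt: str, tag: str) -> str:
    """Return the prompt with every exact occurrence of `tag` removed."""
    return ", ".join(t for t in split_tags(prompt) if t != tag)


def with_tag(prompt: str, tag: str) -> str:
    """Return the prompt with `tag` appended if it is not already present."""
    tags = split_tags(prompt)
    if tag not in tags:
        tags.append(tag)
    return ", ".join(tags)


# Template copied from the post above.
POSITIVE = "masterpiece, best quality, score_10, score_9, score_8, very awa, detailed,"
NEGATIVE = "worst quality, low quality, score_1, score_2, score_3, blurry, jpeg artifacts, sepia,"

TAG = "score_9"
variants = {
    "baseline": (", ".join(split_tags(POSITIVE)), ", ".join(split_tags(NEGATIVE))),
    "dropped": (without_tag(POSITIVE, TAG), ", ".join(split_tags(NEGATIVE))),
    "negated": (without_tag(POSITIVE, TAG), with_tag(NEGATIVE, TAG)),
}

for name, (pos, neg) in variants.items():
    print(f"[{name}]\n  Prompt: {pos}\n  Negative prompt: {neg}")
```

Generating each variant with the same seed makes it easier to tell a real effect of the tag from a lucky roll.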

Thank you, everyone. I will try your suggestions and report back ASAP.

Yeah, so score_9 weighs heavily on the style. Like, incredibly so. Dropping it or putting it in the negative prompt helps with character knowledge but annihilates the quality. I had just gotten some lucky rolls, I guess.
