Feed_forward keys

#5
by UDCAI - opened

First of all, thank you for your work and being flexible with your updates to keep the community in mind! It is greatly appreciated.

I noticed this new version adds feed_forward keys to all layers. This is clearly beneficial for the performance of the new 2602 LoRA. HOWEVER, with the 8-step LoRA it causes an issue: any inference step where sigma is <0.500 does not converge properly, resulting in noisy output.

This affects many user workflows. Therefore, I have tested rigorously and come to a compromise. I have uploaded a version of the LoRA that keeps the feed_forward keys only on main block layers 0 and 20-25. This allows the flexibility to work at sigmas <0.500 while also maintaining the brightness and contrast of the full LoRA.

I have comparison testing you may refer to here: https://huggingface.co/UDCAI/Z-Image-Fun-Distill-ComfyUI/tree/main/Tests

Alibaba-PAI org

Do you need me to try training a LoRA without FFN again? It would have the same structure as the first version. Besides the change in where it's added, I also modified the training scheme.

Only if you have the time! Would be interesting to compare the differences, but by no means feel like you must.

Alibaba-PAI org

It seems like others have reported this issue as well. But why would this cause any inference steps with a sigma value less than 0.500 to fail to converge properly? The Euler sampler seems to work fine. Maybe I don't have a deep enough understanding of the other samplers.

If you look at the simple scheduler at default shift, only one step is actually spent sampling below 0.5, which likely won't shift the final output much. Other schedulers, which spend more steps at low sigmas to refine fine details, are where you see the destructive effects of the FFN keys: possibly an "overconvergence" type of effect where new noise is introduced back into the latent.
[image]
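The step-distribution point can be illustrated numerically. This is a rough sketch, not the model's actual schedule: the shift formula `sigma = shift*t / (1 + (shift-1)*t)` is the common flow-matching shift, and `shift=3` and `sigma_min=0.03` are assumed values chosen for illustration:

```python
# Compare how many of 8 steps start below sigma 0.5 under a shifted
# "simple" schedule versus a Karras schedule (rho=7).
# shift=3.0 and smin=0.03 are illustrative assumptions.

def simple_shifted(n: int, shift: float = 3.0) -> list[float]:
    ts = [1 - i / n for i in range(n)]           # t = 1.0 down to 1/n
    return [shift * t / (1 + (shift - 1) * t) for t in ts]

def karras(n: int, smin: float = 0.03, smax: float = 1.0,
           rho: float = 7.0) -> list[float]:
    # Karras et al. spacing: interpolate in sigma^(1/rho) space.
    ramp = [i / (n - 1) for i in range(n)]
    lo, hi = smin ** (1 / rho), smax ** (1 / rho)
    return [(hi + r * (lo - hi)) ** rho for r in ramp]

def low_steps(sigmas: list[float]) -> int:
    return sum(s < 0.5 for s in sigmas)

print("simple:", low_steps(simple_shifted(8)), "of 8 steps start below 0.5")
print("karras:", low_steps(karras(8)), "of 8 steps start below 0.5")
```

Under these assumptions the shifted simple schedule spends a single step below 0.5 while Karras spends the majority of its steps there, which is consistent with the breakage showing up only on detail-refining schedulers.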

Alibaba-PAI org

Hello, which schedulers are you expecting to support? I tested exponential and karras on z image and they don't seem to work either, even without LoRA. The others appear to work fine.

It's not that these schedulers don't work, it's that the situations to use these schedulers are fairly specific, yet they see use all the same.

For instance, here is a comparison of the original 2602 8-step versus my edited lora with euler_a karras.

[image]

Works perfectly, the best of both worlds! I couldn't use the distill LoRA before because my workflow involves two samplers plus a face detailer, along with more advanced schedulers/samplers. Now I can.

Great job! :)

Alibaba-PAI org

May I ask what workflow you are using? In ComfyUI's default workflow, I cannot use exponential and karras properly, regardless of whether I use LoRA or not, so I would like to try a different workflow.

Alibaba-PAI org

It seems that I have fully reproduced this issue in videox-fun; the old version of the weights has this problem. Through further training, I have already gotten it to work normally even when sigma is less than 0.5. After training a bit more, I will submit the corresponding weights.

UDCAI changed discussion status to closed
