How to use with Quantized Gemma 3 12b version?

#13
by Araxyllis - opened

Thanks for the hard work!
Since I'm already using a quantized version of LTX, it makes sense to use a quantized version of Gemma as well. I am using this one, but I cannot get it to work.

https://huggingface.co/unsloth/gemma-3-12b-it-bnb-4bit/tree/main

[screenshot of the error]
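For reference, this is roughly how I'd sanity-check that the checkpoint loads at all outside ComfyUI (just a sketch, not a fix for the ComfyUI issue; it assumes a recent transformers with bitsandbytes and accelerate installed, and the `Gemma3ForConditionalGeneration` class name is my assumption for current transformers versions):

```python
# Sanity check: does the bnb-4bit checkpoint load outside ComfyUI?
# Sketch only -- assumes recent transformers plus bitsandbytes/accelerate,
# and enough VRAM for the 4-bit weights.
from transformers import AutoTokenizer, Gemma3ForConditionalGeneration

repo = "unsloth/gemma-3-12b-it-bnb-4bit"

# The 4-bit quantization config ships inside the repo, so transformers
# should pick it up automatically; no explicit BitsAndBytesConfig here.
tokenizer = AutoTokenizer.from_pretrained(repo)
model = Gemma3ForConditionalGeneration.from_pretrained(repo, device_map="auto")

inputs = tokenizer("A cinematic shot of", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=16)[0]))
```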

Not sure if bnb is supported.
For now you can at least find some smaller ones here, even in fp4 (just 1 GB more than the bnb one):
https://huggingface.co/Comfy-Org/ltx-2/tree/main/split_files/text_encoders
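If you want to grab one of those programmatically, something like this works (a sketch using huggingface_hub; it lists the folder first so no filenames have to be guessed):

```python
# List the available text encoders in the repo, then download one.
# Sketch -- assumes huggingface_hub is installed; pick the variant that
# fits your VRAM and move it into ComfyUI/models/text_encoders/.
from huggingface_hub import hf_hub_download, list_repo_files

files = [f for f in list_repo_files("Comfy-Org/ltx-2")
         if f.startswith("split_files/text_encoders/")]
print(files)  # choose the variant you want

path = hf_hub_download(repo_id="Comfy-Org/ltx-2", filename=files[0])
print(path)   # downloaded into the local HF cache
```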

Nothing works. This time kijai failed.

> Nothing works. This time kijai failed.

Works like a charm for me.
What's not working?

> Nothing works. This time kijai failed.

With what exactly? I haven't had anything to do with Gemma3 GGUF. People have been working on Gemma3 support for ComfyUI-GGUF for a few days now, and it was only merged into the main version of the nodes today.

> Thanks for the hard work! Since I'm already using a quantized version of LTX, it makes sense to use a quantized version of Gemma as well. I am using this one, but I cannot get it to work.
>
> https://huggingface.co/unsloth/gemma-3-12b-it-bnb-4bit/tree/main
>
> [screenshot of the error]

You have to use the Gemma loader (pipeline/shards), not DualCLIPLoader. The model shards don't have a tokenizer "transferred" into a tensor the way the "standalone" text encoders (Gemma fp8, bf16, etc.) do.
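If you're unsure which kind of file you have, you can peek at the checkpoint keys. This is a rough sketch with the safetensors library; the file names below are placeholders and the exact key pattern for embedded tokenizer data is an assumption on my part:

```python
# Compare a standalone TE checkpoint against a raw model shard: the
# standalone file should carry extra non-weight entries (tokenizer data),
# while a shard should be weights only. Sketch -- file names are
# placeholders and the "token"/"spiece" key pattern is an assumption.
from safetensors import safe_open

for path in ("gemma_standalone_fp8.safetensors",      # placeholder name
             "model-00001-of-00005.safetensors"):     # placeholder shard
    with safe_open(path, framework="pt") as f:
        keys = list(f.keys())
        tok_keys = [k for k in keys
                    if "token" in k.lower() or "spiece" in k.lower()]
        print(f"{path}: {len(keys)} tensors, tokenizer-ish keys: {tok_keys}")
```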

You can find the workflow here: https://huggingface.co/Lightricks/LTX-2/discussions/20#696057578b31743540cb4112
