Gemma 3 4-bit support
This Space uses 4-bit Gemma and is very efficient. I tried several runs, T2V and I2V, and both were very good.
Can you migrate it to ComfyUI? It would be so cool; ComfyUI seems to have no 4-bit CLIP support.
https://huggingface.co/spaces/alexnasa/ltx-2-TURBO
unsloth/gemma-3-12b-it-qat-bnb-4bit
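For reference, a minimal sketch of loading that checkpoint outside ComfyUI with transformers + bitsandbytes (the repo name is from the post above; the quant type, compute dtype, and device map are my assumptions, not something stated in this thread):

```python
# Sketch only: load unsloth/gemma-3-12b-it-qat-bnb-4bit in 4-bit with
# bitsandbytes. Requires `transformers`, `bitsandbytes`, and a CUDA GPU.
QUANT_KWARGS = dict(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",       # nf4 is the bitsandbytes default (assumption)
    bnb_4bit_use_double_quant=True,  # assumption; saves a little extra memory
)

def load_model():
    # Not called here; the download is ~7 GB and needs a GPU.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    return AutoModelForCausalLM.from_pretrained(
        "unsloth/gemma-3-12b-it-qat-bnb-4bit",
        quantization_config=BitsAndBytesConfig(
            bnb_4bit_compute_dtype=torch.bfloat16, **QUANT_KWARGS
        ),
        device_map="auto",
    )
```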
For now, this one works (fp8); it's quite a lot smaller than the default one:
https://huggingface.co/GitMylo/LTX-2-comfy_gemma_fp8_e4m3fn/blob/main/gemma_3_12B_it_fp8_e4m3fn.safetensors
4-bit is half the size of fp8. I only use 4-bit or GGUF Q4; it's good enough.
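The size claim is easy to sanity-check with back-of-envelope math for a 12B-parameter model (weights only; real files add some overhead for quantization scales and metadata):

```python
# Rough weight-file sizes for a 12B-parameter model at different precisions.
# Ignores per-block quantization scales and metadata, so real files run
# slightly larger, especially for 4-bit and GGUF formats.
PARAMS = 12e9

def size_gb(bits_per_param: float) -> float:
    return PARAMS * bits_per_param / 8 / 1e9

print(f"fp16 : {size_gb(16):.0f} GB")  # 24 GB
print(f"fp8  : {size_gb(8):.0f} GB")   # 12 GB
print(f"4-bit: {size_gb(4):.0f} GB")   # 6 GB, i.e. half of fp8
```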
That may be true for some setups, but that's not true across the board. For example, I do not think e4m3fn will work with the Ampere (RTX3090) that I have.
I also have a 3090 and it works fine ;-) For torch.compile / Triton you need a later version where support for this was also added for the 3xxx series.
fp4 only works on 50xx cards. Good news: we finally got GGUF, smaller than int4.
I have a 4090 and it works fine for me. You need to make sure to install the comfy-kitchen dependency that Comfy added this week; it adds general NVFP4 kernels. The GGUFs still require a PR, I believe.
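To summarize the hardware picture being debated above, here is a rough map of which precisions have native tensor-core support per consumer generation, as I understand it (this is a simplification; software fallbacks can still make fp8 weights load and run on older cards, as reported for the 3090 above, and GGUF dequantizes on the fly so it runs on any CUDA GPU):

```python
# Rough map of NVIDIA consumer generations to natively supported low
# precisions. Assumption/summary for this thread, not an official table:
# formats missing from a set may still work through software emulation.
NATIVE_SUPPORT = {
    "RTX 30xx (Ampere, sm_86)":   {"fp16", "bf16", "int8"},
    "RTX 40xx (Ada, sm_89)":      {"fp16", "bf16", "int8", "fp8"},
    "RTX 50xx (Blackwell)":       {"fp16", "bf16", "int8", "fp8", "fp4"},
}

def has_native(gpu: str, fmt: str) -> bool:
    """True if `fmt` has hardware tensor-core support on `gpu`."""
    return fmt in NATIVE_SUPPORT.get(gpu, set())

print(has_native("RTX 30xx (Ampere, sm_86)", "fp8"))  # False: emulated on Ampere
print(has_native("RTX 50xx (Blackwell)", "fp4"))      # True: native on 50xx only
```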
How much memory does it use? Do you still get the benefit of reduced vram usage?
I think it's a bit of a "software emulation," since hardware support only arrived on later GPUs, but I'm not noticing it being significantly slower than the alternatives, or using much more RAM.
Most of my recent models are e4m3fn, and people uploading models also seem to just use e4m3fn by now, since support goes back even to the 3xxx series.
The 4090 can use fp4? However, fp4 is not half the fp8 size; it's still very big. GGUF is much better 😄