Where's the Q2_K?

#2
by Gonzaluigi - opened

We need an LTX-2 GGUF model, quantized to Q2_K, please.

Let 'em cook, he's the first to get this out. It's probably a process to compress both audio and video.

Q2 is too small. Q4 OK?

Q4 is perfect. Thanks!

- [ltx-2-19b.safetensors](https://huggingface.co/Lightricks/LTX-2/resolve/main/ltx-2-19b.safetensors)
- ltx-2-spatial-upscaler-x2-1.0.safetensors

## Text Encoder
- Google Gemma 3

## Model Storage Location

```
📂 ComfyUI/
├── 📂 models/
│   ├── 📂 text_encoders/
│   │   └── comfy_gemma_3_12B_it.safetensors
│   ├── 📂 checkpoints/
│   │   └── ltx-2-19b.safetensors
│   └── 📂 latent_upscale_models/
│       └── ltx-2-spatial-upscaler-x2-1.0.safetensors
```

## Report Issues
To report any issues when running this workflow: https://huggingface.co/MachineDelusions/LTX-2_ComfyUI_Workflows/tree/main
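
(For reference, a minimal sketch of pulling the two Lightricks files above into that layout with `huggingface_hub`, assuming it is run from the ComfyUI root; the text-encoder repo isn't named here, so it is left out.)

```python
from huggingface_hub import hf_hub_download

# Main checkpoint into models/checkpoints/
hf_hub_download(
    repo_id="Lightricks/LTX-2",
    filename="ltx-2-19b.safetensors",
    local_dir="models/checkpoints",
)

# Spatial upscaler into models/latent_upscale_models/
hf_hub_download(
    repo_id="Lightricks/LTX-2",
    filename="ltx-2-spatial-upscaler-x2-1.0.safetensors",
    local_dir="models/latent_upscale_models",
)
```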

Hey, can I run Q4 on 16 GB VRAM? And do these work: https://huggingface.co/unsloth/gemma-3-12b-it-GGUF/blob/main/gemma-3-12b-it-Q2_K.gguf

GGUF text encoders for LTX-2 don't work for me yet. Tried a couple of nodes...

For Gemma 3, use my other 4-bit repo; it runs in 12 GB VRAM.
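
(As a rough illustration of what a 4-bit Gemma 3 load looks like: a minimal sketch with transformers + bitsandbytes, which is what typically brings a 12B model into the ~12 GB range. The repo id below is the stock Google one as a placeholder, since the poster's "other 4-bit repo" isn't linked in the thread.)

```python
import torch
from transformers import AutoTokenizer, BitsAndBytesConfig, Gemma3ForConditionalGeneration

# 4-bit NF4 quantization config; compute in bf16
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tok = AutoTokenizer.from_pretrained("google/gemma-3-12b-it")  # placeholder repo
model = Gemma3ForConditionalGeneration.from_pretrained(
    "google/gemma-3-12b-it",  # swap in the 4-bit repo mentioned above
    quantization_config=bnb,
    device_map="auto",
)
```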

Do you recommend the default workflow? Will it work on my PC?

Good news, KJ has separated the models: https://huggingface.co/Kijai/LTXV2_comfy

Yeah, but that repo is for diffusers testing, not the original ComfyUI workflows.

I'm trying to make it run in lower VRAM, maybe with mmgp or other quant methods.

It's meant for ComfyUI, like this: https://cdn-uploads.huggingface.co/production/uploads/63297908f0b2fc94904a65b8/_H9tgIxAEdiIY7Lzdil3R.png

Also, can we get the Q4_K_M variant please?

I am currently testing the diffusers pipeline with the GGUF (Q6_K). There are still some bugs in the pipeline code.
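
(For anyone wanting to reproduce that kind of test, a minimal sketch of how diffusers loads a GGUF transformer via `from_single_file` with `GGUFQuantizationConfig`, following the pattern used for other single-file video models. Note the assumptions: `LTXVideoTransformer3DModel` is the LTX-Video class and may not be the right one for LTX-2, and the file path is a placeholder.)

```python
import torch
from diffusers import GGUFQuantizationConfig, LTXVideoTransformer3DModel

# Placeholder path and class: LTX-2 may require a different transformer
# class than the LTX-Video LTXVideoTransformer3DModel used here.
transformer = LTXVideoTransformer3DModel.from_single_file(
    "ltx-2-19b-Q6_K.gguf",
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)
```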

Q4_K will take some time.

@smthem can I add you on Discord?

I see that you are from the QuantStack org, so I would recommend making the GGUFs from the weights at https://huggingface.co/Kijai/LTXV2_comfy, as those are the official Comfy repackaged versions without the VAEs/vocoder attached (Kijai works for Comfy).

I managed to quantize and run it already. I had actually reached out to Kijai about how to load the text encoder, and he shared that exact repository with me. Regarding the VAE, I had already considered that: in my case I use a single VAE, and the node I built handles the separation automatically.

I just wanted to add you on Discord haha.
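
(For anyone curious what the conversion side of "quantize and run it" involves: a minimal sketch of dumping a safetensors state dict into an unquantized GGUF file with the `gguf` pip package. The arch string "ltxv" and the filenames are assumptions; the actual Q4_K/Q6_K quantization is done by separate tooling such as the ComfyUI-GGUF convert/quantize scripts, not shown here.)

```python
import gguf
import torch
from safetensors.torch import load_file

sd = load_file("ltx-2-19b.safetensors")  # placeholder input file

# "ltxv" is an assumed arch string, not a registered llama.cpp architecture
writer = gguf.GGUFWriter("ltx-2-19b-F16.gguf", "ltxv")
for name, tensor in sd.items():
    # GGUFWriter expects numpy arrays; cast to fp16 first
    writer.add_tensor(name, tensor.to(torch.float16).numpy())

writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.write_tensors_to_file()
writer.close()
```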

Wouldn't this make the ComfyUI-GGUF custom nodes harder for you to maintain, since you'd be adding a whole separate node only for LTX? It seems better to just use the nodes already in ComfyUI-GGUF instead of making separate ones only for this model. Seems counter-intuitive.

Side note: I am not smthem.

Kijai shared code with me to strip the extra components and keep only the transformer from the LTX checkpoint:

```python
from safetensors.torch import save_file
import safetensors.torch
import torch
import json
import gc

def load_file(path):
    print(f"Loading {path}...")
    if not path.endswith(".safetensors"):
        loaded = torch.load(path)
    else:
        loaded = safetensors.torch.load_file(path)
    return loaded

model_type = "ltx-2-19b-distilled-fp8"

file_path = "ltx-2-19b-distilled-fp8.safetensors"
current_sd = load_file(file_path)

# Load original metadata
original_metadata = {}
with safetensors.torch.safe_open(file_path, framework="pt") as f:
    original_metadata = f.metadata() or {}

if "config" in original_metadata:
    config = json.loads(original_metadata["config"])
    if "transformer" in config and "scheduler" in config:
        # Keep only the transformer/scheduler entries in the config metadata
        filtered_config = {
            "transformer": config["transformer"],
            "scheduler": config["scheduler"],
        }
        original_metadata["config"] = json.dumps(filtered_config)
    else:
        print("transformer/scheduler not found in config")
else:
    print(json.dumps(original_metadata, indent=4))

if "_quantization_metadata" in original_metadata:
    del original_metadata["_quantization_metadata"]

quant_conf = {
    "format": "float8_e4m3fn",
}

# Keep only transformer weights ("model." prefix), drop the text-encoder
# embeddings connector, and for each fp8 weight scale add a matching
# ".comfy_quant" tensor that encodes the quant config as utf-8 bytes
sd_pruned = dict()
for k, v in current_sd.items():
    if k.startswith("model.") and "embeddings_connector" not in k:
        if k not in sd_pruned:
            sd_pruned[k] = v
        if "weight_scale" in k:
            sd_pruned[k.replace(".weight_scale", ".comfy_quant")] = torch.tensor(
                list(json.dumps(quant_conf).encode("utf-8")), dtype=torch.uint8
            )

# Clear memory
del current_sd
gc.collect()
torch.cuda.empty_cache()

for k, v in sd_pruned.items():
    if isinstance(v, torch.Tensor):
        print(f"{k}: {v.shape} {v.dtype}")
    else:
        print(f"{k}: {type(v)}")

save_file(sd_pruned, f"{model_type}_.safetensors", metadata=original_metadata)
```
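
(A quick way to sanity-check the pruned output; this is my addition, not part of Kijai's snippet: open it with `safe_open` and confirm only `model.` keys and the slimmed config metadata remain.)

```python
import json
from safetensors import safe_open

# Filename follows the f"{model_type}_.safetensors" pattern from the script above
with safe_open("ltx-2-19b-distilled-fp8_.safetensors", framework="pt") as f:
    keys = list(f.keys())
    meta = f.metadata() or {}

assert all(k.startswith("model.") for k in keys), "unexpected non-transformer keys"
print(f"{len(keys)} tensors kept")
print(list(json.loads(meta["config"]).keys()))  # expect ['transformer', 'scheduler']
```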

Hi! Actually, this is just for personal use; you could simply use Kijai's solution. It was something I already had ready for myself since yesterday, because I wanted to use the model in a split format instead of the unified version that was published. However, I was stuck on loading the text encoder until I asked Kijai and he shared his repository (the one already mentioned here).

Regarding the custom node, you are correct: you don't need to update the custom node to run the LTX-2 GGUF. For now, it would only need updating once support for Gemma 3 GGUF is added.

Nice, thanks for all the GGUF models btw, hope we see LTX-2 GGUFs soon ❤️
