Where's the Q2_K?

#2
by Gonzaluigi - opened

We need an LTX-2 GGUF model, quantized to Q2_K, please.

Let 'em cook, he's the first to get this out. It's probably a process to compress both audio and video.

Q2 is too small. Q4 OK?

Q4 is perfect. Thanks!

- [ltx-2-19b.safetensors](https://huggingface.co/Lightricks/LTX-2/resolve/main/ltx-2-19b.safetensors)
- ltx-2-spatial-upscaler-x2-1.0.safetensors

## Text Encoder
- Google Gemma 3

## Model Storage Location

```
📂 ComfyUI/
├── 📂 models/
│   ├── 📂 text_encoders/
│   │   └── comfy_gemma_3_12B_it.safetensors
│   ├── 📂 checkpoints/
│   │   └── ltx-2-19b.safetensors
│   └── 📂 latent_upscale_models/
│       └── ltx-2-spatial-upscaler-x2-1.0.safetensors
```

## Report Issues
To report any issues when running this workflow: https://huggingface.co/MachineDelusions/LTX-2_ComfyUI_Workflows/tree/main
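
(For reference, a minimal sketch of pulling the two Lightricks files above into that layout with `huggingface_hub`, assuming it is run from the ComfyUI root; the text-encoder repo isn't named here, so it is left out.)

```python
from huggingface_hub import hf_hub_download

# Main checkpoint into models/checkpoints/
hf_hub_download(
    repo_id="Lightricks/LTX-2",
    filename="ltx-2-19b.safetensors",
    local_dir="models/checkpoints",
)

# Spatial upscaler into models/latent_upscale_models/
hf_hub_download(
    repo_id="Lightricks/LTX-2",
    filename="ltx-2-spatial-upscaler-x2-1.0.safetensors",
    local_dir="models/latent_upscale_models",
)
```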

Hey, can I run Q4 on 16 GB VRAM? And do these work: https://huggingface.co/unsloth/gemma-3-12b-it-GGUF/blob/main/gemma-3-12b-it-Q2_K.gguf

GGUF text encoders for LTX-2 don't work for me yet. Tried a couple of nodes...

For Gemma 3, use my other 4-bit repo; it runs in 12 GB VRAM.
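
(As a rough illustration of what a 4-bit Gemma 3 load looks like: a minimal sketch with transformers + bitsandbytes, which is what typically brings a 12B model into the ~12 GB range. The repo id below is the stock Google one as a placeholder, since the poster's "other 4-bit repo" isn't linked in the thread.)

```python
import torch
from transformers import AutoTokenizer, BitsAndBytesConfig, Gemma3ForConditionalGeneration

# 4-bit NF4 quantization config; compute in bf16
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tok = AutoTokenizer.from_pretrained("google/gemma-3-12b-it")  # placeholder repo
model = Gemma3ForConditionalGeneration.from_pretrained(
    "google/gemma-3-12b-it",  # swap in the 4-bit repo mentioned above
    quantization_config=bnb,
    device_map="auto",
)
```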

Do you recommend the default workflow? Will it work on my PC?

Good news, KJ has separated the models: https://huggingface.co/Kijai/LTXV2_comfy

Yeah, but that repo is for diffusers testing, not the original ComfyUI workflows.

I'm trying to make it run in lower VRAM, maybe with mmgp or other quant methods.

It's meant for ComfyUI, like this: https://cdn-uploads.huggingface.co/production/uploads/63297908f0b2fc94904a65b8/_H9tgIxAEdiIY7Lzdil3R.png

Also, can we get the Q4_K_M variant please?

I am currently testing the diffusers pipeline with the GGUF (Q6_K). There are still some bugs in the pipeline code.
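
(For anyone wanting to reproduce that kind of test, a minimal sketch of how diffusers loads a GGUF transformer via `from_single_file` with `GGUFQuantizationConfig`, following the pattern used for other single-file video models. Note the assumptions: `LTXVideoTransformer3DModel` is the LTX-Video class and may not be the right one for LTX-2, and the file path is a placeholder.)

```python
import torch
from diffusers import GGUFQuantizationConfig, LTXVideoTransformer3DModel

# Placeholder path and class: LTX-2 may require a different transformer
# class than the LTX-Video LTXVideoTransformer3DModel used here.
transformer = LTXVideoTransformer3DModel.from_single_file(
    "ltx-2-19b-Q6_K.gguf",
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)
```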

Q4_K will take some time.

@smthem can I add you on Discord?

I see that you are from the QuantStack org, so I would recommend making the GGUFs from the weights at https://huggingface.co/Kijai/LTXV2_comfy, as those are the official Comfy repackaged versions without the VAEs/vocoder attached (Kijai works for Comfy).

I managed to quantize and run it already. I had actually reached out to Kijai about how to load the text encoder, and he shared that exact repository with me. Regarding the VAE, I had already considered that: in my case I use a single VAE, and the node I built handles the separation automatically.

I just wanted to add you on Discord haha.
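
(For anyone curious what the conversion side of "quantize and run it" involves: a minimal sketch of dumping a safetensors state dict into an unquantized GGUF file with the `gguf` pip package. The arch string "ltxv" and the filenames are assumptions; the actual Q4_K/Q6_K quantization is done by separate tooling such as the ComfyUI-GGUF convert/quantize scripts, not shown here.)

```python
import gguf
import torch
from safetensors.torch import load_file

sd = load_file("ltx-2-19b.safetensors")  # placeholder input file

# "ltxv" is an assumed arch string, not a registered llama.cpp architecture
writer = gguf.GGUFWriter("ltx-2-19b-F16.gguf", "ltxv")
for name, tensor in sd.items():
    # GGUFWriter expects numpy arrays; cast to fp16 first
    writer.add_tensor(name, tensor.to(torch.float16).numpy())

writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.write_tensors_to_file()
writer.close()
```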

Wouldn't this make the ComfyUI-GGUF custom nodes harder for you to maintain, since you'd be adding a whole separate node only for LTX? It seems better to just use the nodes already in ComfyUI-GGUF instead of making separate ones only for this model. Seems counter-intuitive.

Side note: I am not smthem.

Kijai shared code with me to strip the extra components and keep only the transformer from the LTX checkpoint:

```python
from safetensors.torch import save_file
import safetensors.torch
import torch
import json
import gc

def load_file(path):
    print(f"Loading {path}...")
    if not path.endswith(".safetensors"):
        loaded = torch.load(path)
    else:
        loaded = safetensors.torch.load_file(path)
    return loaded

model_type = "ltx-2-19b-distilled-fp8"

file_path = "ltx-2-19b-distilled-fp8.safetensors"
current_sd = load_file(file_path)

# Load original metadata
original_metadata = {}
with safetensors.torch.safe_open(file_path, framework="pt") as f:
    original_metadata = f.metadata() or {}

if "config" in original_metadata:
    config = json.loads(original_metadata["config"])
    if "transformer" in config and "scheduler" in config:
        # Keep only the transformer/scheduler entries in the config metadata
        filtered_config = {
            "transformer": config["transformer"],
            "scheduler": config["scheduler"],
        }
        original_metadata["config"] = json.dumps(filtered_config)
    else:
        print("transformer/scheduler not found in config")
else:
    print(json.dumps(original_metadata, indent=4))

if "_quantization_metadata" in original_metadata:
    del original_metadata["_quantization_metadata"]

quant_conf = {
    "format": "float8_e4m3fn",
}

# Keep only transformer weights ("model." prefix), drop the text-encoder
# embeddings connector, and for each fp8 weight scale add a matching
# ".comfy_quant" tensor that encodes the quant config as utf-8 bytes
sd_pruned = dict()
for k, v in current_sd.items():
    if k.startswith("model.") and "embeddings_connector" not in k:
        if k not in sd_pruned:
            sd_pruned[k] = v
        if "weight_scale" in k:
            sd_pruned[k.replace(".weight_scale", ".comfy_quant")] = torch.tensor(
                list(json.dumps(quant_conf).encode("utf-8")), dtype=torch.uint8
            )

# Clear memory
del current_sd
gc.collect()
torch.cuda.empty_cache()

for k, v in sd_pruned.items():
    if isinstance(v, torch.Tensor):
        print(f"{k}: {v.shape} {v.dtype}")
    else:
        print(f"{k}: {type(v)}")

save_file(sd_pruned, f"{model_type}_.safetensors", metadata=original_metadata)
```
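
(A quick way to sanity-check the pruned output; this is my addition, not part of Kijai's snippet: open it with `safe_open` and confirm only `model.` keys and the slimmed config metadata remain.)

```python
import json
from safetensors import safe_open

# Filename follows the f"{model_type}_.safetensors" pattern from the script above
with safe_open("ltx-2-19b-distilled-fp8_.safetensors", framework="pt") as f:
    keys = list(f.keys())
    meta = f.metadata() or {}

assert all(k.startswith("model.") for k in keys), "unexpected non-transformer keys"
print(f"{len(keys)} tensors kept")
print(list(json.loads(meta["config"]).keys()))  # expect ['transformer', 'scheduler']
```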

Hi! Actually, this is just for personal use; you could simply use Kijai's solution. It was something I already had ready for myself since yesterday, because I wanted to use the model in a split format instead of the unified version that was published. However, I was stuck on loading the text encoder until I asked Kijai and he shared his repository (the one already mentioned here).

Regarding the custom node, you are correct: you don't need to update the custom node to run the LTX-2 GGUF. For now, it would only need updating once support for Gemma 3 GGUF is added.

Nice, thanks for all the GGUF models btw, hope we see LTX-2 GGUFs soon ❤️
