About Aspect ratio/resolutions and distillation steps/LORAs

#24

by Privac - opened 6 days ago

The original LTX 2.3 model uses the following (cit. https://huggingface.co/spaces/Lightricks/LTX-2-3-hdr/blob/main/app.py, lines 114-120):

    # Frames must satisfy (n-1) % 8 == 0. Aspect-ratio canvas sizes (divisible by 32).
    RESOLUTIONS = {
        "low": {"16:9": (768, 512), "9:16": (512, 768), "1:1": (768, 768),
                "4:3": (768, 576), "3:4": (576, 768), "21:9": (768, 384)},
        "high": {"16:9": (1536, 1024), "9:16": (1024, 1536), "1:1": (1024, 1024),
                 "4:3": (1536, 1152), "3:4": (1152, 1536), "21:9": (1536, 768)},
    }

Does this finetune/retraining use the same? Also, the distilled variant of the original model uses specific sigmas for inference; are they still the same? One last question: the LoRA weights seem to work slightly differently when applied to a GGUF version compared to BF16/FP8. Is that true, or am I mistaken?

Privac changed discussion title from About Aspect ratio/resolutions and distillation steps to About Aspect ratio/resolutions and distillation steps/LORAs 6 days ago

TenStrip

Owner 5 days ago

All of their defaults are more like guidelines; the model can do any resolution that is a multiple of 32, and clips up to 40 seconds long. Their sigmas are for a full 384 distilled base model. You get much better results in these with a lesser distilled influence and more quality-oriented sigmas, like in my workflow (13-step sigmas on the first pass). You probably want at least 10 steps. Most people running 8-step workflows describe constant quality issues, and the fix is simply to add ~20s of first-stage sampling for a night-and-day quality gain.

plz12345

5 days ago

Also, some SDKs like ltx-2-mlx will round the dimension numbers to the closest lower number divisible by 32. I've yet to hit an error on the 8n+1 rule, but yeah, I think both are strongly encouraged guidelines to ensure good output.
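For anyone who wants to apply the same rounding before submitting a job, here is a minimal sketch. The function names are hypothetical and are not taken from ltx-2-mlx or any other SDK; it only assumes the two rules quoted in this thread (dimensions divisible by 32, frame counts satisfying (n-1) % 8 == 0) and rounds down in both cases.

```python
def snap_dimension(value: int, multiple: int = 32) -> int:
    """Round a width or height down to the nearest multiple (hypothetical helper).

    The result never drops below one full multiple.
    """
    return max(multiple, (value // multiple) * multiple)


def snap_frames(frames: int, stride: int = 8) -> int:
    """Round a frame count down to the nearest n with (n - 1) % stride == 0.

    The result never drops below a single frame.
    """
    return max(1, ((frames - 1) // stride) * stride + 1)


if __name__ == "__main__":
    width, height = snap_dimension(1000), snap_dimension(570)
    frames = snap_frames(100)
    print(width, height, frames)
```

Values that already satisfy the rules pass through unchanged, so it should be safe to run this on every request.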

TenStrip

Owner 5 days ago

The frame slider in the workflow moves in steps of 24, with the +1 math applied behind it, and the dimension inputs all use a step of 32.
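A rough sketch of what that slider math amounts to, assuming the slider value is a multiple of 24 and the workflow adds 1 to get the actual frame count (the function name is hypothetical and not part of the workflow). Because 24 is a multiple of 8, every result automatically satisfies the (n-1) % 8 == 0 rule.

```python
def frames_from_slider(slider_value: int, step: int = 24) -> int:
    """Convert a slider value into a frame count (hypothetical helper).

    Assumes the slider only takes positive multiples of `step` and that
    the frame count is the slider value plus one.
    """
    if slider_value <= 0 or slider_value % step != 0:
        raise ValueError(f"slider value must be a positive multiple of {step}")
    return slider_value + 1


if __name__ == "__main__":
    for slider in (24, 120, 240):
        frames = frames_from_slider(slider)
        # Each result keeps (n - 1) divisible by 8 because 24 = 3 * 8.
        print(slider, frames, (frames - 1) % 8 == 0)
```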
