09/03: all formats are done: both dev & distilled in nvfp4, nvfp4mixed, fp8, and fp8mixed.
07/03: ⚠️ make sure your torch (>=2.10) & CUDA (>=13.0) are up to date, otherwise you won't see a speed boost using nvfp4; it may even be slower ⚠️
05/03: Lightricks released an official fp8, probably better to use it instead of mine.
Various formats for LTX-2.3:
- fp8mixed: calibrated conversion, applying the same exclusion strategy as Lightricks (dev should be almost the same as the official Lightricks release)
- fp8: direct conversion from Lightricks weights
- nvfp4mixed: calibrated conversion, applying the same exclusion strategy as Lightricks
- nvfp4: calibrated conversion, targeting all transformer blocks; some weights are still kept in their native formats (embeddings, norms, biases, etc.)
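The "mixed" vs. plain split above comes down to which parameters get excluded from quantization. Here is a hypothetical sketch of such a filter (the substrings and function name are illustrative assumptions, not Lightricks' actual exclusion list): sensitive parameters like embeddings, norms, and biases stay in native precision, everything else in the transformer blocks is quantized.

```python
# Illustrative exclusion list; the real "mixed" recipe may differ.
EXCLUDE_SUBSTRINGS = ("embed", "norm", "bias")

def should_quantize(param_name: str) -> bool:
    """Decide whether a parameter gets the low-bit format (fp8/nvfp4)
    or is kept in its native dtype."""
    name = param_name.lower()
    return not any(token in name for token in EXCLUDE_SUBSTRINGS)
```

A conversion script would iterate over the state dict, quantizing only the weights for which `should_quantize` returns `True`.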
Run time
Config: Windows 11, NVIDIA RTX 5090, torch 2.10+cu130, SageAttention 2.2
- 8 steps, CFG 1
- default sampler/scheduler
- 960 x 544p, 121 frames, 24 fps
- x2 spatial upscale
- 4 refining steps
| format | distilled | dev + distill lora |
|---|---|---|
| nvfp4 | 54s | 57s |
| nvfp4mixed | TODO | 59s |
| fp8 (mine) | 61s | 61s |
| fp8 (Lightricks) | N/A | 63s |
| bf16 | 87s | 91s |
Hot run, prompt already processed.
The smaller formats look a bit "fuzzy" around the edges in the outputs, but motion, audio, and lighting still look good.