LTX-2.3 22B Distilled (MLX, 4-bit quantized)

MLX-optimized 4-bit quantized weights for Lightricks/LTX-2.3 video generation model. Fits on 32GB Apple Silicon Macs.

Also available in full float16 precision (66GB, for 64GB+ Macs).

Model Details

  • Model: LTX-2.3 22B Distilled (joint audio-video diffusion transformer)
  • Format: MLX safetensors (4-bit quantized transformer, float16 VAE/text encoder)
  • Transformer Size: 11GB (down from 39GB, 3.6x compression)
  • Total Size: ~35GB
  • Minimum RAM: 32GB Apple Silicon Mac
  • Original: Lightricks/LTX-2.3

Quantization Details

  • Method: Post-training quantization via mlx.nn.quantize()
  • Bits: 4-bit
  • Group Size: 64
  • Quantized Layers: All Linear layers in the transformer (22B params)
  • Full Precision: VAE decoder, upsampler, text encoder, normalization layers, embeddings
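As a back-of-envelope check on the sizes above, here is a sketch of the storage cost of 4-bit group quantization, assuming (as in MLX's scheme) one fp16 scale and one fp16 bias per group of 64 weights. The helper name is ours, not part of any library; the result lands in the same ballpark as the published 11GB transformer size, which counts only the quantized Linear layers.

```python
# Back-of-envelope size estimate for 4-bit group quantization.
# Assumes each weight stores 4 bits, plus an fp16 scale and fp16 bias
# per group of 64 weights (hypothetical helper, illustrative only).

def quantized_size_gb(n_params: float, bits: int = 4, group_size: int = 64) -> float:
    packed = n_params * bits / 8                # packed weight bytes
    overhead = n_params / group_size * 2 * 2    # fp16 scale + bias per group
    return (packed + overhead) / 1e9

print(round(quantized_size_gb(22e9), 1))  # 12.4 (GB, before excluding unquantized layers)
```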

Benchmarks (M4 Max 128GB)

Head-to-head comparison on the same machine, same config (576x1024, 121 frames, 5s video):

|  | PyTorch MPS (BF16) | MLX Q4 (Video-only) | MLX FP16 (Video-only) |
|---|---|---|---|
| Stage 1 (8 steps, half-res) | 66.7s | 51.1s | 51.2s |
| Stage 2 (3 steps, full-res) | 157.9s | 101.3s | 101.3s |
| Total denoising | 264.0s | 154.3s | 152.5s |
| Speedup | 1.0x | 1.7x | 1.7x |
| Peak memory | >60 GB | 29.9 GB | 34.2 GB |
| Model size | 46 GB | 11 GB | 39 GB |

4-bit quantization achieves virtually identical speed to FP16 while reducing:

  • Model size: 39GB -> 11GB (3.6x smaller)
  • Peak memory: 34.2GB -> 29.9GB (fits on 32GB Macs)
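The speedup row is just the ratio of the PyTorch MPS and MLX Q4 timings; for reference, the per-stage ratios work out as:

```python
# Deriving per-stage speedups from the benchmark timings above.
stages = {
    "stage 1": (66.7, 51.1),
    "stage 2": (157.9, 101.3),
    "total":   (264.0, 154.3),
}
for name, (mps_bf16, mlx_q4) in stages.items():
    print(f"{name}: {mps_bf16 / mlx_q4:.2f}x")
# stage 1: 1.31x, stage 2: 1.56x, total: 1.71x
```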

Usage

```python
from mlx_ltx.pipeline import DistilledPipeline, save_video

pipeline = DistilledPipeline("path/to/mlx-weights-q4")
video = pipeline(
    prompt="A cat surfing on ocean waves at sunset",
    height=576, width=1024, num_frames=121,
    seed=42,
)
save_video(video, "output.mp4", fps=24.0)
```
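At 24 fps, 121 frames is just over five seconds of video. If, as in other LTX releases, valid frame counts follow the 8·k + 1 pattern (121 = 8·15 + 1), a small helper (hypothetical, not part of `mlx_ltx`) can pick a valid `num_frames` for a target duration:

```python
# Hypothetical helper: nearest valid LTX frame count for a duration,
# assuming frame counts must satisfy 8*k + 1 (e.g. 121 = 8*15 + 1).
def frames_for_duration(seconds: float, fps: float = 24.0) -> int:
    raw = round(seconds * fps)
    return (raw // 8) * 8 + 1

print(frames_for_duration(5.0))  # 121
```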

License

LTX-2 Community License
