LTX-2.3 22B Distilled (MLX, 4-bit quantized)

MLX-optimized 4-bit quantized weights for Lightricks/LTX-2.3 video generation model. Fits on 32GB Apple Silicon Macs.

Also available in full float16 precision (66GB, for 64GB+ Macs).

Model Details

  • Model: LTX-2.3 22B Distilled (joint audio-video diffusion transformer)
  • Format: MLX safetensors (4-bit quantized transformer, float16 VAE/text encoder)
  • Transformer Size: 11GB (down from 39GB, 3.6x compression)
  • Total Size: ~35GB
  • Minimum RAM: 32GB Apple Silicon Mac
  • Original: Lightricks/LTX-2.3

Quantization Details

  • Method: Post-training quantization via mlx.nn.quantize()
  • Bits: 4-bit
  • Group Size: 64
  • Quantized Layers: All Linear layers in the transformer (22B params)
  • Full Precision: VAE decoder, upsampler, text encoder, normalization layers, embeddings
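As a back-of-envelope check on the sizes above, here is a sketch of the storage cost of 4-bit group quantization, assuming (as in MLX's scheme) one fp16 scale and one fp16 bias per group of 64 weights. The helper name is ours, not part of any library; the result lands in the same ballpark as the published 11GB transformer size, which counts only the quantized Linear layers.

```python
# Back-of-envelope size estimate for 4-bit group quantization.
# Assumes each weight stores 4 bits, plus an fp16 scale and fp16 bias
# per group of 64 weights (hypothetical helper, illustrative only).

def quantized_size_gb(n_params: float, bits: int = 4, group_size: int = 64) -> float:
    packed = n_params * bits / 8                # packed weight bytes
    overhead = n_params / group_size * 2 * 2    # fp16 scale + bias per group
    return (packed + overhead) / 1e9

print(round(quantized_size_gb(22e9), 1))  # 12.4 (GB, before excluding unquantized layers)
```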

Benchmarks (M4 Max 128GB)

Head-to-head comparison on the same machine, same config (576x1024, 121 frames, 5s video):

|  | PyTorch MPS (BF16) | MLX Q4 (Video-only) | MLX FP16 (Video-only) |
|---|---|---|---|
| Stage 1 (8 steps, half-res) | 66.7s | 51.1s | 51.2s |
| Stage 2 (3 steps, full-res) | 157.9s | 101.3s | 101.3s |
| Total denoising | 264.0s | 154.3s | 152.5s |
| Speedup | 1.0x | 1.7x | 1.7x |
| Peak memory | >60 GB | 29.9 GB | 34.2 GB |
| Model size | 46 GB | 11 GB | 39 GB |

4-bit quantization achieves virtually identical speed to FP16 while reducing:

  • Model size: 39GB -> 11GB (3.6x smaller)
  • Peak memory: 34.2GB -> 29.9GB (fits on 32GB Macs)
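The speedup row is just the ratio of the PyTorch MPS and MLX Q4 timings; for reference, the per-stage ratios work out as:

```python
# Deriving per-stage speedups from the benchmark timings above.
stages = {
    "stage 1": (66.7, 51.1),
    "stage 2": (157.9, 101.3),
    "total":   (264.0, 154.3),
}
for name, (mps_bf16, mlx_q4) in stages.items():
    print(f"{name}: {mps_bf16 / mlx_q4:.2f}x")
# stage 1: 1.31x, stage 2: 1.56x, total: 1.71x
```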

Usage

```python
from mlx_ltx.pipeline import DistilledPipeline, save_video

pipeline = DistilledPipeline("path/to/mlx-weights-q4")
video = pipeline(
    prompt="A cat surfing on ocean waves at sunset",
    height=576, width=1024, num_frames=121,
    seed=42,
)
save_video(video, "output.mp4", fps=24.0)
```
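At 24 fps, 121 frames is just over five seconds of video. If, as in other LTX releases, valid frame counts follow the 8·k + 1 pattern (121 = 8·15 + 1), a small helper (hypothetical, not part of `mlx_ltx`) can pick a valid `num_frames` for a target duration:

```python
# Hypothetical helper: nearest valid LTX frame count for a duration,
# assuming frame counts must satisfy 8*k + 1 (e.g. 121 = 8*15 + 1).
def frames_for_duration(seconds: float, fps: float = 24.0) -> int:
    raw = round(seconds * fps)
    return (raw // 8) * 8 + 1

print(frames_for_duration(5.0))  # 121
```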

License

LTX-2 Community License
