# LTX-2.3 22B Distilled (MLX, 4-bit quantized)

MLX-optimized 4-bit quantized weights for the Lightricks/LTX-2.3 video generation model. Fits on 32GB Apple Silicon Macs.

Also available in full float16 precision (66GB, for 64GB+ Macs).
## Model Details
- Model: LTX-2.3 22B Distilled (joint audio-video diffusion transformer)
- Format: MLX safetensors (4-bit quantized transformer, float16 VAE/text encoder)
- Transformer Size: 11GB (down from 39GB, 3.6x compression)
- Total Size: ~35GB
- Minimum RAM: 32GB Apple Silicon Mac
- Original: Lightricks/LTX-2.3
## Quantization Details

- Method: Post-training quantization via `mlx.nn.quantize()`
- Bits: 4
- Group Size: 64
- Quantized Layers: All Linear layers in the transformer (22B params)
- Full Precision: VAE decoder, upsampler, text encoder, normalization layers, embeddings
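The 3.6x compression figure follows from the group format: each group of 64 weights stores 4-bit codes plus (in MLX's affine scheme) a float16 scale and bias per group, for an effective 4 + 2·16/64 = 4.5 bits per weight versus 16 for float16. A minimal pure-Python sketch of affine group quantization, purely illustrative (the real work is done inside `mlx.nn.quantize()`; the helper names here are hypothetical):

```python
def quantize_group(weights, bits=4, group_size=64):
    """Affine group quantization: per-group scale/bias plus low-bit codes."""
    levels = (1 << bits) - 1  # 15 distinct codes for 4-bit
    codes, scales, biases = [], [], []
    for i in range(0, len(weights), group_size):
        group = weights[i:i + group_size]
        lo, hi = min(group), max(group)
        scale = (hi - lo) / levels or 1.0  # avoid div-by-zero on constant groups
        scales.append(scale)
        biases.append(lo)
        codes.extend(round((w - lo) / scale) for w in group)
    return codes, scales, biases

def dequantize(codes, scales, biases, group_size=64):
    """Reconstruct approximate weights: code * scale + bias per group."""
    return [c * scales[i // group_size] + biases[i // group_size]
            for i, c in enumerate(codes)]

# Effective storage cost: 4-bit code per weight + one fp16 scale and bias per group
bits_per_weight = 4 + 2 * 16 / 64   # = 4.5 bits
print(16 / bits_per_weight)          # ~3.6x smaller than float16
```

Quantization error per weight is bounded by half the group's scale, which is why keeping sensitive layers (VAE, norms, embeddings) in full precision costs little but preserves quality.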
## Benchmarks (M4 Max 128GB)

Head-to-head comparison on the same machine, same config (576x1024, 121 frames, 5s video):

| | PyTorch MPS (BF16) | MLX Q4 (video-only) | MLX FP16 (video-only) |
|---|---|---|---|
| Stage 1 (8 steps, half-res) | 66.7s | 51.1s | 51.2s |
| Stage 2 (3 steps, full-res) | 157.9s | 101.3s | 101.3s |
| Total denoising | 264.0s | 154.3s | 152.5s |
| Speedup | 1.0x | 1.7x | 1.7x |
| Peak memory | >60 GB | 29.9 GB | 34.2 GB |
| Model size | 46 GB | 11 GB | 39 GB |
4-bit quantization achieves virtually identical speed to FP16 while reducing:
- Model size: 39GB -> 11GB (3.6x smaller)
- Peak memory: 34.2GB -> 29.9GB (fits on 32GB Macs)
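The speedup row is simply the ratio of total denoising times from the table; a quick sanity check of those numbers:

```python
# Total denoising times from the benchmark table (seconds)
pytorch_mps = 264.0
mlx_q4 = 154.3
mlx_fp16 = 152.5

print(round(pytorch_mps / mlx_q4, 1))    # 1.7x speedup for Q4
print(round(pytorch_mps / mlx_fp16, 1))  # 1.7x speedup for FP16
```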
## Usage

```python
from mlx_ltx.pipeline import DistilledPipeline, save_video

pipeline = DistilledPipeline("path/to/mlx-weights-q4")
video = pipeline(
    prompt="A cat surfing on ocean waves at sunset",
    height=576, width=1024, num_frames=121,
    seed=42,
)
save_video(video, "output.mp4", fps=24.0)
```
## Model tree for gajesh/LTX-2.3-mlx-q4

- Base model: Lightricks/LTX-2.3