---
license: apache-2.0
base_model: Wan-AI/Wan2.2-T2V-A14B-Diffusers
pipeline_tag: text-to-video
library_name: mlx-gen
tags:
- mlx
- mlx-gen
- mflux
- apple-silicon
- 8-bit
- mixed-q8-bf16
- wan
- wan2.2
- video-generation
- text-to-video
- wan-a14b
---
# wan2.2-t2v-a14b-diffusers-8bit
This repository contains mixed q8/BF16 MLX-Gen saved weights for
[`Wan-AI/Wan2.2-T2V-A14B-Diffusers`](https://huggingface.co/Wan-AI/Wan2.2-T2V-A14B-Diffusers).
It is designed for local Apple Silicon inference with
[`mlx-gen`](https://github.com/lpalbou/mlx-gen).
It uses the mflux/MLX saved-weight layout with MLX quantization tensors, so it cannot be loaded as a Diffusers or Transformers
`from_pretrained()` checkpoint.
## Source Model
Original model: [`Wan-AI/Wan2.2-T2V-A14B-Diffusers`](https://huggingface.co/Wan-AI/Wan2.2-T2V-A14B-Diffusers).
This quantized derivative is distributed under the same Apache 2.0 license as the source model.
## Quantization
This is a mixed q8/BF16 checkpoint:
- q8 for quantizable Wan transformer block attention and feed-forward linears.
- BF16 for the Wan VAE.
- BF16 for Wan transformer conditioning/output projection linears, the UMT5 text encoder, norms, convolutions, and other non-quantizable parameters.
- Scheduler metadata and tokenizer files are not quantized.
This mixed policy is used because fully quantizing sensitive Wan A14B paths produced invalid or low-quality video in local validation.
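
The policy above amounts to a per-parameter decision between q8 and BF16. The following is a minimal sketch of that decision, not the mlx-gen implementation: the function `choose_precision`, the `is_linear` argument, and the parameter-name patterns (`blocks.N.attn...`, `blocks.N.ffn...`, the `vae.` and `text_encoder.` prefixes) are all assumptions made for illustration, and the names actually stored in this package may differ.

```python
import re

# Hypothetical name pattern for linears inside Wan transformer blocks.
# The real parameter names in the saved weights may differ.
_BLOCK_LINEAR = re.compile(r"(^|\.)blocks\.\d+\.(attn\d*|ffn)\..*\.weight$")


def choose_precision(name: str, is_linear: bool) -> str:
    """Return "q8" or "bf16" for one parameter under the mixed policy."""
    # Norms, convolutions, and other non-linear parameters stay in BF16.
    if not is_linear:
        return "bf16"
    # The VAE and the UMT5 text encoder stay in BF16 (assumed prefixes).
    if name.startswith(("vae.", "text_encoder.")):
        return "bf16"
    # Attention and feed-forward linears inside transformer blocks use q8.
    if _BLOCK_LINEAR.search(name):
        return "q8"
    # Conditioning and output projection linears fall through to BF16.
    return "bf16"


examples = [
    ("transformer.blocks.0.attn1.to_q.weight", True),
    ("transformer.blocks.3.ffn.net.0.proj.weight", True),
    ("transformer.proj_out.weight", True),
    ("transformer.blocks.0.norm2.weight", False),
    ("vae.decoder.conv_in.weight", False),
]
for name, is_linear in examples:
    print(f"{name}: {choose_precision(name, is_linear)}")
```

Defaulting to BF16 and listing only the q8 cases keeps any unrecognized parameter on the safer, higher-precision path.
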
## Validation
Measured on 2026-06-04 with `mlx-gen 0.18.9` on Apple Silicon. The upstream Diffusers source snapshot measured about 118 GiB in the local Hugging Face cache before preparing these packages. The table below reports prepared-package generation from model init through MP4 save and post-save video-health validation.
Validation profile: `384x224`, 33 frames, 12 denoising steps, guidance `4`, guidance-2 `3`, 8 fps, seed `4242`, `--low-ram`.
| Package | Disk | Full-Process Physical Peak | Max RSS | MLX Peak | Total Time | Video Health |
|---|---:|---:|---:|---:|---:|---|
| BF16 package | 64.3 GiB | 33.0 GiB | 31.8 GiB | 27.7 GiB | 152.7 s | 33/33 frames, 384x224, 8 fps, temporal delta 1.3 |
| This mixed q8/BF16 package | 39.7 GiB | 20.7 GiB | 19.5 GiB | 15.5 GiB | 154.8 s | 33/33 frames, 384x224, 8 fps, temporal delta 1.4 |
Compared with the BF16 prepared package at the same validation profile, this mixed q8/BF16 package reduces disk usage by about 38% and full-process physical peak memory by about 37%. Total time was about 1% slower in this run.
Physical peak is Darwin `ri_phys_footprint` sampled for the full process. The validation is intentionally small and repeatable; it is not a claim that every full-size `1280x720`, 81-frame, 40-step job has the same memory or timing profile.
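
The relative savings quoted above can be recomputed from the table. This short sketch copies the disk, physical-peak, and total-time figures from the two table rows and reports the signed percentage change of the mixed q8/BF16 package relative to the BF16 package; the dictionary keys and the helper `percent_change` are names chosen here for illustration.

```python
# Figures copied from the validation table above.
bf16 = {"disk_gib": 64.3, "phys_peak_gib": 33.0, "total_s": 152.7}
mixed = {"disk_gib": 39.7, "phys_peak_gib": 20.7, "total_s": 154.8}


def percent_change(baseline: float, value: float) -> float:
    """Signed change of value relative to baseline, in percent."""
    return (value - baseline) / baseline * 100.0


changes = {key: percent_change(bf16[key], mixed[key]) for key in bf16}

for key, pct in changes.items():
    print(f"{key}: {pct:+.1f}%")
```

Negative values are reductions, so the disk and memory entries come out near -38% and -37%, and total time near +1%.
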
## Usage
```bash
# Install or upgrade mlx-gen.
python -m pip install -U mlx-gen

# Download the prepared weights.
mlxgen download --model AbstractFramework/wan2.2-t2v-a14b-diffusers-8bit

# Generate a short clip using the validation profile described above.
mlxgen generate \
  --model AbstractFramework/wan2.2-t2v-a14b-diffusers-8bit \
  --task text-to-video \
  --prompt "A cinematic scene of a scientist working on agentic AI through the night, monitors glowing, papers shifting in a slow dolly shot." \
  --width 384 \
  --height 224 \
  --frames 33 \
  --steps 12 \
  --guidance 4 \
  --guidance-2 3 \
  --fps 8 \
  --seed 4242 \
  --low-ram \
  --metadata \
  --output video.mp4
```
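
To drive the same command from a script, the argument list can be assembled in Python and handed to `subprocess`. This is a sketch that only reuses the flags shown in the command above; the helper `build_generate_command` and its keyword names are hypothetical and are not part of mlx-gen.

```python
import shlex

MODEL = "AbstractFramework/wan2.2-t2v-a14b-diffusers-8bit"


def build_generate_command(prompt, output, *, width=384, height=224,
                           frames=33, steps=12, guidance=4, guidance_2=3,
                           fps=8, seed=4242, low_ram=True, metadata=True):
    """Return the argv list for `mlxgen generate`, using only flags shown above."""
    cmd = ["mlxgen", "generate", "--model", MODEL,
           "--task", "text-to-video", "--prompt", prompt]
    valued_flags = {
        "--width": width, "--height": height, "--frames": frames,
        "--steps": steps, "--guidance": guidance, "--guidance-2": guidance_2,
        "--fps": fps, "--seed": seed,
    }
    for flag, value in valued_flags.items():
        cmd += [flag, str(value)]
    if low_ram:
        cmd.append("--low-ram")
    if metadata:
        cmd.append("--metadata")
    cmd += ["--output", output]
    return cmd


cmd = build_generate_command("A slow dolly shot of a lab at night.", "video.mp4")
clip_seconds = 33 / 8  # frames divided by fps for the validation profile

print(shlex.join(cmd))
print(f"clip length: {clip_seconds} s")
# To run it (requires mlx-gen installed): subprocess.run(cmd, check=True)
```

Passing a list rather than a shell string avoids quoting problems with prompts that contain commas or quotes.
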
## Compatibility
Requires `mlx-gen >= 0.18.9`.
Generated with `mlx-gen 0.18.9`.
Use the `mlxgen` command and Python import path for new MLX-Gen projects.
## Attribution
MLX-Gen is based on [mflux](https://github.com/filipstrand/mflux) by Filip Strand and the original mflux contributors.
Quantized and contributed by [@lpalbou](https://huggingface.co/lpalbou).