Text-to-Video
Safetensors
MLX
Wan2.2
mlx-gen
mflux
apple-silicon
8-bit precision
mixed-q8-bf16
wan
video-generation
wan-a14b
How to use AbstractFramework/wan2.2-t2v-a14b-diffusers-8bit with MLX:
# Download the model from the Hub
pip install "huggingface_hub[hf_xet]"
huggingface-cli download --local-dir wan2.2-t2v-a14b-diffusers-8bit AbstractFramework/wan2.2-t2v-a14b-diffusers-8bit
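The same download can also be sketched from Python. This assumes the `huggingface_hub` package is installed; `snapshot_download` with `repo_id` and `local_dir` is that library's documented call, while the helper name `download_package` and its default directory are illustrative choices, not something the model card prescribes.

```python
from pathlib import Path

# Repository id as given on this page.
REPO_ID = "AbstractFramework/wan2.2-t2v-a14b-diffusers-8bit"


def download_package(local_dir: str = "wan2.2-t2v-a14b-diffusers-8bit") -> Path:
    """Fetch the full repository snapshot into `local_dir` and return its path.

    Hypothetical helper for illustration only. The import is kept inside the
    function so that merely defining the helper does not require the package.
    """
    from huggingface_hub import snapshot_download

    # snapshot_download returns the local folder that holds the files.
    return Path(snapshot_download(repo_id=REPO_ID, local_dir=local_dir))
```

Calling `download_package()` writes the snapshot to the working directory; the table in the diff lists this package at about 40 GiB, so check free disk space first.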
Put validation values directly in comparison table
README.md
CHANGED
@@ -51,20 +51,14 @@ Bottom line:
 - The BF16 package reduces storage, not runtime memory.
 - This mixed q8/BF16 package reduces both storage and runtime memory. This is the package to use when generation memory footprint matters.

-| Layout | Disk |
-| --- | ---: | --- | --- |
-| Original source snapshot | 118 GiB |
-| BF16 package | 64 GiB |
-| This mixed q8/BF16 package | 40 GiB |
+| Layout | Disk | MLX Peak | Physical Peak | Time | Result |
+| --- | ---: | ---: | ---: | ---: | --- |
+| Original source snapshot | 118 GiB | 32.99 GiB | 48.90 GiB | 108.31 s | Baseline. |
+| BF16 package | 64 GiB | 32.98 GiB | 45.12 GiB | 114.39 s | Storage only; output was byte-identical. |
+| This mixed q8/BF16 package | 40 GiB | 20.84 GiB | 31.75 GiB | 110.34 s | Storage and memory; side-by-side quality validation passed. |

 Compared with the original source snapshot, this mixed q8/BF16 package cuts disk usage by about 66%, MLX peak memory by about 37%, and physical peak memory by about 35% in this validation run. It is not byte-identical to BF16, but the validation contact sheet stayed in the same visual family. The prepared q8/BF16 output was byte-identical to running `--quantize 8` from the upstream source snapshot.

-Raw measurements:
-
-- Original source snapshot: 32.99 GiB MLX peak, 48.90 GiB physical peak, 108.31 s.
-- BF16 package: 32.98 GiB MLX peak, 45.12 GiB physical peak, 114.39 s.
-- This mixed q8/BF16 package: 20.84 GiB MLX peak, 31.75 GiB physical peak, 110.34 s.
-
 ## Compatibility

 Requires `mlx-gen >= 0.18.8`.
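As a quick check on this change, the percentage reductions quoted in the README paragraph (about 66%, 37%, and 35%) can be recomputed from the values now placed in the comparison table. The sketch below copies those numbers by hand; the dictionary and function names are illustrative and are not part of `mlx-gen`.

```python
# Values copied from the comparison table in this diff, in GiB.
BASELINE = {"disk": 118.0, "mlx_peak": 32.99, "physical_peak": 48.90}
MIXED_Q8 = {"disk": 40.0, "mlx_peak": 20.84, "physical_peak": 31.75}


def percent_reduction(before: float, after: float) -> float:
    """Return how much smaller `after` is than `before`, in percent."""
    return (before - after) / before * 100.0


# Round to whole percent, matching the "about N%" wording in the README.
reductions = {
    key: round(percent_reduction(BASELINE[key], MIXED_Q8[key]))
    for key in BASELINE
}
print(reductions)  # {'disk': 66, 'mlx_peak': 37, 'physical_peak': 35}
```

The rounded results agree with the README text, so the figures in the table and the prose are consistent with each other.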