---
license: apache-2.0
base_model: Wan-AI/Wan2.2-TI2V-5B-Diffusers
pipeline_tag: text-to-video
library_name: mlx-gen
tags:
- mlx
- mlx-gen
- mflux
- apple-silicon
- bf16
- wan
- wan2.2
- video-generation
- text-to-video
- image-to-video
---

# wan2.2-ti2v-5b-diffusers-bf16

This repository contains BF16 MLX-Gen saved weights for [`Wan-AI/Wan2.2-TI2V-5B-Diffusers`](https://huggingface.co/Wan-AI/Wan2.2-TI2V-5B-Diffusers). It is designed for local Apple Silicon inference with [`mlx-gen`](https://github.com/lpalbou/mlx-gen).

It uses the mflux/MLX saved-weight layout. It is not a Diffusers or Transformers `from_pretrained()` checkpoint.

## Source Model

Original model: [`Wan-AI/Wan2.2-TI2V-5B-Diffusers`](https://huggingface.co/Wan-AI/Wan2.2-TI2V-5B-Diffusers).

This prepared derivative follows the Apache 2.0 license of the source model.

## Precision

The upstream TI2V-5B source snapshot is not uniformly 16-bit on disk: the transformer and VAE safetensors are FP32, while the UMT5 text encoder is BF16. MLX-Gen loads Wan transformer/VAE weights at BF16 runtime precision, so this prepared BF16 package reduces storage and download size but is not a runtime-memory optimization versus source generation.

Use this package when you want a smaller reusable MLX-Gen folder that preserves source behavior. Use the mixed q8/BF16 package when you want a smaller model footprint.

## Measurements

Measured on 2026-06-04 with `mlx-gen 0.18.10` on an Apple M5 Max with 128 GiB unified memory.

Validation profile: `1280x704`, 17 frames, 20 denoising steps, guidance `5`, 24 fps, seed `321`, explicit empty negative prompt.

| Layout | Storage | Logical Model | Full-Process Physical Peak | Max RSS | MLX Peak | Total Time | Output |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | --- |
| Upstream source snapshot | 31.9 GiB | 10.6 GiB | 102.7 GiB | 13.7 GiB | 58.5 GiB | 216.2 s | [base-source.mp4](validation/ti2v5b-clean/base-source.mp4) |
| This BF16 package | 21.2 GiB | 10.6 GiB | 102.6 GiB | 14.5 GiB | 58.5 GiB | 261.6 s | [prepared-bf16.mp4](validation/ti2v5b-clean/prepared-bf16.mp4) |
| Mixed q8/BF16 package | 16.9 GiB | 6.3 GiB | 103.7 GiB | 13.8 GiB | 54.2 GiB | 243.4 s | [mixed-q8-bf16.mp4](validation/ti2v5b-clean/mixed-q8-bf16.mp4) |

The source and this BF16 package produced byte-identical decoded MP4 frames. The mixed q8/BF16 package stayed visually in the same family with mean frame MAE `1.66` versus source/BF16.

`Storage` is the Hugging Face repository total. `Logical Model` is the loaded Wan transformer plus VAE tensor footprint measured from MLX arrays. `Full-Process Physical Peak` is Darwin `phys_footprint` sampled from model initialization through MP4 save and health validation.

Validation assets:

- [contact-sheet.png](validation/ti2v5b-clean/contact-sheet.png)
- [metrics.json](validation/ti2v5b-clean/metrics.json)

## Usage

```bash
python -m pip install -U mlx-gen

mlxgen download --model AbstractFramework/wan2.2-ti2v-5b-diffusers-bf16

mlxgen generate \
  --model AbstractFramework/wan2.2-ti2v-5b-diffusers-bf16 \
  --prompt "A short cinematic video of a glowing orange glass sphere floating above calm teal water, soft reflections, gentle camera movement" \
  --negative-prompt "" \
  --width 1280 \
  --height 704 \
  --frames 17 \
  --steps 20 \
  --guidance 5 \
  --fps 24 \
  --seed 321 \
  --output video.mp4
```

TI2V-5B also supports first-frame image-to-video in MLX-Gen when one input image is supplied.

## Attribution

MLX-Gen is based on [mflux](https://github.com/filipstrand/mflux) by Filip Strand and the original mflux contributors.

Prepared and contributed by [@lpalbou](https://huggingface.co/lpalbou).
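
## Frame MAE Sketch

The Measurements section reports a mean frame MAE between the mixed q8/BF16 output and the source/BF16 output. This card does not state how that figure was computed (pixel scale, color space, or decoding path), so the following is only a minimal sketch of one plausible definition: the per-frame mean absolute pixel difference, averaged over all frames. The function names, the flattened-pixel frame representation, and the toy values are all hypothetical, and decoding MP4 files into frames is left out.

```python
from typing import Sequence

# Assumption: each frame is a flat sequence of 8-bit pixel values (0-255).
Frame = Sequence[int]


def frame_mae(a: Frame, b: Frame) -> float:
    """Mean absolute difference between two frames of equal size."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixel values")
    if not a:
        raise ValueError("frames must not be empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def mean_frame_mae(video_a: Sequence[Frame], video_b: Sequence[Frame]) -> float:
    """Average the per-frame MAE over all aligned frame pairs."""
    if len(video_a) != len(video_b):
        raise ValueError("videos must have the same number of frames")
    if not video_a:
        raise ValueError("videos must contain at least one frame")
    per_frame = [frame_mae(a, b) for a, b in zip(video_a, video_b)]
    return sum(per_frame) / len(per_frame)


# Toy data (hypothetical): two "videos" of two 4-pixel frames each.
reference = [[0, 10, 20, 30], [40, 50, 60, 70]]
candidate = [[1, 10, 22, 30], [40, 47, 60, 71]]

print(mean_frame_mae(reference, candidate))  # 0.875
print(mean_frame_mae(reference, reference))  # 0.0 for identical frames
```

Under this definition, byte-identical decoded frames give exactly `0.0`, which matches how the source and BF16 outputs are described above.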