### 🚀 WanVideo Model Suite

**Combined & Quantized Models for ComfyUI Workflows**

*Derived from `Wan-AI/Wan2.1-VACE-14B`*

---

## 📋 Overview

This repository provides optimized models for [**WanVideo**](https://github.com/kijai/ComfyUI-WanVideoWrapper), a high-fidelity video generation framework. Models are quantized to balance performance and resource efficiency while retaining visual quality.

Designed for seamless integration with ComfyUI via:

- **[WanVideo Wrapper](https://github.com/kijai/ComfyUI-WanVideoWrapper)** (Third-party extension)
- Native **WanVideo nodes** in ComfyUI

---

## 🔧 Key Components

### 1. **Core Diffusion Models**

| File | Precision | Description |
|------|-----------|-------------|
| `wan2.1_i2v_720p_14B_fp8_e4m3fn.safetensors` | Quantized (FP8) | Base video generation model (14B params, 720p). |
| `fantasytalking_fp16.safetensors` | FP16 | Specialized model for expressive dialogue animation. |

### 2. **Text & Vision Encoders**

| File | Type | Role |
|------|------|------|
| `umt5-xxl-enc-bf16.safetensors` | Text Encoder (UMT5-XXL) | BF16 precision for multilingual text understanding. |
| `clip_vision_h.safetensors` | Vision Encoder | Processes visual inputs for conditional generation. |

---

## 📁 ComfyUI Setup Guide

Place files in these directories within your ComfyUI installation:

```text
models/
├── diffusion_models/
│   ├── wan2.1_i2v_720p_14B_fp8_e4m3fn.safetensors
│   └── fantasytalking_fp16.safetensors
├── clip_vision/
│   └── clip_vision_h.safetensors
└── text_encoders/
    └── umt5-xxl-enc-bf16.safetensors
```

---

## 🔗 Dependencies & Resources

1. **Vision Encoder Resources**
   - Download `clip_vision_h.safetensors` from: [Comfy-Org/Wan_2.1_ComfyUI_repackaged](https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/tree/main/split_files/clip_vision)
2. **FantasyTalking Model**
   - Source code & usage: [GitHub Repository](https://github.com/Fantasy-AMAP/fantasy-talking)
3. **Base Model**
   - Full precision version: [Wan-AI/Wan2.1-VACE-14B](https://huggingface.co/Wan-AI/Wan2.1-VACE-14B)

---

## 💡 Usage Notes

- **Quantization Benefits**: FP8 reduces VRAM usage by ~50% vs FP16, enabling 720p generation on consumer GPUs.
- **Workflow Compatibility**: Combine with `Text-to-Video`, `Image-to-Video`, or `FantasyTalking` nodes in ComfyUI.
- **Multi-Modal Inputs**: UMT5-XXL encoder supports multilingual prompts (e.g., English, Chinese).

---

## ⚖️ License

*Inherited from parent models ([Check Wan-AI License](https://huggingface.co/Wan-AI/Wan2.1-VACE-14B)). Non-commercial/research use recommended pending verification.*

---

**✨ Pro Tip**: For optimal results, pair with WanVideo's temporal consistency modules to reduce frame flickering in long sequences.

---

*Model Card curated by the ComfyUI community. Maintained for reproducibility and ease of deployment.*
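
As a quick sanity check on the directory layout from the setup guide, the minimal sketch below lists any expected model file that is not yet in place. The helper name `find_missing_models`, the `EXPECTED_FILES` mapping, and the `ComfyUI/models` path are illustrative assumptions; they are not part of ComfyUI or the WanVideo Wrapper.

```python
from pathlib import Path

# Expected layout, copied from the setup guide: subdirectory -> file names.
EXPECTED_FILES = {
    "diffusion_models": [
        "wan2.1_i2v_720p_14B_fp8_e4m3fn.safetensors",
        "fantasytalking_fp16.safetensors",
    ],
    "clip_vision": ["clip_vision_h.safetensors"],
    "text_encoders": ["umt5-xxl-enc-bf16.safetensors"],
}


def find_missing_models(models_dir):
    """Return paths of expected model files that are absent under models_dir."""
    root = Path(models_dir)
    missing = []
    for subdir, names in EXPECTED_FILES.items():
        for name in names:
            path = root / subdir / name
            if not path.is_file():
                missing.append(path)
    return missing


if __name__ == "__main__":
    # Hypothetical install location; point this at your own ComfyUI models folder.
    for path in find_missing_models("ComfyUI/models"):
        print(f"missing: {path}")
```

An empty result means all four files sit where the setup guide expects them; the script only checks file presence, not file integrity or version.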