---
license: other
license_name: tencent-hunyuan-community
license_link: https://huggingface.co/tencent/HunyuanImage-3.0/blob/main/LICENSE.txt
base_model: tencent/HunyuanImage-3.0-Instruct
pipeline_tag: text-to-image
library_name: transformers
tags:
- Hunyuan
- hunyuan
- quantization
- int8
- comfyui
- custom nodes
- autoregressive
- Dit
- HunyuanImage-3.0
- instruct
- image-editing
- bitsandbytes
---

# Hunyuan Image 3.0 Instruct — INT8 Quantized

An INT8 quantization of the HunyuanImage-3.0 Instruct model. It supports text-to-image, image editing, multi-image fusion, and Chain-of-Thought prompt enhancement (`recaption`/`think_recaption`).

## Key Features

- 🎯 **Instruct model** — supports text-to-image, image editing, multi-image fusion
- 🧠 **Chain-of-Thought** — built-in `think_recaption` mode for highest quality
- 💾 **INT8 quantized** — ~81 GB on disk
- ⚡ **50 diffusion steps** (full quality)
- 🔧 **ComfyUI ready** — works with [Comfy_HunyuanImage3](https://github.com/EricRollei/Comfy_HunyuanImage3) nodes

## VRAM Requirements

| Component | Memory |
|-----------|--------|
| Weight loading | ~80 GB |
| Inference (additional) | ~12-20 GB |
| **Total** | **~92-100 GB** |

**Recommended Hardware:**

- **NVIDIA RTX 6000 Blackwell (96GB)** — fits entirely in VRAM ✅
- **NVIDIA RTX 6000 Ada (48GB)** — requires CPU offloading
- Multi-GPU setups with 80GB+ combined VRAM

## Model Details

- **Architecture:** HunyuanImage-3.0 Mixture-of-Experts Diffusion Transformer
- **Parameters:** 80B total, 13B active per token (top-K MoE routing)
- **Variant:** Instruct (Full)
- **Quantization:** INT8 per-channel quantization via bitsandbytes
- **Diffusion Steps:** 50
- **Default Guidance Scale:** 2.5
- **Resolution:** Up to 2048x2048
- **Language:** English and Chinese prompts

## Quantization Details

**Layers quantized to INT8:**

- Feed-forward networks (FFN/MLP layers)
- Expert layers in MoE architecture (64 experts per layer)
- Large linear transformations

**Kept in full precision (BF16):**

- VAE encoder/decoder (critical for image quality)
- Attention projection layers (`q_proj`, `k_proj`, `v_proj`, `o_proj`)
- Patch embedding layers
- Time embedding layers
- Vision model (SigLIP2)
- Final output layers

## Usage

### ComfyUI (Recommended)

This model is designed to work with the [Comfy_HunyuanImage3](https://github.com/EricRollei/Comfy_HunyuanImage3) custom nodes:

```bash
cd ComfyUI/custom_nodes
git clone https://github.com/EricRollei/Comfy_HunyuanImage3
```

1. Download this model to your ComfyUI models directory
2. Use the **"Hunyuan 3 Instruct Loader"** node
3. Select this model folder and choose `int8` precision
4. Connect to the **"Hunyuan 3 Instruct Generate"** node for text-to-image
5. Or use **"Hunyuan 3 Instruct Edit"** for image editing
6. Or use **"Hunyuan 3 Instruct Multi-Fusion"** for combining multiple images

### Bot Task Modes

The Instruct model supports three generation modes:

| Mode | Description | Speed |
|------|-------------|-------|
| `image` | Direct text-to-image, prompt used as-is | Fastest |
| `recaption` | Model rewrites prompt into detailed description, then generates | Medium |
| `think_recaption` | CoT reasoning → prompt enhancement → generation (best quality) | Slowest |

## Original Model

This is a quantized derivative of [Tencent's HunyuanImage-3.0 Instruct](https://huggingface.co/tencent/HunyuanImage-3.0-Instruct).

- **Architecture:** Diffusion Transformer with Mixture-of-Experts
- **Resolution:** Up to 2048x2048
- **Language Support:** English and Chinese prompts
- **License:** [Tencent Hunyuan Community License](https://huggingface.co/tencent/HunyuanImage-3.0/blob/main/LICENSE.txt)

## Limitations

- Requires a high-end professional GPU (~92-100 GB VRAM)
- INT8 quantization may introduce minor quality differences in edge cases
- Model loading adds ~1-2 minutes of overhead to the first generation
- CoT/recaption modes require additional time for the text generation phase

## Credits

- **Original Model:** [Tencent Hunyuan Team](https://huggingface.co/tencent)
- **Quantization:** Eric Rollei
- **ComfyUI Integration:** [Comfy_HunyuanImage3](https://github.com/EricRollei/Comfy_HunyuanImage3)

## License

This model inherits the license from the original Hunyuan Image 3.0 model: [Tencent Hunyuan Community License](https://huggingface.co/tencent/HunyuanImage-3.0/blob/main/LICENSE.txt)

Please review the original license for commercial use restrictions and requirements.

## Citation

```bibtex
@misc{hunyuan-image-3-int8-instruct,
  author = {Rollei, Eric},
  title = {Hunyuan Image 3.0 Instruct — INT8 Quantized},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/EricRollei/HunyuanImage-3.0-Instruct-INT8}}
}
```
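
## Appendix: VRAM Budget Sketch

The memory figures in this card (~80 GB of weights plus ~12-20 GB during inference) can be turned into a quick check of whether a given GPU is likely to need offloading. The snippet below is a minimal sketch of that arithmetic only. The function name `vram_budget` and its return keys are hypothetical and are not part of the ComfyUI nodes or of any library; actual usage will vary with resolution and task mode.

```python
# Rough VRAM budget check using the approximate figures quoted in this card.
# All names here are illustrative; nothing is measured from a real GPU.

WEIGHTS_GB = 80.0             # ~80 GB of quantized weights
INFERENCE_GB = (12.0, 20.0)   # ~12-20 GB of additional memory during inference


def vram_budget(gpu_vram_gb: float) -> dict:
    """Compare a GPU's VRAM against the card's low and high total estimates."""
    total_low = WEIGHTS_GB + INFERENCE_GB[0]
    total_high = WEIGHTS_GB + INFERENCE_GB[1]
    return {
        "total_low_gb": total_low,
        "total_high_gb": total_high,
        "fits_low_estimate": gpu_vram_gb >= total_low,
        "fits_high_estimate": gpu_vram_gb >= total_high,
        # Memory that would have to live outside the GPU in the high estimate.
        "offload_high_estimate_gb": max(0.0, total_high - gpu_vram_gb),
    }


if __name__ == "__main__":
    for label, vram in [("96 GB GPU", 96.0), ("48 GB GPU", 48.0)]:
        print(label, vram_budget(vram))
```

By these figures, a 96 GB card covers the low end of the estimated range with about 4 GB to spare and falls about 4 GB short at the high end, while a 48 GB card would need to offload roughly half of the total or more.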