---
license: other
license_name: tencent-hunyuan-community
license_link: LICENSE
base_model: tencent/HunyuanImage-3.0
tags:
- image-generation
- int8
- quantized
- bitsandbytes
- hunyuan
- text-to-image
library_name: transformers
pipeline_tag: text-to-image
---

# HunyuanImage-3 Base INT8

INT8 quantized version of [tencent/HunyuanImage-3.0](https://huggingface.co/tencent/HunyuanImage-3.0) using bitsandbytes. Reduces the model size from ~160 GB (BF16) to ~81 GB while maintaining quality.

## Model Details

- **Architecture**: ~80B-parameter Mixture-of-Experts (MoE) with 64 experts, top-8 routing
- **Quantization**: INT8 via bitsandbytes `Linear8bitLt` on transformer linear layers
- **Original precision**: BF16 → INT8 (VAE, vision model, and embeddings remain in full precision)
- **Variant**: Base (text-to-image only, 20 diffusion steps, no classifier-free guidance)

### Quality Notes

INT8 quantization preserves the model's strengths remarkably well: generated images feature correct anatomy and proper finger counts, with strong resistance to extra limbs and other common AI artifacts. The Base INT8 variant performs particularly well on a 96 GB Blackwell GPU (~4 minutes per image at 1024x1024).
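The headline size numbers can be sanity-checked with back-of-the-envelope arithmetic: BF16 stores two bytes per weight and INT8 one, so roughly halving the checkpoint is expected, with the small remainder (~1 GB here) coming from the modules kept in full precision. A minimal sketch (the parameter count is inferred from the ~160 GB BF16 size, not read from the checkpoint):

```python
def estimate_size_gb(n_params: float, bytes_per_param: float) -> float:
    """Back-of-the-envelope checkpoint size: parameters x bytes per weight."""
    return n_params * bytes_per_param / 1e9

# ~160 GB at BF16 (2 bytes/weight) implies roughly 80e9 weights.
n_params = 160e9 / 2

print(estimate_size_gb(n_params, 2))  # BF16 baseline: 160.0 GB
print(estimate_size_gb(n_params, 1))  # INT8, before full-precision exclusions: 80.0 GB
```

The observed ~81 GB on disk is consistent with this estimate once the excluded VAE, vision, and embedding modules are kept at their original precision.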
## Usage

### With the generation scripts

The easiest way to use this model is with the companion [generation scripts](https://github.com/jamesw767/hunyuan-image-int8):

```bash
git clone https://github.com/jamesw767/hunyuan-image-int8.git
cd hunyuan-image-int8
pip install -r requirements.txt

# Download this model
huggingface-cli download jamesw767/HunyuanImage-3-Base-INT8 \
  --local-dir ./HunyuanImage-3-Base-INT8

# Generate
python generate.py \
  --model-path ./HunyuanImage-3-Base-INT8 \
  --prompt "A red fox sitting in autumn leaves, realistic photography"
```

### Direct loading with transformers

```python
import torch
from transformers import AutoModelForCausalLM

model_path = "jamesw767/HunyuanImage-3-Base-INT8"
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
```

**Note**: Direct loading requires the exception-based memory management trick to handle VAE decode; see the [generation scripts repo](https://github.com/jamesw767/hunyuan-image-int8) for the full pipeline.

## How It Was Made

The INT8 weights were created using `save_quantized.py` from the generation scripts:

1. Load the BF16 model with `BitsAndBytesConfig(load_in_8bit=True)`
2. Extract the quantized state dict, resolving meta tensors from accelerate's CPU offload hooks
3. Save as sharded safetensors (5 GB per shard, 17 shards total)

Modules excluded from INT8 quantization (kept in original precision): `vae`, `vision_model`, `vision_aligner`, `patch_embed`, `final_layer`, `time_embed`, `time_embed_2`, `timestep_emb`, `guidance_emb`, `timestep_r_emb`, `lm_head`, `model.wte`, `model.ln_f`

## GPU Requirements

- **96 GB VRAM recommended**: RTX PRO 6000 Blackwell, A100 80GB+, H100
- **48 GB+ VRAM**: may work with aggressive CPU offloading via `--gpu-budget` / `--cpu-budget`
- **System RAM**: 64 GB+ recommended (offloaded layers use CPU memory)

During diffusion, the KV cache and MoE activations expand to ~80 GB regardless of where the model weights are placed. The generation scripts use an exception-based stack-unwinding trick to free this memory before VAE decode.

## Differences from Instruct/Distil Variants

|                   | Base   | Instruct | Instruct-Distil |
|-------------------|--------|----------|-----------------|
| Steps             | 20     | 50       | 8               |
| CFG               | No     | Yes      | No              |
| Chat format       | No     | Yes      | Yes             |
| Speed (96 GB GPU) | ~4 min | ~13 min  | ~90 s           |

## Other INT8 Models

- [jamesw767/HunyuanImage-3-Instruct-INT8](https://huggingface.co/jamesw767/HunyuanImage-3-Instruct-INT8): full Instruct, 50 steps, highest quality
- [jamesw767/HunyuanImage-3-Instruct-Distil-INT8](https://huggingface.co/jamesw767/HunyuanImage-3-Instruct-Distil-INT8): distilled, 8 steps, fastest

## License

This model is a derivative of [tencent/HunyuanImage-3.0](https://huggingface.co/tencent/HunyuanImage-3.0), released under the [Tencent Hunyuan Community License](LICENSE).

**Important**: This license does not apply in the European Union, United Kingdom, or South Korea.

Tencent Hunyuan is licensed under the Tencent Hunyuan Community License Agreement, Copyright (c) 2025 Tencent. All Rights Reserved. The trademark rights of "Tencent Hunyuan" are owned by Tencent or its affiliate.
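The exclusion list above amounts to a name filter over the model's modules: any linear layer whose dotted path falls under one of the listed prefixes keeps its original precision. A hedged sketch of such a filter (illustrative only; this is not the actual `save_quantized.py` code, and bitsandbytes' own `llm_int8_skip_modules` matching may differ in detail):

```python
# Modules kept in original precision, per the exclusion list above.
SKIP_MODULES = [
    "vae", "vision_model", "vision_aligner", "patch_embed", "final_layer",
    "time_embed", "time_embed_2", "timestep_emb", "guidance_emb",
    "timestep_r_emb", "lm_head", "model.wte", "model.ln_f",
]

def should_quantize(module_name: str) -> bool:
    """True if a linear layer with this dotted name would receive INT8 weights.

    A module is skipped if its name equals an entry, sits under an entry's
    subtree, or contains an entry as a path component.
    """
    parts = module_name.split(".")
    for skip in SKIP_MODULES:
        if module_name == skip or module_name.startswith(skip + ".") or skip in parts:
            return False
    return True
```

For example, a transformer expert projection like `model.layers.0.mlp.experts.0.gate_proj` passes the filter and is quantized, while `vae.decoder.conv_in`, `lm_head`, and `model.wte` are all left in full precision.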
## Credits

- Original model by [Tencent Hunyuan](https://github.com/Tencent-Hunyuan/HunyuanImage-3.0)
- INT8 quantization and generation scripts by [jamesw767](https://github.com/jamesw767/hunyuan-image-int8)