---
license: other
license_name: tencent-hunyuan-community
license_link: https://huggingface.co/tencent/HunyuanImage-3.0/blob/main/LICENSE.txt
tags:
  - text-to-image
  - hunyuan
  - quantization
  - int8
  - comfyui
  - custom-nodes
  - autoregressive
  - DiT
  - Hunyuan-Image-3
pipeline_tag: text-to-image
---

# Hunyuan Image 3.0 - INT8 Quantized

This is an **INT8 quantized version** of Tencent's [HunyuanImage-3.0](https://huggingface.co/tencent/HunyuanImage-3.0) model, optimized for high-end GPU workflows without CPU offloading.

## Model Description

INT8 quantization of the Hunyuan Image 3.0 text-to-image transformer, striking a balance between full BF16 precision and the more aggressive NF4 quantization. It preserves near-BF16 image quality while roughly halving the memory needed for weights.

**Key Features:**
- 🎯 High quality output comparable to BF16
- 💾 ~80 GB VRAM for weights (~92-100 GB total; fits a 96 GB RTX 6000 Blackwell)
- ⚡ ~3.5 minutes generation time at base resolution
- 🔧 Designed for ComfyUI workflows

## VRAM Requirements

| Phase | VRAM Usage |
|-------|------------|
| Weight Loading | ~80 GB |
| Inference (additional) | ~12-20 GB |
| **Total** | **~92-100 GB** |

**Recommended Hardware:**
- NVIDIA RTX 6000 Ada (48GB) - requires model splitting or offloading
- NVIDIA RTX 6000 Blackwell (96GB) - fits entirely in VRAM ✅ (example workflows are on the GitHub page)
- Multi-GPU setups with 80GB+ combined VRAM

## Usage

### ComfyUI (Recommended)

This model is designed to work with the [Comfy_HunyuanImage3](https://github.com/EricRollei/Comfy_HunyuanImage3) custom nodes:
```bash
cd ComfyUI/custom_nodes
git clone https://github.com/EricRollei/Comfy_HunyuanImage3
```

Install the nodes and download this model to your ComfyUI models directory. The nodes handle INT8 loading automatically.

### Direct Usage
```python
# INT8 weights can be loaded with standard torch quantization
# See the ComfyUI nodes for reference implementation
```
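As a rough illustration of what such loading involves, here is a minimal sketch of symmetric per-channel INT8 quantization and dequantization in NumPy. This is not the nodes' actual implementation; the function names and the choice of one scale per output channel are assumptions for illustration.

```python
import numpy as np

def quantize_per_channel(w: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Symmetric per-output-channel INT8 quantization of a 2-D weight matrix.

    Illustrative sketch only; not the actual quantization code for this model.
    """
    # One scale per output channel (row), chosen so the max |value| maps to 127.
    scales = np.abs(w).max(axis=1, keepdims=True) / 127.0
    scales = np.where(scales == 0, 1.0, scales)  # avoid divide-by-zero on all-zero rows
    q = np.clip(np.round(w / scales), -127, 127).astype(np.int8)
    return q, scales.astype(np.float32)

def dequantize_per_channel(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Recover an approximate float weight matrix from INT8 values and per-channel scales."""
    return q.astype(np.float32) * scales

rng = np.random.default_rng(0)
w = rng.standard_normal((16, 64)).astype(np.float32)
q, s = quantize_per_channel(w)
w_hat = dequantize_per_channel(q, s)
# INT8 storage is 1 byte per weight vs 2 for BF16; the per-channel scales add
# only one float per output channel, and reconstruction error stays small.
print(q.dtype, float(np.abs(w - w_hat).max()))
```

At load time the reverse direction is what matters: the stored `int8` tensors and their scales are combined back into floating-point (or consumed directly by INT8 kernels), which is what the ComfyUI nodes handle for you.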

## Performance

- **Generation Time**: ~3.5 minutes for base resolution (1024x1024)
- **Weight Loading**: ~60 seconds (one-time per session)
- **Quality**: Excellent - minimal degradation from BF16
- **Speed**: Inference can be faster than BF16 thanks to the lower memory-bandwidth demand of 8-bit weights

## Quantization Details

- **Method**: INT8 per-channel quantization
- **Target**: Hunyuan Image 3.0 transformer backbone
- **Precision Loss**: Minimal - image quality remains high
- **Trade-off**: Middle ground between NF4 (lower quality) and BF16 (highest VRAM)
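The trade-off is easy to see with back-of-envelope arithmetic. Assuming roughly 80 billion parameters (an assumption inferred from the ~80 GB INT8 weight footprint above, not a figure from this card), the weight memory at each precision is:

```python
# Back-of-envelope weight memory for an ~80B-parameter model. The parameter
# count is an assumption inferred from the ~80 GB INT8 footprint stated above.
PARAMS = 80e9

bytes_per_weight = {
    "BF16": 2.0,  # 16-bit floats
    "INT8": 1.0,  # 8-bit integers; per-channel scale overhead ignored here
    "NF4": 0.5,   # 4-bit normal-float; block-scale overhead ignored here
}

for name, b in bytes_per_weight.items():
    gb = PARAMS * b / 1e9
    print(f"{name}: ~{gb:.0f} GB of weights")
```

This reproduces the numbers in the table above: ~160 GB for BF16, ~80 GB for INT8, and ~40 GB for NF4, before activation and KV-cache memory.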

## Original Model

This is a quantized derivative of [Tencent's HunyuanImage-3.0](https://huggingface.co/tencent/HunyuanImage-3.0).

**Original Model Details:**
- Architecture: Autoregressive multimodal transformer (Mixture-of-Experts)
- Resolution: Up to 2048x2048
- Language Support: English and Chinese prompts
- License: [Tencent Hunyuan Community License](https://huggingface.co/tencent/HunyuanImage-3.0/blob/main/LICENSE.txt)

Please review the original model card and license for full details on capabilities and restrictions.

## Limitations

- Requires high-end professional GPU (80GB+ VRAM)
- Not suitable for consumer GPUs (4090, 5090) without further optimization
- INT8 quantization may introduce minor quality differences in edge cases
- Loading time adds ~1 minute overhead to first generation

## Credits

- **Original Model**: [Tencent Hunyuan Team](https://huggingface.co/tencent)
- **Quantization**: Eric Rollei
- **ComfyUI Integration**: [Comfy_HunyuanImage3](https://github.com/EricRollei/Comfy_HunyuanImage3)

## License

This model inherits the license from the original Hunyuan Image 3.0 model:
- **License**: [Tencent Hunyuan Community License](https://huggingface.co/tencent/HunyuanImage-3.0/blob/main/LICENSE.txt)
- Please review the original license for commercial use restrictions and requirements

## Citation
```bibtex
@misc{hunyuan-image-3-int8,
  author = {Rollei, Eric},
  title = {Hunyuan Image 3.0 INT8 Quantized},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/[YOUR_USERNAME]/[MODEL_NAME]}}
}
```

Original model citation:
```bibtex
@misc{tencent2024hunyuan,
  title={Hunyuan Image 3.0},
  author={Tencent Hunyuan Team},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/tencent/HunyuanImage-3.0}}
}
```