# Qwen3-8B-INT8
An INT8 (W8A8) quantized version of Qwen/Qwen3-8B, created with llm-compressor using calibrated quantization.
## Overview
| Property | Value |
|---|---|
| Base Model | Qwen/Qwen3-8B |
| Parameters | 8.19B |
| Quantization | INT8 (W8A8) |
| Format | compressed-tensors |
| Tool | llm-compressor |
| Disk Size | ~9.4 GB (2 shards) |
## Intended Use
A quantized text encoder for Flux 2 Klein 9B image-generation pipelines; it is architecturally identical to the Klein 9B text encoder.
## Quantization Details
- Scheme: W8A8 — 8-bit integer weights and activations
- Targets: all `Linear` layers (excluding `lm_head`)
- Calibration: 256 samples from C4, sequential pipeline with CPU offloading
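W8A8 means both weights and activations are stored as signed 8-bit integers, each paired with a floating-point scale. The sketch below illustrates the basic idea with symmetric per-tensor quantization in plain Python; it is a simplified model, not the exact scheme llm-compressor emits (which is defined by the compressed-tensors checkpoint format and may use per-channel scales):

```python
def quantize_int8(xs):
    """Symmetric per-tensor INT8 quantization: x ≈ q * scale."""
    scale = max(abs(x) for x in xs) / 127.0
    # Round to the nearest integer and clamp to the signed 8-bit range.
    q = [max(-128, min(127, round(x / scale))) for x in xs]
    return q, scale

def dequantize_int8(q, scale):
    """Recover an approximation of the original values."""
    return [v * scale for v in q]

w = [0.5, -1.27, 0.01, 1.27]
q, s = quantize_int8(w)       # q = [50, -127, 1, 127]
w_hat = dequantize_int8(q, s)
```

The reconstruction error per element is bounded by half the scale, which is why calibration (here, 256 C4 samples) matters: it picks scales that keep that error small on realistic activations.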
## Hardware Requirements
- Minimum: Any CUDA GPU with INT8 tensor core support
- Fallback: Dequantizes to BF16 on unsupported hardware
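On NVIDIA hardware, INT8 tensor-core matmul is generally available from Turing (compute capability 7.5) onward. A minimal dispatch sketch of the fallback logic described above, assuming that threshold and using a hypothetical helper (with PyTorch, the capability tuple would come from `torch.cuda.get_device_capability()`):

```python
def supports_int8_tensor_cores(capability):
    """Assumed threshold: INT8 tensor-core MMA from Turing (SM 7.5) onward."""
    return capability >= (7, 5)

def pick_compute_dtype(capability):
    # Run the W8A8 kernels where supported; otherwise dequantize to BF16.
    return "int8" if supports_int8_tensor_cores(capability) else "bfloat16"
```

Runtimes that consume compressed-tensors checkpoints perform an equivalent check at load time, so no manual configuration is needed for the BF16 fallback.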