This model is a quantized version optimized for efficient inference:

- **Transformer**: Quantized with TorchAO int8 weight-only (`int8wo`) quantization, significantly reducing model size while maintaining generation quality.
- **Text Encoder**: Replaced with `unsloth/Qwen3-4B-unsloth-bnb-4bit`, a 4-bit quantized version that further reduces memory requirements.
- **Memory Usage**: Peak VRAM consumption is approximately **9 GB**.
- **Performance**: Generates an image in approximately **0.1 seconds** (4 steps) on an RTX 5090 GPU.
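To make the `int8wo` scheme concrete, here is a minimal NumPy sketch of per-output-channel int8 weight-only quantization. It illustrates the general idea only — it is not TorchAO's actual implementation, and the function names (`quantize_int8_weight_only`, `dequant_matmul`) are invented for this example.

```python
import numpy as np

def quantize_int8_weight_only(w):
    # Symmetric per-output-channel quantization:
    # each row of the weight matrix gets its own scale.
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequant_matmul(x, q, scale):
    # Weight-only: activations stay in float; int8 weights are
    # dequantized (q * scale) before the matmul.
    return x @ (q.astype(np.float32) * scale).T

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8)).astype(np.float32)  # toy weight matrix
x = rng.normal(size=(2, 8)).astype(np.float32)  # toy activations

q, s = quantize_int8_weight_only(w)
err = np.abs(x @ w.T - dequant_matmul(x, q, s)).max()
print(q.dtype, err)  # int8 storage, small reconstruction error
```

Storing `q` as int8 plus one float scale per row is what cuts the memory footprint to roughly a quarter of fp32 (half of bf16), while the dequantized matmul keeps outputs close to the full-precision result.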