Tonera committed
Commit ed078b8 · verified · 1 Parent(s): ae4c1a2

Update README.md

Files changed (1):
  1. README.md +1 -1
README.md CHANGED
@@ -24,7 +24,7 @@ For more information, please read our [blog post](https://bfl.ai/blog/flux2-klei
 
 This model is a quantized version optimized for efficient inference:
 
-- **Transformer**: Quantized using TorchAo fp8 (float8wo) quantization, significantly reducing model size while maintaining generation quality.
+- **Transformer**: Quantized using TorchAo int8 (int8wo) quantization, significantly reducing model size while maintaining generation quality.
 - **Text Encoder**: Replaced with `unsloth/Qwen3-4B-unsloth-bnb-4bit`, a 4-bit quantized version that further reduces memory requirements.
 - **Memory Usage**: Peak VRAM consumption is approximately **9GB**.
 - **Performance**: Generates images in approximately **0.1 seconds** (4 steps) on RTX 5090 GPUs.