Update README.md
Browse files
README.md
CHANGED
```diff
@@ -18,8 +18,8 @@ base_model:
 - **Input:** Text, Image
 - **Output:** Text
 - **Model Optimizations:**
-  - **Weight quantization:**
-  - **Activation quantization:**
+  - **Weight quantization:** FP4
+  - **Activation quantization:** FP4
 - **Release Date:**
 - **Version:** 1.0
 - **Model Developers:** Red Hat
@@ -29,7 +29,7 @@ Quantized version of [Qwen/Qwen3-VL-32B-Instruct](https://huggingface.co/Qw
 ### Model Optimizations
 
 This model was obtained by quantizing the weights and activations of [Qwen/Qwen3-VL-32B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-32B-Instruct) to the FP4 data type.
-This optimization reduces the number of bits per parameter from 16 to
+This optimization reduces the number of bits per parameter from 16 to 4, reducing the disk size and GPU memory requirements by approximately 75%.
 Only the weights and activations of the linear operators within the transformer blocks of the language model are quantized.
 
 
```
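The "approximately 75%" figure in the diff can be checked directly: going from 16-bit to 4-bit parameters cuts per-parameter storage by a factor of four. A back-of-the-envelope sketch, assuming the ~32B parameter count implied by the model name (real checkpoints also store quantization scales and leave some layers unquantized, so actual sizes are somewhat larger):

```python
# Idealized memory estimate for quantizing a ~32B-parameter model
# from 16-bit (BF16/FP16) to 4-bit (FP4) weights. Real FP4 checkpoints
# also carry per-group scales and keep some layers (embeddings, vision
# tower) unquantized, so actual sizes are a bit larger than this.

def model_size_gb(n_params: float, bits_per_param: float) -> float:
    """Idealized checkpoint size in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

n_params = 32e9                         # ~32B parameters, per the model name
bf16 = model_size_gb(n_params, 16)      # 64.0 GB
fp4 = model_size_gb(n_params, 4)        # 16.0 GB
reduction = 1 - fp4 / bf16              # 0.75 -> "approximately 75%"
print(f"BF16: {bf16:.0f} GB, FP4: {fp4:.0f} GB, saved: {reduction:.0%}")
# prints: BF16: 64 GB, FP4: 16 GB, saved: 75%
```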
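For intuition about what "quantizing ... to the FP4 data type" means: FP4 (E2M1) has only 16 representable values (the magnitudes 0, 0.5, 1, 1.5, 2, 3, 4, 6, each with a sign bit), so every weight must be rounded to one of them after scaling. The toy sketch below illustrates round-to-nearest FP4 with a single per-tensor scale; it is NOT the recipe used to produce this checkpoint, which would typically use per-group scales and calibration:

```python
# Toy FP4 (E2M1) quantization: scale a tensor so its largest magnitude
# maps to 6 (the largest FP4 magnitude), then round each element to the
# nearest representable FP4 value. Illustration only, not the actual
# quantization recipe used for the released checkpoint.

FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # E2M1 magnitudes

def quantize_fp4(weights: list[float]) -> tuple[list[float], float]:
    """Return the dequantized weights and the per-tensor scale used."""
    amax = max(abs(w) for w in weights) or 1.0
    scale = amax / 6.0                   # map the largest magnitude to 6
    out = []
    for w in weights:
        mag = min(FP4_GRID, key=lambda g: abs(abs(w) / scale - g))
        out.append(mag * scale * (1.0 if w >= 0 else -1.0))
    return out, scale

# The largest-magnitude value (-1.2) survives exactly; the others are
# rounded to the nearest point on the scaled 8-value grid.
deq, scale = quantize_fp4([0.3, -1.2, 0.06, 0.9])
```

Activation quantization works the same way, except the scale must be chosen at (or calibrated ahead of) inference time, since activations are not known in advance.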