Update README.md

README.md CHANGED

```diff
@@ -12,7 +12,7 @@ base_model:
 
 > Notes: Keep `lm_head` in high precision; calibrate on long, domain-relevant sequences.
 
-📘 About This Model
+# 📘 About This Model
 
 This is a quantized NVFP4 (W4A4) version of Step-Audio-R1, an open-weights Qwen2-Audio–based multimodal model for audio understanding and reasoning.
 The original BF16 model requires ~67 GB VRAM.
@@ -39,7 +39,7 @@ High-quality step-by-step reasoning about audio inputs
 
 It does not generate audio; it produces text based on audio input.
 
-📦 What This Quantized Version Enables
+# 📦 What This Quantized Version Enables
 
 This NVFP4 quantized version reduces memory requirements significantly:
```
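The memory-reduction claim in the second hunk can be sanity-checked with back-of-the-envelope arithmetic. NVFP4 stores 4-bit (E2M1) values with one FP8 scale per 16-element block, so weights cost roughly 4.5 bits per parameter versus 16 for BF16. A minimal sketch; the ~33.5B parameter count is inferred from the ~67 GB BF16 figure (2 bytes/param), not stated in the README:

```python
def weight_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (10^9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

# NVFP4: 4-bit codes plus one 8-bit (FP8) scale per 16-element block;
# the additional per-tensor FP32 scale is negligible and ignored here.
NVFP4_BITS = 4 + 8 / 16          # = 4.5 bits per parameter

n_params = 33.5e9                # inferred from ~67 GB at BF16 (2 bytes/param)

bf16 = weight_gb(n_params, 16)
nvfp4 = weight_gb(n_params, NVFP4_BITS)

print(f"BF16:  ~{bf16:.0f} GB")   # matches the README's ~67 GB figure
print(f"NVFP4: ~{nvfp4:.0f} GB")  # ~3.6x smaller, weights only
```

This covers weights only; activations and KV cache add to the footprint at runtime, so actual VRAM savings will be somewhat smaller than the raw 16/4.5 ratio.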