Update README.md

README.md CHANGED

```diff
@@ -12,7 +12,7 @@ base_model:
 
 > Notes: Keep `lm_head` in high precision; calibrate on long, domain-relevant sequences.
 
-📘 About This Model
+# 📘 About This Model
 
 This is a quantized NVFP4 (W4A4) version of Step-Audio-R1, an open-weights Qwen2-Audio–based multimodal model for audio understanding and reasoning.
 The original BF16 model requires ~67 GB VRAM.
@@ -39,7 +39,7 @@ High-quality step-by-step reasoning about audio inputs
 
 It does not generate audio; it produces text based on audio input.
 
-📦 What This Quantized Version Enables
+# 📦 What This Quantized Version Enables
 
 This NVFP4 quantized version reduces memory requirements significantly:
```
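The memory-reduction claim in the second hunk can be sanity-checked with back-of-the-envelope arithmetic. NVFP4 stores 4-bit (E2M1) values with one FP8 scale per 16-element block, so weights cost roughly 4.5 bits per parameter versus 16 for BF16. A minimal sketch; the ~33.5B parameter count is inferred from the ~67 GB BF16 figure (2 bytes/param), not stated in the README:

```python
def weight_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (10^9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

# NVFP4: 4-bit codes plus one 8-bit (FP8) scale per 16-element block;
# the additional per-tensor FP32 scale is negligible and ignored here.
NVFP4_BITS = 4 + 8 / 16          # = 4.5 bits per parameter

n_params = 33.5e9                # inferred from ~67 GB at BF16 (2 bytes/param)

bf16 = weight_gb(n_params, 16)
nvfp4 = weight_gb(n_params, NVFP4_BITS)

print(f"BF16:  ~{bf16:.0f} GB")   # matches the README's ~67 GB figure
print(f"NVFP4: ~{nvfp4:.0f} GB")  # ~3.6x smaller, weights only
```

This covers weights only; activations and KV cache add to the footprint at runtime, so actual VRAM savings will be somewhat smaller than the raw 16/4.5 ratio.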