Firworks/Step-Audio-R1-nvfp4

8-bit precision

compressed-tensors


Firworks committed on Nov 30, 2025

Commit 130497c · verified · 1 Parent(s): 935276f

Update README.md

Files changed (1)

README.md +0 -13


@@ -42,25 +42,12 @@ It does not generate audio; it produces text based on audio input.
 # 📦 What This Quantized Version Enables
 This NVFP4 quantized version reduces memory requirements significantly:
 Size: ~22 GB (down from ~67 GB)
 Should fit comfortably on a single RTX 5090
-Fully compatible with vLLM (including streaming text output)
 Preserves most reasoning performance from the BF16 release
 Because of this, anyone with a high-end consumer GPU can experiment with advanced audio reasoning locally.
-🖥 Supported Audio Behavior
-The model supports:
-✔ Streaming text output through vLLM
-✔ Reading uploaded audio files (WAV/MP3/etc) via ffmpeg
-✘ It does not synthesize audio
-✘ It does not require pre-burned waveforms — any user-provided audio file works
 Check the original model card for more information about this model.
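
As a rough sanity check on the size claims in the README text above, the arithmetic can be sketched as follows. The checkpoint sizes (~22 GB and ~67 GB) come from the README; the 32 GB VRAM figure for the RTX 5090 and the 4 GB runtime overhead allowance are assumptions for illustration, and `vram_headroom_gb` is a hypothetical helper, not part of any library.

```python
def vram_headroom_gb(model_gb: float, vram_gb: float, overhead_gb: float = 4.0) -> float:
    """Return the VRAM left over after loading the weights.

    overhead_gb is an assumed allowance for activations, KV cache, and
    runtime buffers; real usage depends on context length and batch size.
    """
    return vram_gb - model_gb - overhead_gb


BF16_GB = 67.0            # approximate BF16 checkpoint size, from the README
NVFP4_GB = 22.0           # approximate NVFP4 checkpoint size, from the README
RTX_5090_VRAM_GB = 32.0   # assumption: VRAM of a single RTX 5090

# Fraction of the original size that remains after quantization.
size_ratio = NVFP4_GB / BF16_GB

nvfp4_headroom = vram_headroom_gb(NVFP4_GB, RTX_5090_VRAM_GB)
bf16_headroom = vram_headroom_gb(BF16_GB, RTX_5090_VRAM_GB)

print(f"NVFP4 is {size_ratio:.0%} of the BF16 size")
print(f"NVFP4 headroom: {nvfp4_headroom:+.1f} GB")
print(f"BF16 headroom:  {bf16_headroom:+.1f} GB")
```

A positive headroom suggests the weights fit with room to spare under these assumptions; a negative value means the checkpoint would not fit on one card without offloading.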
