Firworks/Step-Audio-R1-nvfp4

8-bit precision

compressed-tensors


Firworks committed on Nov 30, 2025

Commit 935276f (verified) · 1 parent: 906942f

Update README.md

Files changed (1)

README.md +2 -2

@@ -45,7 +45,7 @@ This NVFP4 quantized version reduces memory requirements significantly:
 Size: ~22 GB (down from ~67 GB)
-Runs comfortably on a single RTX 5090
+Should fit comfortably on a single RTX 5090
 Fully compatible with vLLM (including streaming text output)
@@ -82,7 +82,7 @@ docker run --rm -ti --gpus all \
     --trust-remote-code
 ```
-This example script should allow an audio wave full to be streamed to the model and get a response based on the prompt.
+This example script should allow an audio wave file to be streamed to the model and get a response based on the prompt.
 ```py
 import requests
 import base64