Firworks committed on
Commit 935276f · verified · 1 Parent(s): 906942f

Update README.md

  1. README.md +2 -2
README.md CHANGED
@@ -45,7 +45,7 @@ This NVFP4 quantized version reduces memory requirements significantly:

  Size: ~22 GB (down from ~67 GB)

- Runs comfortably on a single RTX 5090
+ Should fit comfortably on a single RTX 5090

  Fully compatible with vLLM (including streaming text output)

@@ -82,7 +82,7 @@ docker run --rm -ti --gpus all \
  --trust-remote-code
  ```

- This example script should allow an audio wave full to be streamed to the model and get a response based on the prompt.
+ This example script should allow an audio wave file to be streamed to the model and get a response based on the prompt.
  ```py
  import requests
  import base64