Spaces: gyrmo/CitizenClimate

gyrmo committed on Feb 26

Commit 9b002ee (verified) · 1 Parent(s): 5788d26

I now have more GPU available, so I have reduced the GPU memory utilisation from 0.9 to 0.8

Files changed (1)

vllm_server.py

@@ -34,7 +34,7 @@ def start_vllm():
         "--model", model_name,
         "--port", "8000",
         "--host", "0.0.0.0",
-        "--gpu-memory-utilization", "0.9",
+        "--gpu-memory-utilization", "0.8",
         "--max-model-len", "4096",
         "--max-num-seqs", "8",
         "--swap-space", "4",
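
Since this value may need changing again whenever the hardware changes, one option is to read it from configuration instead of hard-coding it. The following is only a minimal sketch, not the code in `vllm_server.py`: the helper `build_vllm_args` and the environment variable `VLLM_GPU_MEMORY_UTILIZATION` are hypothetical names introduced for illustration, and the remaining flags are copied from the hunk above.

```python
import os


def build_vllm_args(model_name, gpu_memory_utilization=None):
    """Return the vLLM server flags shown in the diff (hypothetical helper).

    If no fraction is passed, fall back to the hypothetical environment
    variable VLLM_GPU_MEMORY_UTILIZATION, defaulting to 0.8 as in this commit.
    """
    if gpu_memory_utilization is None:
        gpu_memory_utilization = float(
            os.environ.get("VLLM_GPU_MEMORY_UTILIZATION", "0.8")
        )

    # The flag is a fraction of GPU memory, so reject values outside (0, 1].
    if not 0.0 < gpu_memory_utilization <= 1.0:
        raise ValueError(
            f"gpu_memory_utilization must be in (0, 1], got {gpu_memory_utilization}"
        )

    return [
        "--model", model_name,
        "--port", "8000",
        "--host", "0.0.0.0",
        "--gpu-memory-utilization", str(gpu_memory_utilization),
        "--max-model-len", "4096",
        "--max-num-seqs", "8",
        "--swap-space", "4",
    ]
```

With a helper like this, a future change such as 0.8 to 0.85 would be a Space setting rather than a code commit; whether that trade-off is worth it for a single-value change is a judgment call.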