Spaces:
Paused
Paused
Peter Larnholt
committed on
Commit
·
65659d1
1
Parent(s):
fa4aba4
Upgrade vLLM to 0.6.4.post1 and remove explicit outlines dependencies
Browse files
- Upgrade from 0.6.3.post1 to 0.6.4.post1 for bug fixes and stability
- Remove explicit outlines/airportsdata - let vLLM manage its own deps
- vLLM 0.6.4.post1 has better outlines integration and may fix the
silent 500 error during text generation
- app.py +2 -1
- requirements.txt +1 -5
app.py
CHANGED
|
@@ -32,7 +32,8 @@ if "AWQ" in MODEL_ID.upper():
|
|
| 32 |
|
| 33 |
def launch_vllm():
|
| 34 |
print(f"[vLLM] Launch: {MODEL_ID}")
|
| 35 |
-
|
|
|
|
| 36 |
|
| 37 |
def wait_vllm_ready(timeout=900, interval=3):
|
| 38 |
url = f"http://127.0.0.1:{API_PORT}/v1/models"
|
|
|
|
| 32 |
|
| 33 |
def launch_vllm():
|
| 34 |
print(f"[vLLM] Launch: {MODEL_ID}")
|
| 35 |
+
# Capture stderr to see any crashes/errors during generation
|
| 36 |
+
subprocess.Popen(VLLM_ARGS, stderr=subprocess.STDOUT)
|
| 37 |
|
| 38 |
def wait_vllm_ready(timeout=900, interval=3):
|
| 39 |
url = f"http://127.0.0.1:{API_PORT}/v1/models"
|
requirements.txt
CHANGED
|
@@ -4,12 +4,8 @@ gradio>=4.38
|
|
| 4 |
requests>=2.31
|
| 5 |
|
| 6 |
# vLLM + CUDA 12.1
|
| 7 |
-
vllm==0.6.3.post1
|
| 8 |
--extra-index-url https://download.pytorch.org/whl/cu121
|
| 9 |
torch==2.4.0
|
| 10 |
transformers>=4.44
|
| 11 |
accelerate>=0.30
|
| 12 |
-
|
| 13 |
-
# Required for vLLM's outlines guided decoding backend
|
| 14 |
-
outlines>=0.0.37
|
| 15 |
-
airportsdata>=20240400
|
|
|
|
| 4 |
requests>=2.31
|
| 5 |
|
| 6 |
# vLLM + CUDA 12.1
|
| 7 |
+
vllm==0.6.4.post1
|
| 8 |
--extra-index-url https://download.pytorch.org/whl/cu121
|
| 9 |
torch==2.4.0
|
| 10 |
transformers>=4.44
|
| 11 |
accelerate>=0.30
|
|
|
|
|
|
|
|
|
|
|
|