fix(vllm): only pass documents via chat_template_kwargs, not top-level 3034360 verified msradam committed 15 days ago
fix(vllm): strip document-role messages before sending to vLLM 80deb38 verified msradam committed 15 days ago
fix: conditional Ollama timeout — 5s when vLLM primary, 240s otherwise f4632b7 verified msradam committed 15 days ago
fix: vLLM readiness probe + fast Ollama fallback timeout (app/llm.py) 3368898 verified msradam committed 15 days ago
fix(llm): raise vLLM first-token timeout to 360s for RunPod cold start d79c380 verified msradam committed 15 days ago
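
Read together, these five commits describe a request-shaping and timeout policy. The sketch below is one way that policy might look in Python. It is not the contents of app/llm.py. The function names, the message shape, and the `{"text": ...}` document shape are assumptions made for illustration. Only the numbers (5s, 240s, 360s) and the `chat_template_kwargs` key come from the commit titles.

```python
from typing import Any

# Timeout values taken from the commit titles (d79c380, f4632b7).
VLLM_FIRST_TOKEN_TIMEOUT_S = 360.0   # tolerate a RunPod cold start
OLLAMA_FALLBACK_TIMEOUT_S = 5.0      # fail fast when vLLM is the primary backend
OLLAMA_PRIMARY_TIMEOUT_S = 240.0     # be patient when Ollama is the only backend


def ollama_timeout(vllm_is_primary: bool) -> float:
    """Pick the Ollama timeout depending on whether vLLM is the primary backend."""
    return OLLAMA_FALLBACK_TIMEOUT_S if vllm_is_primary else OLLAMA_PRIMARY_TIMEOUT_S


def build_vllm_payload(model: str, messages: list[dict[str, Any]]) -> dict[str, Any]:
    """Build a chat-completions request body (hypothetical helper).

    Messages with role "document" are removed from the message list, and
    their content is passed only under chat_template_kwargs, never as a
    top-level "documents" field.
    """
    chat = [m for m in messages if m.get("role") != "document"]
    docs = [m for m in messages if m.get("role") == "document"]

    payload: dict[str, Any] = {"model": model, "messages": chat}
    if docs:
        # Assumed document shape; the real chat template may expect other keys.
        payload["chat_template_kwargs"] = {
            "documents": [{"text": d.get("content", "")} for d in docs]
        }
    return payload
```

For example, `build_vllm_payload("some-model", msgs)` with one document-role message in `msgs` returns a body whose `messages` list no longer contains that message and whose only copy of the document sits under `chat_template_kwargs["documents"]`.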