Upload serve.py with huggingface_hub
serve.py
CHANGED
@@ -20,7 +20,7 @@ Environment variables:
     NO_PREFIX_CACHING — set to 1 to disable prefix caching
     VLLM_ENFORCE_EAGER — set to 1 to disable CUDA graphs (default 0)
     REASONING_PARSER — set to "qwen3" to enable <think>/</think> parsing
-        (splits
+        (splits `reasoning` from `content` in API responses)
 
 Example:
     VLLM_MODEL=./model_dir python serve.py
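As a rough illustration of the docstring above, here is a minimal sketch of how serve.py might consume these environment variables and split `<think>` output. The variable names (`VLLM_MODEL`, `NO_PREFIX_CACHING`, `VLLM_ENFORCE_EAGER`, `REASONING_PARSER`) come from the diff; the helper functions `load_config` and `split_reasoning` are hypothetical and not taken from the actual script.

```python
import os
import re

def load_config(env=None):
    # Read settings from the environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    return {
        # Model path or repo id, as in the docstring example.
        "model": env.get("VLLM_MODEL", "./model_dir"),
        # NO_PREFIX_CACHING=1 disables prefix caching.
        "enable_prefix_caching": env.get("NO_PREFIX_CACHING") != "1",
        # VLLM_ENFORCE_EAGER=1 disables CUDA graphs (default 0).
        "enforce_eager": env.get("VLLM_ENFORCE_EAGER", "0") == "1",
        # REASONING_PARSER="qwen3" enables <think>/</think> parsing.
        "reasoning_parser": env.get("REASONING_PARSER"),
    }

def split_reasoning(text):
    """Split a leading <think>...</think> block from the visible answer,
    mirroring the `reasoning` / `content` split the diff describes."""
    m = re.match(r"\s*<think>(.*?)</think>\s*(.*)", text, re.DOTALL)
    if m:
        return {"reasoning": m.group(1).strip(), "content": m.group(2).strip()}
    return {"reasoning": None, "content": text}

cfg = load_config({"VLLM_MODEL": "./model_dir", "REASONING_PARSER": "qwen3"})
resp = split_reasoning("<think>check the units first</think>The answer is 42.")
print(cfg, resp)
```

With `REASONING_PARSER` unset, `split_reasoning` would simply not be applied and the full text would land in `content`; the sketch above only shows the shape of the split, not the real parser.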