acnagle commited on
Commit
8ff87e7
·
verified ·
1 Parent(s): e801eea

Upload serve.py with huggingface_hub

Browse files
Files changed (1) hide show
  1. serve.py +1 -1
serve.py CHANGED
@@ -20,7 +20,7 @@ Environment variables:
20
  NO_PREFIX_CACHING — set to 1 to disable prefix caching
21
  VLLM_ENFORCE_EAGER — set to 1 to disable CUDA graphs (default 0)
22
  REASONING_PARSER — set to "qwen3" to enable <think>/</think> parsing
23
- (splits reasoning_content from content in API responses)
24
 
25
  Example:
26
  VLLM_MODEL=./model_dir python serve.py
 
20
  NO_PREFIX_CACHING — set to 1 to disable prefix caching
21
  VLLM_ENFORCE_EAGER — set to 1 to disable CUDA graphs (default 0)
22
  REASONING_PARSER — set to "qwen3" to enable <think>/</think> parsing
23
+ (splits `reasoning` from `content` in API responses)
24
 
25
  Example:
26
  VLLM_MODEL=./model_dir python serve.py