Drop VLLM_USE_DEEP_GEMM=0 from vllm serve recipe (DeepGEMM is supported on Hopper and datacenter Blackwell) 514daf4 verified joerowell committed 1 day ago
Enable thinking by default in non-Hopper FP8-KV serve command 62a5860 verified joerowell committed 22 days ago
Update non-Hopper FP8-KV serve command and link to vLLM recipes page 92f8b44 verified joerowell committed 22 days ago
Sync chat template with v5_1 (matches base/FP8/INT4) c6a0e4c verified joerowell committed 23 days ago