Drop VLLM_USE_DEEP_GEMM=0 from vllm serve recipe (DeepGEMM is supported on Hopper and datacenter Blackwell) 8c57f62 verified joerowell committed 4 days ago
Enable thinking by default in non-Hopper FP8-KV serve command c7a758e verified joerowell committed 25 days ago
Update non-Hopper FP8-KV serve command and link to vLLM recipes page 2c0d22b verified joerowell committed 25 days ago