Commit History

Drop VLLM_USE_DEEP_GEMM=0 from vllm serve recipe (DeepGEMM is supported on Hopper and datacenter Blackwell)
514daf4
verified

joerowell committed on

Enable thinking by default in non-Hopper FP8-KV serve command
62a5860
verified

joerowell committed on

Update non-Hopper FP8-KV serve command and link to vLLM recipes page
92f8b44
verified

joerowell committed on

Sync chat template with v5_1 (matches base/FP8/INT4)
c6a0e4c
verified

joerowell committed on

Laguna XS.2 upload
98ebde9

joerowell committed on