Laguna-XS.2-NVFP4 / README.md

Commit History

Note FP8 KV cache needs vLLM 0.22.0; drop scrambled-output workaround (vllm#42650)
571346d

joerowell committed on

Update model card for 256K context length
12a5f20
verified

varunrandery committed on

Drop VLLM_USE_DEEP_GEMM=0 from vllm serve recipe (DeepGEMM is supported on Hopper and datacenter Blackwell)
514daf4
verified

joerowell committed on

Enable thinking by default in non-Hopper FP8-KV serve command
62a5860
verified

joerowell committed on

Update non-Hopper FP8-KV serve command and link to vLLM recipes page
92f8b44
verified

joerowell committed on

Laguna XS.2 upload
98ebde9

joerowell committed on