poolside/Laguna-XS.2-NVFP4

Text Generation

8-bit precision

compressed-tensors

Laguna-XS.2-NVFP4 / README.md

Commit History

Note FP8 KV cache needs vLLM 0.22.0; drop scrambled-output workaround (vllm#42650)

571346d

joerowell committed 4 days ago

Update model card for 256K context length

12a5f20
verified

varunrandery committed 14 days ago

Drop VLLM_USE_DEEP_GEMM=0 from vllm serve recipe (DeepGEMM is supported on Hopper and datacenter Blackwell)

514daf4
verified

joerowell committed 19 days ago

Add base_model (#1)

2f694d9

mgoin committed 30 days ago

Enable thinking by default in non-Hopper FP8-KV serve command

62a5860
verified

joerowell committed on Apr 29

Update non-Hopper FP8-KV serve command and link to vLLM recipes page

92f8b44
verified

joerowell committed on Apr 29

Laguna XS.2 upload

98ebde9

joerowell committed on Apr 28