poolside/Laguna-XS-2.1-DFlash-NVFP4

DFlash speculator for the NVFP4 target poolside/Laguna-XS-2.1-NVFP4. The speculator is a 5-layer Llama-style draft model stored in BF16; pair it with the NVFP4 target for lower-latency serving.
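
For readers new to speculators, the pairing of a small draft model with a larger target can be illustrated with a generic greedy draft-and-verify loop. This is a minimal sketch of speculative decoding in general, not the DFlash algorithm or any serving engine's implementation; `speculative_decode`, `toy_target`, `toy_draft`, and the parameter `k` are hypothetical names introduced only for illustration, and the "models" are plain Python functions rather than real checkpoints.

```python
from typing import Callable, List

# A "model" here is any function that maps a token prefix to the next token
# (greedy decoding). Real draft/target models would be neural networks.
NextToken = Callable[[List[int]], int]


def speculative_decode(
    target: NextToken,
    draft: NextToken,
    prompt: List[int],
    max_new_tokens: int,
    k: int = 4,  # hypothetical: number of tokens drafted per round
) -> List[int]:
    tokens = list(prompt)
    goal = len(prompt) + max_new_tokens
    while len(tokens) < goal:
        # 1. The cheap draft model proposes up to k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(tokens))):
            proposal.append(draft(tokens + proposal))

        # 2. The target checks the proposal position by position. A real
        #    server would score all positions in one batched forward pass;
        #    the check is sequential here for clarity.
        for guess in proposal:
            expected = target(tokens)
            if guess != expected:
                # First mismatch: keep the target's token, drop the rest.
                tokens.append(expected)
                break
            tokens.append(guess)
        else:
            # Every drafted token was accepted; the target's verification
            # pass also yields one extra token.
            if len(tokens) < goal:
                tokens.append(target(tokens))
    return tokens


# Toy stand-ins: the target counts upward modulo 10, and the draft agrees
# with it except after the token 4.
def toy_target(prefix: List[int]) -> int:
    return (prefix[-1] + 1) % 10


def toy_draft(prefix: List[int]) -> int:
    return 0 if prefix[-1] == 4 else (prefix[-1] + 1) % 10


if __name__ == "__main__":
    print(speculative_decode(toy_target, toy_draft, [0], max_new_tokens=12, k=3))
```

Under greedy verification the output matches what the target alone would produce; the latency benefit comes from the target confirming several drafted tokens per pass instead of generating one token per pass.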

Speculators for the other precisions are available in this collection: BF16, FP8, INT4.

See the Laguna XS 2.1 DFlash speculator card for architecture, training, and deployment details. Upstream DFlash support is in progress in vLLM (#46853), SGLang (#29446), and TRT-LLM (#15666). Use poolside/Laguna-XS-2.1-NVFP4 as the target model.

License

This model is licensed under the OpenMDW-1.1 License.

Intended and Responsible Use

Laguna-XS-2.1-DFlash-NVFP4 is designed for software engineering and agentic coding use cases, and you are responsible for confirming that it is appropriate for your intended application. Laguna-XS-2.1-DFlash-NVFP4 is subject to the OpenMDW-1.1 License, and should be used consistently with Poolside's Acceptable Use Policy.

Please report security vulnerabilities or safety concerns to security@poolside.ai.

Model size: 0.5B params (Safetensors, tensor type BF16)