NVFP4 collection: Fast inference for Blackwell GPUs (7 items)
Compressed with llm-compressor v0.9.0 and transformers v4.57.1. We used the llm-compressor example script but increased the number of calibration samples from 20 to 128.
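For readers who want to reproduce something similar, the following is a minimal sketch of a one-shot NVFP4 quantization run with llm-compressor. It is not the exact script used for this model. Only the base model ID and the sample count of 128 come from this card; the calibration dataset, the sequence length, the `ignore` list, and the output directory name are assumptions chosen for illustration.

```python
# Settings stated in the model card.
MODEL_ID = "Qwen/Qwen3-Next-80B-A3B-Instruct"
NUM_CALIBRATION_SAMPLES = 128  # raised from the example script's 20

# Assumptions for illustration only; the model card does not state these.
DATASET_ID = "HuggingFaceH4/ultrachat_200k"
DATASET_SPLIT = "train_sft"
MAX_SEQUENCE_LENGTH = 2048
SAVE_DIR = MODEL_ID.split("/")[-1] + "-NVFP4"


def quantize():
    """Run one-shot NVFP4 quantization and save the compressed checkpoint."""
    # Heavy dependencies are imported lazily so the settings above can be
    # inspected without installing them or loading the model.
    from datasets import load_dataset
    from llmcompressor import oneshot
    from llmcompressor.modifiers.quantization import QuantizationModifier
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

    # Take exactly NUM_CALIBRATION_SAMPLES conversations for calibration.
    ds = load_dataset(
        DATASET_ID, split=f"{DATASET_SPLIT}[:{NUM_CALIBRATION_SAMPLES}]"
    )
    ds = ds.shuffle(seed=42)

    def to_text(example):
        # Render each chat into a single prompt string.
        return {
            "text": tokenizer.apply_chat_template(
                example["messages"], tokenize=False
            )
        }

    def tokenize(sample):
        return tokenizer(
            sample["text"],
            padding=False,
            max_length=MAX_SEQUENCE_LENGTH,
            truncation=True,
            add_special_tokens=False,
        )

    ds = ds.map(to_text)
    ds = ds.map(tokenize, remove_columns=ds.column_names)

    # Quantize Linear layers to NVFP4. The ignore list is an assumption:
    # the output head and MoE router gates are commonly left unquantized.
    recipe = QuantizationModifier(
        targets="Linear",
        scheme="NVFP4",
        ignore=["lm_head", "re:.*mlp.gate$"],
    )

    oneshot(
        model=model,
        dataset=ds,
        recipe=recipe,
        max_seq_length=MAX_SEQUENCE_LENGTH,
        num_calibration_samples=NUM_CALIBRATION_SAMPLES,
    )

    model.save_pretrained(SAVE_DIR, save_compressed=True)
    tokenizer.save_pretrained(SAVE_DIR)


if __name__ == "__main__":
    quantize()
```

More calibration samples give the quantizer more activation data to estimate scales from, at the cost of a longer calibration pass; the example script shipped with llm-compressor may ignore additional layers, so check it before relying on the `ignore` list above.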
Base model: Qwen/Qwen3-Next-80B-A3B-Instruct