metadata
language: en
license: apache-2.0
pipeline_tag: text-generation
tags:
- quantization
- nvfp4
- qwen
base_model: Qwen/Qwen3-8B
model_name: Qwen3-8B-NVFP4
Qwen3-8B-NVFP4
NVFP4-quantized version of Qwen/Qwen3-8B produced with llmcompressor.
Notes
- Quantization scheme: NVFP4 (linear layers, lm_head excluded)
- Calibration samples: 512
- Max sequence length during calibration: 2048
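
The settings above could map onto an llmcompressor one-shot run roughly as follows. This is a minimal sketch, not the exact script used for this checkpoint: the calibration dataset (HuggingFaceH4/ultrachat_200k), the shuffle seed, the output directory, and the helper name quantize_nvfp4 are assumptions made for illustration, since the card does not state them.

```python
# Values taken from the notes above.
BASE_MODEL = "Qwen/Qwen3-8B"
NUM_CALIBRATION_SAMPLES = 512
MAX_SEQUENCE_LENGTH = 2048

# Assumptions: the card does not name the calibration data or output path.
CALIBRATION_DATASET = "HuggingFaceH4/ultrachat_200k"  # hypothetical choice
CALIBRATION_SPLIT = "train_sft"
SAVE_DIR = "Qwen3-8B-NVFP4"  # hypothetical local output directory


def quantize_nvfp4() -> None:
    """Hypothetical helper: one-shot NVFP4 quantization of the base model."""
    # Heavy dependencies are imported lazily so the constants above can be
    # inspected without installing them.
    from datasets import load_dataset
    from llmcompressor import oneshot
    from llmcompressor.modifiers.quantization import QuantizationModifier
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype="auto")
    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)

    # Take the first 512 conversations and render them with the chat template.
    ds = load_dataset(
        CALIBRATION_DATASET,
        split=f"{CALIBRATION_SPLIT}[:{NUM_CALIBRATION_SAMPLES}]",
    ).shuffle(seed=42)
    ds = ds.map(
        lambda ex: {
            "text": tokenizer.apply_chat_template(ex["messages"], tokenize=False)
        }
    )
    # Tokenize, truncating each sample to the calibration sequence length.
    ds = ds.map(
        lambda ex: tokenizer(
            ex["text"],
            padding=False,
            max_length=MAX_SEQUENCE_LENGTH,
            truncation=True,
            add_special_tokens=False,
        ),
        remove_columns=ds.column_names,
    )

    # Quantize every Linear layer to NVFP4 and leave lm_head unquantized.
    recipe = QuantizationModifier(
        targets="Linear", scheme="NVFP4", ignore=["lm_head"]
    )

    oneshot(
        model=model,
        dataset=ds,
        recipe=recipe,
        max_seq_length=MAX_SEQUENCE_LENGTH,
        num_calibration_samples=NUM_CALIBRATION_SAMPLES,
    )

    # Save the weights in compressed form alongside the tokenizer.
    model.save_pretrained(SAVE_DIR, save_compressed=True)
    tokenizer.save_pretrained(SAVE_DIR)


if __name__ == "__main__":
    quantize_nvfp4()
```

Running the script downloads the base model and the calibration data, so it needs a GPU with enough memory for the unquantized 8B weights.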