---
base_model: Qwen/Qwen3-30B-A3B
datasets:
- HuggingFaceH4/ultrachat_200k
library_name: transformers
pipeline_tag: text-generation
quantized_by: QuantForge
tags:
- quantforge
- quantized
- nvfp4
---

# Neooooo/qf-integration-test

## QuantForge Metadata

- Base model: `Qwen/Qwen3-30B-A3B`
- Quantization scheme: `nvfp4`
- Calibration dataset: `HuggingFaceH4/ultrachat_200k`
- Calibration samples: `32`
- Max sequence length: `512`
- Ignored layers: `lm_head`, `re:.*\.mlp\.gate$`, `re:.*\.mlp\.router$`

## Accuracy (BF16 vs NVFP4)

| Task | Metric | BF16 | NVFP4 | Recovery |
|---|---:|---:|---:|---:|
| arc_challenge | acc,none | 0.4000 | 0.3000 | 0.750 |
| hellaswag | acc,none | 0.4000 | 0.4000 | 1.000 |

Aggregate macro recovery: **0.875**

> **Note:** Scores estimated from subset.

## Performance

_Performance benchmark unavailable: `evaluate.skip_perf=true`_

## Usage (vLLM)

```bash
vllm serve Neooooo/qf-integration-test
```
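
Once the server is up, it exposes vLLM's OpenAI-compatible API. Below is a minimal sketch of a chat-completions request, assuming the default vLLM port (`8000`) and no API key; the prompt text is illustrative only.

```python
# Minimal sketch: query the vLLM OpenAI-compatible server started above.
# Assumes the default vLLM address http://localhost:8000; adjust if needed.
import json
from urllib import request

payload = {
    "model": "Neooooo/qf-integration-test",
    "messages": [
        {"role": "user", "content": "Summarize NVFP4 quantization in one sentence."}
    ],
    "max_tokens": 64,
}

req = request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment once the server is running:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```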