---
base_model: Qwen/Qwen3-30B-A3B
datasets:
- HuggingFaceH4/ultrachat_200k
library_name: transformers
pipeline_tag: text-generation
quantized_by: QuantForge
tags:
- quantforge
- quantized
- nvfp4
---

# Neooooo/qf-integration-test

## QuantForge Metadata

- Base model: `Qwen/Qwen3-30B-A3B`
- Quantization scheme: `nvfp4`
- Calibration dataset: `HuggingFaceH4/ultrachat_200k`
- Calibration samples: `32`
- Max sequence length: `512`
- Ignored layers: `lm_head`, `re:.*\.mlp\.gate$`, `re:.*\.mlp\.router$`

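The `re:`-prefixed entries in the ignore list are regular expressions matched against full module names, while plain entries match exactly. A minimal sketch of that matching logic (the convention is an assumption based on common quantization tooling, and the layer names below are illustrative, not taken from this model):

```python
import re

# Ignore list as in the metadata above: plain names match exactly,
# "re:"-prefixed entries are treated as regular expressions.
ignored = ["lm_head", r"re:.*\.mlp\.gate$", r"re:.*\.mlp\.router$"]

def is_ignored(name: str) -> bool:
    """Return True if a module name is excluded from quantization."""
    for pat in ignored:
        if pat.startswith("re:"):
            # Regex entry: the pattern must match the whole module name.
            if re.fullmatch(pat[3:], name):
                return True
        elif name == pat:
            # Plain entry: exact match only.
            return True
    return False

print(is_ignored("lm_head"))                       # True
print(is_ignored("model.layers.0.mlp.gate"))       # True
print(is_ignored("model.layers.0.mlp.gate_proj"))  # False: ends in gate_proj
```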
## Accuracy (BF16 vs NVFP4)

| Task | Metric | BF16 | NVFP4 | Recovery |
|---|---:|---:|---:|---:|
| arc_challenge | acc,none | 0.4000 | 0.3000 | 0.750 |
| hellaswag | acc,none | 0.4000 | 0.4000 | 1.000 |

Aggregate macro recovery: **0.875**

> **Note:** Scores were estimated from a small evaluation subset.

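The recovery column is the quantized score divided by the BF16 baseline, and the aggregate is the unweighted (macro) mean over tasks. A quick sketch reproducing the numbers in the table:

```python
# Per-task (BF16, NVFP4) scores from the accuracy table above.
scores = {
    "arc_challenge": (0.4000, 0.3000),
    "hellaswag": (0.4000, 0.4000),
}

# Recovery = quantized score / baseline score, per task.
recovery = {task: q / bf16 for task, (bf16, q) in scores.items()}
print(round(recovery["arc_challenge"], 3))  # 0.75

# Macro recovery: unweighted mean of per-task recoveries.
macro = sum(recovery.values()) / len(recovery)
print(round(macro, 3))  # 0.875
```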
## Performance

_Performance benchmark unavailable: `evaluate.skip_perf=true`_

## Usage (vLLM)

```bash
vllm serve Neooooo/qf-integration-test
```
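Once the command above is running, the server exposes an OpenAI-compatible HTTP API (on port 8000 by default). A minimal sketch of a chat-completion request, assuming default host/port and no API key (the prompt and `max_tokens` value are illustrative):

```python
import json
from urllib import request

# Default vLLM OpenAI-compatible endpoint (assumes `vllm serve` defaults).
url = "http://localhost:8000/v1/chat/completions"

payload = {
    "model": "Neooooo/qf-integration-test",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    "max_tokens": 64,
}

req = request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment to send the request once the server is up:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(payload["model"])
```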