qf-integration-test / README.md
Neooooo's picture
Upload quant-forge artifacts
24027ea verified
metadata
base_model: Qwen/Qwen3-30B-A3B
datasets:
  - HuggingFaceH4/ultrachat_200k
library_name: transformers
pipeline_tag: text-generation
quantized_by: QuantForge
tags:
  - quantforge
  - quantized
  - nvfp4

Neooooo/qf-integration-test

QuantForge Metadata

  • Base model: Qwen/Qwen3-30B-A3B
  • Quantization scheme: nvfp4
  • Calibration dataset: HuggingFaceH4/ultrachat_200k
  • Calibration samples: 32
  • Max sequence length: 512
  • Ignored layers: lm_head, re:.*\.mlp\.gate$, re:.*\.mlp\.router$

Accuracy (BF16 vs NVFP4)

Task Metric BF16 NVFP4 Recovery
arc_challenge acc,none 0.4000 0.3000 0.750
hellaswag acc,none 0.4000 0.4000 1.000

Aggregate macro recovery: 0.875

Note: Scores estimated from subset.

Performance

Performance benchmark unavailable: evaluate.skip_perf=true

Usage (vLLM)

vllm serve Neooooo/qf-integration-test