---
base_model: Qwen/Qwen3-30B-A3B
datasets:
- HuggingFaceH4/ultrachat_200k
library_name: transformers
pipeline_tag: text-generation
quantized_by: QuantForge
tags:
- quantforge
- quantized
- nvfp4
---

# Neooooo/qf-integration-test

## QuantForge Metadata

- Base model: `Qwen/Qwen3-30B-A3B`
- Quantization scheme: `nvfp4`
- Calibration dataset: `HuggingFaceH4/ultrachat_200k`
- Calibration samples: `32`
- Max sequence length: `512`
- Ignored layers: `lm_head`, `re:.*\.mlp\.gate$`, `re:.*\.mlp\.router$`

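The `re:`-prefixed entries in the ignore list are regular expressions matched against full module names, while plain entries match exactly. A minimal sketch of that matching logic (the convention is an assumption based on common quantization tooling, and the layer names below are illustrative, not taken from this model):

```python
import re

# Ignore list as in the metadata above: plain names match exactly,
# "re:"-prefixed entries are treated as regular expressions.
ignored = ["lm_head", r"re:.*\.mlp\.gate$", r"re:.*\.mlp\.router$"]

def is_ignored(name: str) -> bool:
    """Return True if a module name is excluded from quantization."""
    for pat in ignored:
        if pat.startswith("re:"):
            # Regex entry: the pattern must match the whole module name.
            if re.fullmatch(pat[3:], name):
                return True
        elif name == pat:
            # Plain entry: exact match only.
            return True
    return False

print(is_ignored("lm_head"))                       # True
print(is_ignored("model.layers.0.mlp.gate"))       # True
print(is_ignored("model.layers.0.mlp.gate_proj"))  # False: ends in gate_proj
```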
## Accuracy (BF16 vs NVFP4)

| Task | Metric | BF16 | NVFP4 | Recovery |
|---|---:|---:|---:|---:|
| arc_challenge | acc,none | 0.4000 | 0.3000 | 0.750 |
| hellaswag | acc,none | 0.4000 | 0.4000 | 1.000 |

Aggregate macro recovery: **0.875**

> **Note:** Scores were estimated from a small evaluation subset.

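The recovery column is the quantized score divided by the BF16 baseline, and the aggregate is the unweighted (macro) mean over tasks. A quick sketch reproducing the numbers in the table:

```python
# Per-task (BF16, NVFP4) scores from the accuracy table above.
scores = {
    "arc_challenge": (0.4000, 0.3000),
    "hellaswag": (0.4000, 0.4000),
}

# Recovery = quantized score / baseline score, per task.
recovery = {task: q / bf16 for task, (bf16, q) in scores.items()}
print(round(recovery["arc_challenge"], 3))  # 0.75

# Macro recovery: unweighted mean of per-task recoveries.
macro = sum(recovery.values()) / len(recovery)
print(round(macro, 3))  # 0.875
```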
## Performance

_Performance benchmark unavailable: `evaluate.skip_perf=true`_

## Usage (vLLM)

```bash
vllm serve Neooooo/qf-integration-test
```
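Once the command above is running, the server exposes an OpenAI-compatible HTTP API (on port 8000 by default). A minimal sketch of a chat-completion request, assuming default host/port and no API key (the prompt and `max_tokens` value are illustrative):

```python
import json
from urllib import request

# Default vLLM OpenAI-compatible endpoint (assumes `vllm serve` defaults).
url = "http://localhost:8000/v1/chat/completions"

payload = {
    "model": "Neooooo/qf-integration-test",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    "max_tokens": 64,
}

req = request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment to send the request once the server is up:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(payload["model"])
```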