---
pipeline_tag: text-generation
base_model:
  - Qwen/Qwen3-VL-4B-Instruct
metrics:
  - perplexity
---

# Qwen3-VL-4B-Instruct-per-grp-quant

## Introduction

This model was quantized using amd_quark-0.11.

## Quantization Strategy

- **Quantized Layers**: All linear layers
- **Weight**: uint4 asymmetric per-group with `group_size=128`

## Quick Start

1. Download the Qwen3-VL-4B-Instruct model.
2. Run the quantization script in the example folder using the following command line:

       python run_qwen3_vl_4b_quant_model.py
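As a rough illustration of the weight scheme described above (a NumPy sketch, not the Quark implementation), asymmetric per-group uint4 quantization splits each weight row into groups of 128 values and maps each group onto the integer range 0–15 with its own scale and zero point:

```python
import numpy as np

def quantize_per_group_uint4(w, group_size=128):
    """Asymmetric per-group uint4 quantization of a flat weight tensor."""
    w = w.reshape(-1, group_size)                       # one row per group
    w_min = w.min(axis=1, keepdims=True)
    w_max = w.max(axis=1, keepdims=True)
    scale = np.maximum((w_max - w_min) / 15.0, 1e-8)    # uint4 range: 0..15
    zero_point = np.round(-w_min / scale)
    q = np.clip(np.round(w / scale) + zero_point, 0, 15).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Reconstruct float weights from uint4 codes, per-group scale and zero point."""
    return (q.astype(np.float32) - zero_point) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(256).astype(np.float32)
q, s, z = quantize_per_group_uint4(w)
w_hat = dequantize(q, s, z).reshape(-1)
print("max reconstruction error:", np.abs(w - w_hat).max())
```

Because each group gets its own scale and zero point, the rounding error per value is bounded by roughly half the group's scale, which is why per-group schemes typically lose less accuracy than per-tensor quantization.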

# Evaluation

Quark currently uses perplexity (PPL) as the evaluation metric for accuracy loss before and after quantization. The specific PPL algorithm can be referenced in `quantize_quark.py`. The evaluation results are obtained in pseudo-quantization mode, which may differ slightly from the actual quantized inference accuracy; they are provided for reference only.
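The exact evaluation algorithm lives in `quantize_quark.py`; as a minimal sketch of the metric itself, perplexity is simply the exponential of the mean per-token negative log-likelihood:

```python
import numpy as np

def perplexity(nll_per_token):
    # PPL = exp(mean negative log-likelihood per token); lower is better.
    return float(np.exp(np.mean(nll_per_token)))

# Toy check (illustrative values, not a real model run): tokens whose
# average NLL is ln(10.5369) yield a PPL of ~10.5369, the scale of the
# wikitext2 scores reported in this card.
nll = np.full(1000, np.log(10.5369))
print(perplexity(nll))
```

The gap between the two PPL columns in the table below is the accuracy cost of quantization: a larger post-quantization PPL means the quantized model assigns lower probability to the reference text.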

## Evaluation scores

| Benchmark | Qwen3-VL-4B-Instruct | Qwen3-VL-4B-Instruct-per-grp-quant (this model) |
| --- | --- | --- |
| Perplexity-wikitext2 | 10.5369 | 11.6644 |

# License

Modifications copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved.