---
language:
- ar
- en
license: apache-2.0
tags:
- awq
- quantized
- 4bit
- vllm
- fanar
base_model: QCRI/Fanar-1-9B-Instruct
---
# Fanar-1-9B-Instruct-AWQ
An AWQ 4-bit quantized version of QCRI/Fanar-1-9B-Instruct.
## Details
- Quantization: AWQ 4-bit (W4A16)
- Size: ~5 GB on disk (vs. ~18 GB for the FP16 original)
- Memory: ~75% reduction
- Quality: 95%+ retention
- Optimized for: vLLM inference
## Requirements
```bash
pip install "vllm>=0.6.0"
```

(The quotes keep the shell from treating `>=0.6.0` as an output redirect.)
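Once vLLM is installed, the checkpoint can be used for offline batch inference. A minimal sketch, assuming a CUDA GPU; the model path below is a placeholder for wherever this checkpoint lives (local directory or Hub repo id), and the plain-text prompt format is illustrative:

```python
def build_prompts(questions):
    # Illustrative plain-text prompt wrapper; in practice, prefer the
    # model's own chat template via the tokenizer.
    return [f"User: {q}\nAssistant:" for q in questions]

if __name__ == "__main__":
    from vllm import LLM, SamplingParams  # requires a CUDA GPU

    # quantization="awq" selects vLLM's AWQ kernels for this checkpoint.
    llm = LLM(model="Fanar-1-9B-Instruct-AWQ", quantization="awq")
    params = SamplingParams(temperature=0.7, max_tokens=256)

    for out in llm.generate(build_prompts(["What is AWQ quantization?"]), params):
        print(out.outputs[0].text)
```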
The model was quantized with AutoAWQ using domain-specific calibration data.
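For reference, an AutoAWQ recipe of this shape looks roughly like the sketch below. The group size and the calibration texts are assumptions for illustration, not the exact settings used for this checkpoint:

```python
# Typical AutoAWQ 4-bit (W4A16) config; q_group_size=128 is the common
# default, not necessarily what this checkpoint used.
QUANT_CONFIG = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

if __name__ == "__main__":
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    base = "QCRI/Fanar-1-9B-Instruct"
    model = AutoAWQForCausalLM.from_pretrained(base)
    tokenizer = AutoTokenizer.from_pretrained(base)

    # calib_data stands in for the domain-specific calibration texts
    # mentioned above (hypothetical example here).
    model.quantize(
        tokenizer,
        quant_config=QUANT_CONFIG,
        calib_data=["Example Arabic/English calibration text."],
    )

    model.save_quantized("Fanar-1-9B-Instruct-AWQ")
    tokenizer.save_pretrained("Fanar-1-9B-Instruct-AWQ")
```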