---
language:
- ar
- en
license: apache-2.0
tags:
- awq
- quantized
- 4bit
- vllm
- fanar
base_model: QCRI/Fanar-1-9B-Instruct
---
# Fanar-1-9B-Instruct-AWQ
An AWQ 4-bit quantized version of QCRI/Fanar-1-9B-Instruct.
## Details
- Quantization: AWQ 4-bit (W4A16)
- Size: ~5 GB on disk (vs. ~18 GB for the FP16 original)
- Memory: ~75% reduction
- Quality: 95%+ retention
- Optimized for: vLLM inference
## Requirements
```bash
pip install "vllm>=0.6.0"
```

(The quotes keep the shell from treating `>=0.6.0` as an output redirect.)
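Once vLLM is installed, the checkpoint can be used for offline batch inference. A minimal sketch, assuming a CUDA GPU; the model path below is a placeholder for wherever this checkpoint lives (local directory or Hub repo id), and the plain-text prompt format is illustrative:

```python
def build_prompts(questions):
    # Illustrative plain-text prompt wrapper; in practice, prefer the
    # model's own chat template via the tokenizer.
    return [f"User: {q}\nAssistant:" for q in questions]

if __name__ == "__main__":
    from vllm import LLM, SamplingParams  # requires a CUDA GPU

    # quantization="awq" selects vLLM's AWQ kernels for this checkpoint.
    llm = LLM(model="Fanar-1-9B-Instruct-AWQ", quantization="awq")
    params = SamplingParams(temperature=0.7, max_tokens=256)

    for out in llm.generate(build_prompts(["What is AWQ quantization?"]), params):
        print(out.outputs[0].text)
```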
The model was quantized with AutoAWQ using domain-specific calibration data.
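For reference, an AutoAWQ recipe of this shape looks roughly like the sketch below. The group size and the calibration texts are assumptions for illustration, not the exact settings used for this checkpoint:

```python
# Typical AutoAWQ 4-bit (W4A16) config; q_group_size=128 is the common
# default, not necessarily what this checkpoint used.
QUANT_CONFIG = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

if __name__ == "__main__":
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    base = "QCRI/Fanar-1-9B-Instruct"
    model = AutoAWQForCausalLM.from_pretrained(base)
    tokenizer = AutoTokenizer.from_pretrained(base)

    # calib_data stands in for the domain-specific calibration texts
    # mentioned above (hypothetical example here).
    model.quantize(
        tokenizer,
        quant_config=QUANT_CONFIG,
        calib_data=["Example Arabic/English calibration text."],
    )

    model.save_quantized("Fanar-1-9B-Instruct-AWQ")
    tokenizer.save_pretrained("Fanar-1-9B-Instruct-AWQ")
```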