---
language:
- ar
- en
license: apache-2.0
tags:
- awq
- quantized
- 4bit
- vllm
- fanar
base_model: QCRI/Fanar-1-9B-Instruct
---
# Fanar-1-9B-Instruct-AWQ
AWQ 4-bit quantized version of [QCRI/Fanar-1-9B-Instruct](https://huggingface.co/QCRI/Fanar-1-9B-Instruct).
## Details
- **Quantization:** AWQ 4-bit (w4a16: 4-bit weights, 16-bit activations)
- **Size:** ~5GB (vs ~18GB original)
- **Memory:** ~75% reduction
- **Quality:** 95%+ retention
- **Optimized for:** vLLM inference
## Requirements
```bash
pip install "vllm>=0.6.0"
```
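A minimal inference sketch with vLLM. The repo id `buthainaaa/Fanar-1-9B-Instruct-AWQ` is assumed from this card's title (adjust it if the weights live elsewhere), and the plain prompt layout below is only an illustration; prefer the chat template that ships with the model's tokenizer.

```python
def build_prompt(user_message: str) -> str:
    # Illustrative plain chat layout; the model's real chat template
    # ships with its tokenizer and should be preferred in practice.
    return f"User: {user_message}\nAssistant:"

def run_demo() -> None:
    # Requires a CUDA GPU and the ~5 GB quantized weights; the import is
    # kept local so this file loads even without vLLM installed.
    from vllm import LLM, SamplingParams

    # quantization="awq" selects vLLM's 4-bit AWQ kernels.
    llm = LLM(model="buthainaaa/Fanar-1-9B-Instruct-AWQ", quantization="awq")
    params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)
    outputs = llm.generate([build_prompt("What is the capital of Qatar?")], params)
    print(outputs[0].outputs[0].text)
```

Call `run_demo()` on a GPU host to generate a completion.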
The model was quantized with AutoAWQ using domain-specific calibration data.
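For reference, a hedged sketch of how an AWQ export like this one is typically produced with AutoAWQ. The exact calibration set and quantization config used for this checkpoint are not published on this card; the values below are common AutoAWQ defaults, not the author's confirmed settings.

```python
# Common AutoAWQ settings (assumed, not confirmed for this checkpoint).
quant_config = {
    "zero_point": True,   # asymmetric (zero-point) quantization
    "q_group_size": 128,  # weights quantized in groups of 128
    "w_bit": 4,           # 4-bit weights; activations stay fp16 (w4a16)
    "version": "GEMM",    # kernel layout compatible with vLLM's AWQ path
}

def quantize_fanar(calib_data: str = "pileval") -> None:
    # Heavy imports kept local; requires `pip install autoawq` and a GPU.
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    base = "QCRI/Fanar-1-9B-Instruct"
    model = AutoAWQForCausalLM.from_pretrained(base)
    tokenizer = AutoTokenizer.from_pretrained(base)

    # The calibration data drives AWQ's activation-aware scale search.
    model.quantize(tokenizer, quant_config=quant_config, calib_data=calib_data)
    model.save_quantized("Fanar-1-9B-Instruct-AWQ")
    tokenizer.save_pretrained("Fanar-1-9B-Instruct-AWQ")
```

Swap `calib_data` for a domain-specific corpus to mirror the calibration described above.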