---
language:
- ar
- en
license: apache-2.0
tags:
- awq
- quantized
- 4bit
- vllm
- fanar
base_model: QCRI/Fanar-1-9B-Instruct
---

# Fanar-1-9B-Instruct-AWQ

AWQ 4-bit quantized version of [QCRI/Fanar-1-9B-Instruct](https://huggingface.co/QCRI/Fanar-1-9B-Instruct).

## Details

- **Quantization:** AWQ 4-bit (w4a16)
- **Size:** ~5 GB on disk (vs ~18 GB original)
- **Memory:** ~75% reduction
- **Quality:** 95%+ retention
- **Optimized for:** vLLM inference

## Requirements

```bash
pip install "vllm>=0.6.0"
```
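
Once vLLM is installed, inference can be sketched as below. The repo id `QCRI/Fanar-1-9B-Instruct-AWQ` is assumed from this card's title; substitute your local path or copy of the weights if it differs.

```python
from vllm import LLM, SamplingParams

# Repo id assumed from this card's title; replace with your own path if needed.
# `quantization="awq"` tells vLLM to load the AWQ 4-bit weights.
llm = LLM(model="QCRI/Fanar-1-9B-Instruct-AWQ", quantization="awq")

params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)
outputs = llm.generate(["What is the capital of Qatar?"], params)
print(outputs[0].outputs[0].text)
```

Note that this requires a CUDA-capable GPU; vLLM will download the weights from the Hub on first run.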

The model was quantized with AutoAWQ using domain-specific calibration data.
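
A quantization recipe along those lines can be sketched with AutoAWQ. The `quant_config` values below are AutoAWQ's common defaults, not the exact settings used for this model, and the output directory name is illustrative:

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

base = "QCRI/Fanar-1-9B-Instruct"
out = "Fanar-1-9B-Instruct-AWQ"  # illustrative output directory

# Typical AutoAWQ settings; the exact config used for this card is not published here.
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# quantize() uses a generic calibration corpus by default; pass
# calib_data=[...] (a list of strings) to supply domain-specific samples
# as described above.
model.quantize(tokenizer, quant_config=quant_config)

model.save_quantized(out)
tokenizer.save_pretrained(out)
```

Quantization itself also requires a GPU and enough memory to hold the full-precision model.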
|