---
language:
  - ar
  - en
license: apache-2.0
tags:
  - awq
  - quantized
  - 4bit
  - vllm
  - fanar
base_model: QCRI/Fanar-1-9B-Instruct
---

# Fanar-1-9B-Instruct-AWQ

AWQ 4-bit quantized version of QCRI/Fanar-1-9B-Instruct.

## Details

- **Quantization:** AWQ 4-bit (w4a16)
- **Size:** ~5 GB (vs ~18 GB for the original FP16 weights)
- **Memory:** ~75% reduction in weight memory
- **Quality:** 95%+ retention reported on evaluation tasks
- **Optimized for:** vLLM inference

## Requirements

```bash
pip install "vllm>=0.6.0"
```
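A minimal inference sketch with vLLM. The model path below is a placeholder assumed from this card's title; replace it with this repository's actual id or a local checkout. AWQ checkpoints are usually auto-detected, but the `quantization="awq"` argument makes the choice explicit:

```python
from vllm import LLM, SamplingParams

# Placeholder path: substitute this repo's id or a local download directory.
llm = LLM(model="Fanar-1-9B-Instruct-AWQ", quantization="awq")

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["What is the capital of Qatar?"], params)
print(outputs[0].outputs[0].text)
```

Running this requires a CUDA-capable GPU with enough memory for the ~5 GB quantized weights.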

The model was quantized with AutoAWQ using domain-specific calibration data.
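For reference, a quantization run of this kind can be sketched with AutoAWQ as below. The quantization config values and the output directory name are illustrative assumptions, not the exact settings used for this checkpoint; the card's domain-specific calibration data is not published, so the default calibration set is used here:

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "QCRI/Fanar-1-9B-Instruct"
quant_path = "Fanar-1-9B-Instruct-AWQ"  # assumed output directory name

# Common AWQ settings: 4-bit weights, group size 128 (assumed, not confirmed
# to match this checkpoint's exact configuration)
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# quantize() runs activation-aware calibration; a custom calib_data argument
# would be needed to reproduce the domain-specific calibration mentioned above
model.quantize(tokenizer, quant_config=quant_config)

model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```

Quantization itself needs the full-precision weights loaded, so expect a GPU (or ample CPU RAM) with room for the ~18 GB original model.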