---
language:
- ar
- en
license: apache-2.0
tags:
- awq
- quantized
- 4bit
- vllm
- fanar
base_model: QCRI/Fanar-1-9B-Instruct
---

# Fanar-1-9B-Instruct-AWQ

AWQ 4-bit quantized version of [QCRI/Fanar-1-9B-Instruct](https://huggingface.co/QCRI/Fanar-1-9B-Instruct).

## Details

- **Quantization:** AWQ 4-bit (w4a16)
- **Size:** ~5 GB on disk (vs ~18 GB original)
- **Memory:** ~75% reduction
- **Quality:** 95%+ retention
- **Optimized for:** vLLM inference

## Requirements

```bash
pip install "vllm>=0.6.0"
```
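
Once vLLM is installed, inference can be sketched as below. The repo id `QCRI/Fanar-1-9B-Instruct-AWQ` is assumed from this card's title; substitute your local path or copy of the weights if it differs.

```python
from vllm import LLM, SamplingParams

# Repo id assumed from this card's title; replace with your own path if needed.
# `quantization="awq"` tells vLLM to load the AWQ 4-bit weights.
llm = LLM(model="QCRI/Fanar-1-9B-Instruct-AWQ", quantization="awq")

params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)
outputs = llm.generate(["What is the capital of Qatar?"], params)
print(outputs[0].outputs[0].text)
```

Note that this requires a CUDA-capable GPU; vLLM will download the weights from the Hub on first run.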

The model was quantized with AutoAWQ using domain-specific calibration data.
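
A quantization recipe along those lines can be sketched with AutoAWQ. The `quant_config` values below are AutoAWQ's common defaults, not the exact settings used for this model, and the output directory name is illustrative:

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

base = "QCRI/Fanar-1-9B-Instruct"
out = "Fanar-1-9B-Instruct-AWQ"  # illustrative output directory

# Typical AutoAWQ settings; the exact config used for this card is not published here.
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# quantize() uses a generic calibration corpus by default; pass
# calib_data=[...] (a list of strings) to supply domain-specific samples
# as described above.
model.quantize(tokenizer, quant_config=quant_config)

model.save_quantized(out)
tokenizer.save_pretrained(out)
```

Quantization itself also requires a GPU and enough memory to hold the full-precision model.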
|