Qari-OCR CrispEmbed GGUF

GGUF conversion of NAMAA-Space/Qari-OCR-0.2.2.1-VL-2B-Instruct (Apache-2.0) for use with CrispEmbed.

Model

Arabic OCR with full diacritics (tashkeel) support. Fine-tuned from Qwen2-VL-2B-Instruct via LoRA (r=16, alpha=16, 324 adapter pairs) on 50K Arabic OCR samples.

  • Architecture: Qwen2-VL-2B (32-layer ViT, 1280-dim + spatial merger + 28-layer Qwen2 LLM, 1536-dim, GQA with 12 query heads and 2 KV heads)
  • Parameters: 2B
  • Performance: WER=0.221, CER=0.059, BLEU=0.597
  • Training: 50K Arabic OCR records, 1 epoch, LoRA on attention+MLP
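For readers unfamiliar with the metrics, WER and CER are conventionally the word-level and character-level edit distance divided by the reference length. The sketch below shows that standard definition only; how the figures above were actually computed is not stated here, and the function names (edit_distance, cer, wer) are illustrative, not part of CrispEmbed.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (characters or words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word edits / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

Because Arabic diacritics are separate Unicode code points, a character-level comparison on Python strings counts each missing or wrong tashkeel mark as one edit. An evaluation that normalizes or strips diacritics first would yield different numbers.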

Files

File                     Type   Size
qari-ocr-2b-f16.gguf     F16    4.7 GB
qari-ocr-2b-q8_0.gguf    Q8_0   2.3 GB
qari-ocr-2b-q4_k.gguf    Q4_K   1.6 GB
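A downloaded file can be sanity-checked by reading its fixed-size GGUF header. The sketch below assumes a little-endian file in GGUF version 2 or later, where the header is the 4-byte magic "GGUF", a uint32 version, a uint64 tensor count, and a uint64 metadata key/value count. The helper name read_gguf_header is hypothetical, and which GGUF version the CrispEmbed converter writes is not stated here.

```python
import struct


def read_gguf_header(path):
    """Read the fixed 24-byte GGUF header (assumes little-endian, version >= 2)."""
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:4] != b"GGUF":
        raise ValueError(f"{path} does not look like a GGUF file")
    # uint32 version, uint64 tensor count, uint64 metadata key/value count
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:24])
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

For example, read_gguf_header("qari-ocr-2b-q8_0.gguf") returns the three header fields for a local copy of that file. A truncated or mislabeled download fails the magic check immediately.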

Usage

The model runs on the same qwen2vl engine as the other Qwen2-VL models in CrispEmbed.

Conversion

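The LoRA adapter was merged tensor by tensor (324 pairs) into the full-precision Qwen2-VL-2B-Instruct base weights, and the merged model was then converted to GGUF with the CrispEmbed converter. The quantized files were produced with crispembed-quantize, which keeps the vision weights at a Q8_0 floor (they are never quantized below Q8_0).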
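The merge step can be illustrated for a single adapter pair. The sketch below assumes the conventional LoRA update W' = W + (alpha / r) * B A, with W of shape (out, in), A of shape (r, in), and B of shape (out, r). With r=16 and alpha=16 the scale factor under that convention is 1. The actual merge script is not shown here, so the function names and the pure-Python matrices are illustrative only.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def merge_lora(w, lora_a, lora_b, r, alpha):
    """Return W + (alpha / r) * (B @ A) for one adapter pair.

    Assumes standard LoRA scaling; other variants scale differently.
    """
    scale = alpha / r
    delta = matmul(lora_b, lora_a)  # shape (out, in), same as W
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]
```

Applying this to every adapted tensor in turn produces a plain checkpoint with no adapter weights, which is the form a GGUF converter expects.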

License

Apache-2.0 (NAMAA-Space/Qari-OCR).
