FireRed-OCR - CrispEmbed GGUF

GGUF conversion of FireRedTeam/FireRed-OCR for CrispEmbed.

FireRed-OCR is a Qwen3-VL 2B fine-tune trained with Format-Constrained GRPO for OCR of tables, formulas, and structured documents. It scores 92.94% on OmniDocBench v1.5.

Architecture

  • Vision: Qwen3-VL ViT (24 layers, 1024d, patch 16, deepstack [5,11,17])
  • LLM: Qwen3-2B (28 layers, 2048d, GQA 16/8, QK norms, mRoPE interleaved)
  • Training: Format-Constrained GRPO with rewards for formula syntax, table integrity, hierarchical closure
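The figures in the list above are enough for some back-of-the-envelope sizing. The sketch below is illustrative only: it assumes the attention head dimension equals the hidden size divided by the number of query heads (the card does not state it), assumes an F16 KV cache, and counts raw 16x16 patches without any resizing or token merging the runtime may apply. The function names are hypothetical and not part of CrispEmbed.

```python
import math

# Values copied from the architecture list above.
HIDDEN = 2048        # LLM hidden size
N_Q_HEADS = 16       # query heads (GQA 16/8)
N_KV_HEADS = 8       # key/value heads (GQA 16/8)
N_LAYERS = 28        # LLM layers
PATCH = 16           # ViT patch size in pixels

# Assumption: head_dim = hidden size / query heads. Not stated on the card.
HEAD_DIM = HIDDEN // N_Q_HEADS


def gqa_group_size():
    """Number of query heads that share one key/value head."""
    return N_Q_HEADS // N_KV_HEADS


def kv_cache_bytes(n_tokens, bytes_per_value=2):
    """Rough KV cache size: K and V, per layer, per KV head, per token.

    bytes_per_value=2 assumes an F16 cache, which is an assumption here.
    """
    return 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * bytes_per_value * n_tokens


def vit_patches(height, width):
    """Raw patch count for an image, rounding partial patches up."""
    return math.ceil(height / PATCH) * math.ceil(width / PATCH)


if __name__ == "__main__":
    print("GQA group size:", gqa_group_size())
    print("KV cache for 4096 tokens (MiB):", kv_cache_bytes(4096) / 2**20)
    print("Patches for a 1024x768 page:", vit_patches(1024, 768))
```

These numbers are estimates for planning memory use, not measurements of the CrispEmbed runtime.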

Models

File                     Quant   Size     Notes
firered-ocr-f16.gguf     F16     4.0 GB   Full precision
firered-ocr-q8_0.gguf    Q8_0    2.2 GB   Best quality/size
firered-ocr-q4_k.gguf    Q4_K    1.6 GB   Smallest
firered-ocr-ref.gguf     F32     17 MB    Reference activations for parity testing
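One way to choose between the files listed above is to take the largest one that fits a memory budget. The following is a minimal sketch under stated assumptions: the sizes are the file sizes from the table, the `headroom_gb` parameter is a hypothetical allowance for activations and the KV cache, and `pick_quant` is an illustrative helper rather than anything shipped with CrispEmbed.

```python
# (file name, quant type, size in GB) copied from the table above.
# The F32 reference file is omitted because it is for parity testing only.
QUANTS = [
    ("firered-ocr-f16.gguf", "F16", 4.0),
    ("firered-ocr-q8_0.gguf", "Q8_0", 2.2),
    ("firered-ocr-q4_k.gguf", "Q4_K", 1.6),
]


def pick_quant(budget_gb, headroom_gb=1.0):
    """Return the largest listed file that fits in budget minus headroom.

    headroom_gb is a guess at runtime overhead, not a measured value.
    Returns None when nothing fits.
    """
    usable = budget_gb - headroom_gb
    fitting = [q for q in QUANTS if q[2] <= usable]
    if not fitting:
        return None
    return max(fitting, key=lambda q: q[2])


if __name__ == "__main__":
    for budget in (8.0, 4.0, 3.0, 2.0):
        print(budget, "GB ->", pick_quant(budget))
```

Actual memory use depends on image size and output length, so the headroom value should be adjusted after trying a real document.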

Usage

crispembed --ocr firered-ocr-q8_0.gguf document.png

The model runs on the qwen2vl_ocr engine, which is auto-detected from the GGUF architecture (qwen3vl).
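To check which architecture a downloaded file declares, the GGUF header can be read directly. The sketch below parses the metadata section and returns the standard `general.architecture` key. It assumes a little-endian GGUF file of version 2 or later (64-bit lengths); whether CrispEmbed performs its detection exactly this way is an assumption, and `gguf_architecture` is a hypothetical helper name.

```python
import struct

GGUF_STRING, GGUF_ARRAY = 8, 9
# Byte widths of the fixed-size GGUF metadata value types
# (uint8, int8, uint16, int16, uint32, int32, float32, bool, uint64, int64, float64).
SCALAR_SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}


def _read_u32(f):
    return struct.unpack("<I", f.read(4))[0]


def _read_u64(f):
    return struct.unpack("<Q", f.read(8))[0]


def _read_string(f):
    # GGUF strings are a uint64 byte length followed by UTF-8 bytes.
    return f.read(_read_u64(f)).decode("utf-8")


def _read_value(f, vtype):
    """Read one metadata value; strings are decoded, everything else is skipped."""
    if vtype == GGUF_STRING:
        return _read_string(f)
    if vtype == GGUF_ARRAY:
        elem_type = _read_u32(f)
        count = _read_u64(f)
        for _ in range(count):
            _read_value(f, elem_type)
        return None
    f.seek(SCALAR_SIZES[vtype], 1)  # skip a fixed-size scalar
    return None


def gguf_architecture(path):
    """Return the value of general.architecture, or None if the key is absent."""
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        if _read_u32(f) < 2:
            raise ValueError("GGUF v1 uses 32-bit lengths; not handled in this sketch")
        _read_u64(f)             # tensor count, unused here
        n_kv = _read_u64(f)      # number of metadata key/value pairs
        for _ in range(n_kv):
            key = _read_string(f)
            value = _read_value(f, _read_u32(f))
            if key == "general.architecture":
                return value
    return None
```

For the files in this repository, the model card indicates the returned value should be qwen3vl.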

License

Apache-2.0 (FireRedTeam/FireRed-OCR)
