How to use from
llama.cpp
Install (macOS, Linux)
curl -LsSf https://llama.app/install.sh | sh
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf XCurOS/XCurOS-OCR-GGUF:F16
# Run inference directly in the terminal:
llama cli -hf XCurOS/XCurOS-OCR-GGUF:F16
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf XCurOS/XCurOS-OCR-GGUF:F16
# Run inference directly in the terminal:
llama cli -hf XCurOS/XCurOS-OCR-GGUF:F16
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf XCurOS/XCurOS-OCR-GGUF:F16
# Run inference directly in the terminal:
./llama-cli -hf XCurOS/XCurOS-OCR-GGUF:F16
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf XCurOS/XCurOS-OCR-GGUF:F16
# Run inference directly in the terminal:
./build/bin/llama-cli -hf XCurOS/XCurOS-OCR-GGUF:F16
Use Docker
docker model run hf.co/XCurOS/XCurOS-OCR-GGUF:F16
Quick Links

XCurOS-OCR ยท GGUF (F16, no quantization)

GGUF build of XCurOS-OCR, a compact 0.9B-parameter vision-language OCR model โ€” runs locally with llama.cpp on CPU or GPU. Shipped in full precision F16, with no quantization.

โœจ Lightweight & CPU-friendly โ€” only 0.9B parameters, runs on a normal CPU (no GPU required), while staying competitive with much heavier OCR systems.

๐Ÿค— Transformers / safetensors version: XCurOS/XCurOS-OCR.

Files

File Role
XCurOS-OCR-F16.gguf Language decoder (F16)
mmproj-XCurOS-OCR-F16.gguf Vision projector (required for image input)

Quick start

# CPU-only (no GPU)
llama-mtmd-cli -m XCurOS-OCR-F16.gguf --mmproj mmproj-XCurOS-OCR-F16.gguf --image page.png -p "OCR" -ngl 0

# REST API server
llama-server -m XCurOS-OCR-F16.gguf --mmproj mmproj-XCurOS-OCR-F16.gguf -ngl 0

# Or auto-download this repo
llama-server -hf XCurOS/XCurOS-OCR-GGUF

Benchmarks

XCurOS-OCR (ours) compared against leading OCR systems. Bold = best among specialized OCR VLMs. - = not reported. ๐Ÿ’ก XCurOS-OCR is a lightweight 0.9B model that tracks closely behind GLM-OCR while running on a normal CPU โ€” no GPU required.

Document understanding

Task Benchmark XCurOS-OCR GLM-OCR PaddleOCR-VL-1.5 Deepseek-OCR2 MinerU2.5 dots.ocr Gemini-3-Pro* GPT-5.2*
Document Parsing OmniDocBench v1.5 94.3 94.6 94.5 91.1 90.7 88.4 90.3 85.4
Text Recognition OCRBench (Text) 93.6 94.0 75.3 34.7 75.3 92.1 91.9 83.7
Formula Recognition UniMERNet 96.3 96.5 96.1 85.8 96.4 90.0 96.4 90.5
Table Recognition PubTabNet 84.9 85.2 84.6 - 88.4 71.0 91.4 84.4
Table Recognition TEDS_TEST 85.5 86.0 83.3 - 85.4 62.4 81.8 67.6
Information Extraction Nanonets-KIE 93.3 93.7 - - - - 95.2 87.5
Information Extraction Handwritten-Forms 85.8 86.1 - - - - 94.5 78.2

Capability breakdown

Category XCurOS-OCR GLM-OCR PaddleOCR-VL-1.5 Deepseek-OCR2 MinerU2.5 dots.ocr Gemini-3-Pro* GPT-5.2*
Code 84.4 84.7 75.8 82.1 82.9 80.8 86.9 84.4
Real-world Table 91.0 91.5 86.1 - 70.8 81.8 90.6 86.7
Handwriting 86.8 87.0 87.4 73.8 54.2 71.7 90.0 78.0
Multi-language 68.9 69.3 54.8 56.1 27.8 65.1 86.2 70.1
Seal 90.2 90.5 42.2 40.4 - 63.0 91.3 58.8
Receipt (KIE) 94.1 94.5 - - - - 97.3 83.5

*Gemini-3-Pro and GPT-5.2 are general-purpose VLMs, shown for reference only.

Throughput

Method Image Inputs (Pages/Sec) PDF Inputs (Pages/Sec)
XCurOS-OCR 0.66 1.83
GLM-OCR 0.67 1.86
PaddleOCR-VL-1.5 0.39 1.22
Deepseek-OCR2 0.32 -
MinerU2.5 0.18 0.48
dots.ocr 0.10 -

XCurOS-OCR is optimized to run on commodity CPUs; it scores marginally below GLM-OCR while requiring no GPU.

License

Released under the MIT License. See the LICENSE file in this repository.

Downloads last month
113
GGUF
Model size
0.9B params
Architecture
glm4
Hardware compatibility
Log In to add your hardware

16-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support