Instructions to use Keyven/german-ocr-3 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Keyven/german-ocr-3 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="Keyven/german-ocr-3")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("Keyven/german-ocr-3", dtype="auto")

llama-cpp-python

How to use Keyven/german-ocr-3 with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="Keyven/german-ocr-3",
	filename="german-ocr-3-Q4_K_M.gguf",
)

llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": [
				{
					"type": "text",
					"text": "Describe this image in one sentence."
				},
				{
					"type": "image_url",
					"image_url": {
						"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
					}
				}
			]
		}
	]
)

llama.cpp

How to use Keyven/german-ocr-3 with llama.cpp:

Install (macOS, Linux)

curl -LsSf https://llama.app/install.sh | sh
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf Keyven/german-ocr-3:Q4_K_M
# Run inference directly in the terminal:
llama cli -hf Keyven/german-ocr-3:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf Keyven/german-ocr-3:Q4_K_M
# Run inference directly in the terminal:
llama cli -hf Keyven/german-ocr-3:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Keyven/german-ocr-3:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf Keyven/german-ocr-3:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Keyven/german-ocr-3:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Keyven/german-ocr-3:Q4_K_M

Use Docker

docker model run hf.co/Keyven/german-ocr-3:Q4_K_M

vLLM

How to use Keyven/german-ocr-3 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Keyven/german-ocr-3"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Keyven/german-ocr-3",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
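
For OCR you will usually send a local scan rather than a public image URL. The sketch below builds the same request body as the curl call above, but inlines the image bytes as a base64 `data:` URL in the `image_url` field. The helper name `build_ocr_payload` and the PNG MIME type are assumptions for illustration. The German prompt and `temperature` 0 are taken from the model's README.

```python
import base64
import json


def build_ocr_payload(image_bytes: bytes,
                      prompt: str = "Extrahiere die Rechnung im Bild als JSON.",
                      model: str = "Keyven/german-ocr-3",
                      mime_type: str = "image/png") -> dict:
    """Build an OpenAI-style chat request with the image inlined as a data URL."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "temperature": 0,  # deterministic decoding, as in the README examples
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url",
                     "image_url": {"url": f"data:{mime_type};base64,{b64}"}},
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder bytes; in practice use Path("rechnung.png").read_bytes().
    payload = build_ocr_payload(b"placeholder image bytes")
    print(json.dumps(payload, ensure_ascii=False)[:120])
```

The resulting dictionary can be serialized with `json.dumps` and sent as the `--data` body of the request shown above.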

Use Docker

docker model run hf.co/Keyven/german-ocr-3:Q4_K_M

SGLang

How to use Keyven/german-ocr-3 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Keyven/german-ocr-3" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Keyven/german-ocr-3",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Keyven/german-ocr-3" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Keyven/german-ocr-3",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
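
The same OpenAI-compatible endpoint can also be called from Python. What follows is a minimal sketch using only the standard library. `chat_completion` and `extract_reply` are hypothetical helper names, and the default base URL assumes port 30000 as used in the commands above.

```python
import json
import urllib.request


def extract_reply(response: dict) -> str:
    """Return the assistant text from an OpenAI-style chat completion response."""
    return response["choices"][0]["message"]["content"]


def chat_completion(payload: dict,
                    base_url: str = "http://localhost:30000/v1",
                    opener=urllib.request.urlopen) -> str:
    """POST the payload to /chat/completions and return the reply text.

    `opener` is injectable so the function can be exercised without a server.
    """
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with opener(request) as resp:
        return extract_reply(json.loads(resp.read().decode("utf-8")))
```

Pass the same JSON body as in the curl example as `payload`. The returned string is the model's reply.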

Ollama
How to use Keyven/german-ocr-3 with Ollama:
```
ollama run hf.co/Keyven/german-ocr-3:Q4_K_M
```

Unsloth Studio

How to use Keyven/german-ocr-3 with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Keyven/german-ocr-3 to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Keyven/german-ocr-3 to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Keyven/german-ocr-3 to start chatting

Pi

How to use Keyven/german-ocr-3 with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama serve -hf Keyven/german-ocr-3:Q4_K_M

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "Keyven/german-ocr-3:Q4_K_M"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use Keyven/german-ocr-3 with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama serve -hf Keyven/german-ocr-3:Q4_K_M

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default Keyven/german-ocr-3:Q4_K_M

Run Hermes

hermes

Docker Model Runner
How to use Keyven/german-ocr-3 with Docker Model Runner:
```
docker model run hf.co/Keyven/german-ocr-3:Q4_K_M
```

Lemonade

How to use Keyven/german-ocr-3 with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull Keyven/german-ocr-3:Q4_K_M

Run and chat with the model

lemonade run user.german-ocr-3-Q4_K_M

List all available models

lemonade list

german-ocr-3 / README.md

Keyven

Update README.md

0f176b0 verified 2 months ago

	---
	language:
	- de
	- en
	- fr
	- es
	- ar
	- fa
	- it
	- sv
	- ru
	- zh
	license: apache-2.0
	library_name: transformers
	pipeline_tag: image-text-to-text
	tags:
	- german
	- deutsch
	- ocr
	- vision
	- document-ai
	- invoice
	- rechnung
	- structured-extraction
	- json-extraction
	- kie
	- ollama
	- vllm
	- llama-cpp
	- apache-2.0
	inference: true
	datasets:
	- neuralabs/german-synth-ocr
	- Aoschu/German_invoices_dataset_for_donut
	base_model:
	- Qwen/Qwen3.5-2B
	new_version: Keyven/german-ocr-3
	---

	<p align="center">
	<img src="https://app.german-ocr.de/icon.png" alt="German-OCR-3" width="140" height="140" />
	</p>

	<h1 align="center">German-OCR-3</h1>

	<p align="center"><strong>German vision OCR. Compact. Local. Open source.</strong><br/>
	<sub>German document image → strictly validated JSON. Up and running locally in under 60 seconds.</sub></p>

	<p align="center">
	<a href="https://german-ocr.de"><img alt="Site" src="https://img.shields.io/badge/site-german--ocr.de-3B82F6?style=flat-square&labelColor=0B1220"/></a>
	<a href="https://ollama.com/Keyvan/german-ocr-3"><img alt="Ollama" src="https://img.shields.io/badge/Ollama-Keyvan%2Fgerman--ocr--3-F59E0B?style=flat-square&labelColor=0B1220"/></a>
	<a href="https://github.com/Keyvanhardani/German-OCR-3-Dev"><img alt="GitHub" src="https://img.shields.io/badge/GitHub-source-181717?style=flat-square&labelColor=0B1220"/></a>
	<a href="#license"><img alt="License: Apache 2.0" src="https://img.shields.io/badge/License-Apache_2.0-22C55E?style=flat-square&labelColor=0B1220"/></a>
	<img alt="Language" src="https://img.shields.io/badge/lang-Deutsch-3B82F6?style=flat-square&labelColor=0B1220"/>
	<img alt="Hallucination" src="https://img.shields.io/badge/Halluzination-0%25-22C55E?style=flat-square&labelColor=0B1220"/>
	</p>

	---

	## ⚡ At a glance

	<table align="center">
	<tr>
	<td align="center" width="180"><h2>100 %</h2><sub>Valid JSON</sub></td>
	<td align="center" width="180"><h2>95 %</h2><sub>Sender correct</sub></td>
	<td align="center" width="180"><h2>0 %</h2><sub>Hallucination</sub></td>
	<td align="center" width="180"><h2>5.0 s</h2><sub>Latency / doc</sub></td>
	</tr>
	</table>

	<p align="center"><sub>On <strong>200+ real, anonymized German invoices</strong> (Default edition, 2.7 GB)</sub></p>

	---

	## What is German-OCR-3?

	German-OCR-3 is a compact, fast, fully local vision-OCR distribution for German business documents: invoices, letters, forms, receipts, and official notices. An image goes in, and strictly validated JSON following our German extraction schema comes out. No mandatory cloud, no vendor lock-in.

	Two editions, both Apache 2.0, both under 3 GB:

	| Edition | Ollama | Size | Target hardware | Strength |
	|---|---|---:|---|---|
	| Nano | `Keyvan/german-ocr-nano` | 1.0 GB | CPU · Edge · Mobile | "runs everywhere" |
	| Default ⭐ | `Keyvan/german-ocr-3` | 2.7 GB | 4–6 GB VRAM | best field recognition |

	> Fine-tuned adapter for German business-document extraction. Apache 2.0.

	---

	## 📊 Field test: 200+ real German invoices (anonymized)

	<p align="center">
	<img src="https://huggingface.co/Keyven/german-ocr-3/resolve/main/charts/02_ionos_validity.png" alt="Field test" width="820"/>
	</p>

	| Edition | Valid JSON | Sender correct | Hallucination | Latency |
	|---|---:|---:|---:|---:|
	| `Keyvan/german-ocr-nano` | 84 % | 79 % | 0 % | 6.6 s |
	| `Keyvan/german-ocr-3` ⭐ | 100 % | 95 % | 0 % | 5.0 s |

	No "Mustermann" placeholder defaults (the German equivalent of "John Doe"). German-OCR-3 reads the actual company, customer address, products, and amounts instead of guessing.
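
As a rough illustration of how two of the columns above could be computed over a labelled set, consider the sketch below. It is not the author's evaluation harness: the function name `score_outputs`, the `sender.name` field lookup (taken from the sample output in this README), and the case-insensitive match rule are all assumptions. The hallucination column is omitted because the README does not define how it is measured.

```python
import json


def _parse(raw: str):
    """Return the parsed object, or None if the output is not valid JSON."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return None


def score_outputs(raw_outputs: list[str], expected_senders: list[str]) -> dict:
    """Compute valid-JSON and sender-match rates over paired outputs and labels."""
    if len(raw_outputs) != len(expected_senders):
        raise ValueError("outputs and labels must have the same length")
    total = len(raw_outputs)
    valid = 0
    sender_ok = 0
    for raw, expected in zip(raw_outputs, expected_senders):
        doc = _parse(raw)
        if not isinstance(doc, dict):
            continue  # unparseable or non-object output counts as invalid
        valid += 1
        sender = doc.get("sender")
        name = sender.get("name") if isinstance(sender, dict) else None
        # Assumption: case-insensitive exact match after trimming whitespace.
        if isinstance(name, str) and name.strip().lower() == expected.strip().lower():
            sender_ok += 1
    return {
        "valid_json": valid / total if total else 0.0,
        "sender_correct": sender_ok / total if total else 0.0,
    }
```

Both rates are fractions of all documents, so a document with invalid JSON also counts against the sender rate.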

	---

	## 📐 Size comparison

	<p align="center">
	<img src="https://huggingface.co/Keyven/german-ocr-3/resolve/main/charts/01_size_vs_competitors.png" alt="Model sizes" width="820"/>
	</p>

	`german-ocr-3` (2.7 GB) is 6× smaller than a typical 7B OCR VLM. It runs on an 8 GB gaming GPU or on the CPU of an ordinary laptop.

	<p align="center">
	<img src="https://huggingface.co/Keyven/german-ocr-3/resolve/main/charts/04_latency.png" alt="Latency" width="620"/>
	</p>

	---

	## 🚀 Quickstart

	### Ollama (recommended, one line)

	```bash
	ollama pull Keyvan/german-ocr-3
	ollama run Keyvan/german-ocr-3 "Extrahiere die Rechnung im Bild als JSON." ./meine_rechnung.png
	```

	<details>
	<summary><b>Sample output (anonymized, from the field test)</b>, click to expand</summary>

	```json
	{
	  "document_type": "invoice",
	  "language": "de",
	  "invoice_number": "100137xXXXXX",
	  "invoice_date": "2024-01-22",
	  "due_date": "2024-01-27",
	  "sender": {
	    "name": "IONOS SE",
	    "address": "Elgendorfer Str. 57, 56410 Montabaur",
	    "vat_id": "DE81556XXX",
	    "iban": null
	  },
	  "recipient": {
	    "name": "Firma e.K.",
	    "address": "Muster Straße 32, 80335 München",
	    "customer_id": "5835XXX"
	  },
	  "line_items": [
	    {"position": 1, "description": "Mail Business 1 Liz.", "quantity": 1,
	     "unit": "Monat", "unit_price_net": 4.20, "amount_net": 4.20, "vat_rate": 19}
	  ],
	  "amount_net": 4.20,
	  "amount_vat": 0.80,
	  "amount_total": 5.00,
	  "currency": "EUR",
	  "notes": ["Entsprechend Ihrem SEPA-Lastschriftmandat ..."]
	}
	```

	</details>
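
Because the output is structured, basic plausibility checks are cheap to run before the data reaches downstream systems. The sketch below checks that a few fields are present and that the amounts are arithmetically consistent. The helper name `check_invoice`, the choice of required fields, and the one-cent tolerance are assumptions based on the sample output above, not on the distribution's actual JSON schemas.

```python
from decimal import Decimal


def check_invoice(doc: dict, tolerance: str = "0.01") -> list[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # Assumed minimal field set, taken from the sample output.
    for key in ("invoice_number", "invoice_date", "sender", "amount_net",
                "amount_vat", "amount_total", "currency"):
        if doc.get(key) is None:
            problems.append(f"missing field: {key}")
    if problems:
        return problems

    tol = Decimal(tolerance)
    # Go through str() so floats such as 4.2 become exact decimals.
    net = Decimal(str(doc["amount_net"]))
    vat = Decimal(str(doc["amount_vat"]))
    total = Decimal(str(doc["amount_total"]))
    if abs(net + vat - total) > tol:
        problems.append(f"net + vat = {net + vat}, but total = {total}")

    # Compare the sum of line-item net amounts with the invoice net amount.
    amounts = [Decimal(str(item["amount_net"]))
               for item in (doc.get("line_items") or [])
               if item.get("amount_net") is not None]
    if amounts and abs(sum(amounts, Decimal("0")) - net) > tol:
        problems.append(f"line items sum to {sum(amounts, Decimal('0'))}, but net = {net}")
    return problems
```

For the sample above, 4.20 + 0.80 equals 5.00 and the single line item matches the net amount, so the list is empty. A non-empty list is a natural trigger for manual review.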

	### Python (via Ollama HTTP API)

	```python
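	import base64
	import json
	from pathlib import Path

	import requests

	# Ollama expects images as base64-encoded strings.
	b64 = base64.b64encode(Path("rechnung.png").read_bytes()).decode()
	resp = requests.post(
	    "http://localhost:11434/api/generate",
	    json={
	        "model": "Keyvan/german-ocr-3",
	        "prompt": "Extrahiere die Rechnung im Bild als JSON.",
	        "images": [b64],
	        "stream": False,  # return one complete response instead of chunks
	        "options": {"temperature": 0, "num_ctx": 32768},
	    },
	    timeout=300,  # generous timeout for slow hardware; adjust as needed
	)
	resp.raise_for_status()  # fail early on HTTP errors
	# The generated text is in "response"; it should be a JSON document.
	data = json.loads(resp.json()["response"])
	print(json.dumps(data, indent=2, ensure_ascii=False))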
	```
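
Calling `json.loads` directly assumes the reply is nothing but a JSON document. The field-test table lists 84 % valid JSON for the Nano edition, so a more forgiving parser can be useful. The sketch below returns the first JSON object found in the reply, even when it is wrapped in Markdown code fences or surrounded by prose. `parse_model_json` is a hypothetical helper name, not part of the distribution.

```python
import json


def parse_model_json(text: str) -> dict:
    """Parse the first JSON object found in a model reply.

    Tolerates leading or trailing prose and Markdown code fences around the object.
    """
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever follows it.
            obj, _ = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not a valid object here; try the next opening brace.
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model reply")
```

In the example above, `json.loads(resp.json()["response"])` could be swapped for `parse_model_json(resp.json()["response"])`.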

	### Download the bundle

	```bash
	huggingface-cli download Keyven/german-ocr-3 --local-dir ./german-ocr-3
	# Contains: Modelfile · JSON schemas · system prompt · GGUF quants · charts
	```

	### llama.cpp (GGUF directly)

	```bash
	llama-cli -m ./german-ocr-3/german-ocr-3-Q4_K_M.gguf \
	--system-prompt-file ./german-ocr-3/system_prompt.txt \
	-p "Extrahiere die Rechnung als JSON:" --temp 0
	```

	---

	## 📚 Training and evaluation datasets

	| Dataset | Size | Type |
	|---|---|---|
	| [`neuralabs/german-synth-ocr`](https://huggingface.co/datasets/neuralabs/german-synth-ocr) | 4,500+ | German OCR samples (synthetic, Apache-2.0) |
	| [`Aoschu/German_invoices_dataset_for_donut`](https://huggingface.co/datasets/Aoschu/German_invoices_dataset_for_donut) | 129 | Real German invoices (Donut format) |
	| In-house synthetic German invoice set | 100 | Invoices with golden JSON, generated deterministically |
	| Anonymized DACH field test | 200+ | Real invoices from various DACH vendors (internal, GDPR) |

	---

	## 🎯 Target audiences

	- Solo builders & indies: extract German documents locally, without cloud OCR costs.
	- DACH SMEs with data-protection requirements: host locally / on-prem.
	- Agencies & studios: an open-source foundation under your own pipeline.

	If you want it managed and with even larger models:

	> 🌐 [german-ocr.de](https://german-ocr.de): hosted German OCR API with premium models, higher accuracy, and no hardware of your own. Data stays in the EU.

	---

	## ⚠️ Limitations

	- Optimized for German documents; no guarantees for other languages.
	- Best quality on clear, high-resolution scans or photos.
	- Handwritten documents: limited support only.
	- For critical workflows (accounting, legal): always keep a human in the loop.

	---

	## 🙏 Credit & Attribution

	German-OCR-3 builds on the excellent work of the Qwen team at Alibaba Group. The underlying vision-language architecture comes from the Qwen 3.5 Small Series, released under the Apache License 2.0. Without the open research and the clean release of the Qwen weights, this project would not be possible.

	- Qwen 3.5 — https://huggingface.co/Qwen · https://qwen.ai
	- Apache License 2.0 (Weights) — © 2025–2026 Qwen Team, Alibaba Group
	- Qwen2.5-VL Technical Report — [arXiv:2502.13923](https://arxiv.org/abs/2502.13923)

	Full attribution text in [`NOTICE`](NOTICE).

	---

	## <a id="license"></a>📄 License

	Apache License 2.0 for the entire German-OCR-3 distribution (Modelfiles, system prompt, schemas, docs, GGUFs).

	---

	## 📑 Citation

	If you use German-OCR-3 in research or production, please cite both our distribution and the underlying Qwen work:

	```bibtex
	@misc{german_ocr_3_2026,
	title = {German-OCR-3: A compact German document-OCR distribution},
	author = {Hardani, Keyvan},
	year = {2026},
	url = {https://github.com/Keyvanhardani/German-OCR}
	}

	@misc{qwen35_2026,
	title = {Qwen 3.5 Small Series},
	author = {{Qwen Team, Alibaba Group}},
	year = {2026},
	howpublished = {\url{https://huggingface.co/Qwen}},
	note = {Apache License 2.0}
	}

	@article{qwen25vl_2025,
	title = {Qwen2.5-VL Technical Report},
	author = {{Qwen Team, Alibaba Group}},
	journal = {arXiv preprint arXiv:2502.13923},
	year = {2025}
	}
	```

	---

	## 👤 Author

	Keyvan Hardani
	· Website: [keyvan.ai](https://keyvan.ai)
	· LinkedIn: [linkedin.com/in/keyvanhardani](https://linkedin.com/in/keyvanhardani)
	· GitHub: [@Keyvanhardani](https://github.com/Keyvanhardani)
	· Hosted Premium: [german-ocr.de](https://german-ocr.de)