olmOCR Arabic LoRA Adapter

A LoRA (Low-Rank Adaptation) fine-tuned adapter for Arabic OCR, built on top of allenai/olmOCR-2-7B-1025.

Model Description

This adapter enhances olmOCR's ability to recognize Arabic text in documents, including:

  • Handwritten Arabic text
  • Printed Arabic documents
  • Mixed Arabic/English documents

Training Details

| Parameter | Value |
|---|---|
| Base Model | allenai/olmOCR-2-7B-1025 |
| LoRA Rank (r) | 16 |
| LoRA Alpha | 32 |
| LoRA Dropout | 0.05 |
| Training Samples | 450,044 |
| Epochs | 3 |
| Learning Rate | 2e-5 |
| Batch Size | 64 (effective) |
| Hardware | 8x NVIDIA A100 80GB |
| Training Time | ~36 hours |
| Trainable Parameters | 47.6M (0.57% of total) |
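As a sanity check, the trainable-parameter figures above are self-consistent: 47.6M at 0.57% implies roughly 8.3B total parameters, which fits a 7B-class language model plus its vision tower.

```python
# Back-of-envelope check of the trainable-parameter row above.
trainable = 47.6e6   # 47.6M trainable LoRA parameters
fraction = 0.0057    # 0.57% of total parameters
total = trainable / fraction
print(f"implied total: {total / 1e9:.2f}B parameters")  # ~8.35B
```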

Target Modules

  • q_proj, k_proj, v_proj, o_proj (attention)
  • gate_proj, up_proj, down_proj (FFN)
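In PEFT terms, the hyperparameters and target modules above correspond to a `LoraConfig` along these lines. This is a sketch reconstructed from the table, not the exact training script; `bias` in particular is an assumption.

```python
from peft import LoraConfig

# Reconstructed from the hyperparameter table and target-module list;
# bias="none" is an assumption, not a documented setting.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention
        "gate_proj", "up_proj", "down_proj",     # FFN
    ],
    bias="none",
)
```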

Usage

Installation

pip install "transformers>=4.47" "peft>=0.18.0" torch

Load the Model

from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration
from peft import PeftModel
import torch

# Load base model
base_model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "allenai/olmOCR-2-7B-1025",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "hastyle/olmOCR-arabic-lora")

# Optional: Merge for faster inference
model = model.merge_and_unload()

# Load processor
processor = AutoProcessor.from_pretrained("allenai/olmOCR-2-7B-1025", trust_remote_code=True)

Run Inference

from PIL import Image

# Load your Arabic document image
image = Image.open("arabic_document.png")

# Create prompt (olmOCR format)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "Extract the text from this document."},
        ],
    }
]

# Process and generate
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], images=[image], return_tensors="pt", padding=True)
inputs = {k: v.to(model.device) for k, v in inputs.items()}

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=2048, do_sample=False)

# Decode output
result = processor.batch_decode(outputs[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)[0]
print(result)
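Depending on the downstream comparison, it can help to lightly normalize the decoded Arabic text (strip tatweel and diacritics, fold alef variants). This is a common optional post-processing step, not part of the olmOCR pipeline itself:

```python
import re

TATWEEL = "\u0640"                           # ـ elongation character
DIACRITICS = re.compile(r"[\u064B-\u0652]")  # harakat (short vowels, shadda, sukun)

def normalize_arabic(text: str) -> str:
    """Light normalization sometimes applied before comparing OCR output.

    Optional post-processing (not part of olmOCR): drops tatweel and
    diacritics and folds alef variants into bare alef.
    """
    text = text.replace(TATWEEL, "")
    text = DIACRITICS.sub("", text)
    text = re.sub("[\u0622\u0623\u0625]", "\u0627", text)  # آ / أ / إ -> ا
    return text
```

For example, `normalize_arabic("أَحْمَد")` returns `"احمد"`.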

Training Data

The model was fine-tuned on a combined dataset of Arabic OCR samples including:

  • Arabic handwritten documents
  • Printed Arabic text
  • Mixed-script documents

Total training samples: 450,044

Evaluation

Results (Single-Word Arabic OCR Test Set)

| Model | Samples | Corpus WER | Corpus CER | Throughput |
|---|---|---|---|---|
| Baseline (olmOCR-2-7B) | 100 | 252.00% | 184.53% | 0.56 img/s |
| This Adapter | 100 | 0.00% | 0.00% | 0.58 img/s |

Key Findings

  • Dramatic improvement: reduces corpus WER from 252% to 0% on the single-word Arabic test set
  • No speed penalty: Inference throughput remains comparable to baseline
  • Stable training: All checkpoints from steps 19500-21000 achieve identical 0% WER

The baseline model exhibits severe hallucination on Arabic text, often generating English or nonsense output. This LoRA adapter corrects this behavior entirely on the test set.
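For reference, corpus WER/CER are edit-distance rates, which is why the baseline can score above 100%: insertions count as errors, so a hallucinated hypothesis longer than the reference pushes the rate past 1.0. A minimal sketch of the metric (not the evaluation harness actually used):

```python
def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) between sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]

def corpus_wer(refs, hyps):
    """Total word-level edits divided by total reference words."""
    edits = sum(levenshtein(r.split(), h.split()) for r, h in zip(refs, hyps))
    return edits / sum(len(r.split()) for r in refs)

def corpus_cer(refs, hyps):
    """Total character-level edits divided by total reference characters."""
    edits = sum(levenshtein(r, h) for r, h in zip(refs, hyps))
    return edits / sum(len(r) for r in refs)
```

A single-word reference against a hallucinated three-word hypothesis yields a WER of 300%, which is how the baseline's 252% arises on a single-word test set.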

Limitations

  • Optimized primarily for Arabic script
  • Performance may vary on extremely degraded or low-quality scans
  • Works best with documents at 150+ DPI
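A quick way to apply the DPI guideline is to upscale low-resolution scans before inference. The sketch below uses Pillow; the 150 DPI target and LANCZOS resampling are illustrative choices, and images without embedded DPI metadata are passed through unchanged.

```python
from PIL import Image

def ensure_min_dpi(img: Image.Image, target_dpi: int = 150) -> Image.Image:
    """Upscale an image whose embedded DPI is below target_dpi.

    target_dpi and LANCZOS are illustrative, not adapter requirements;
    images with no DPI metadata are returned as-is.
    """
    dpi = img.info.get("dpi", (None,))[0]
    if dpi and dpi < target_dpi:
        scale = target_dpi / dpi
        img = img.resize(
            (round(img.width * scale), round(img.height * scale)),
            Image.LANCZOS,
        )
    return img
```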

Citation

If you use this model, please cite:

@misc{olmocr-arabic-lora,
  title={olmOCR Arabic LoRA Adapter},
  author={Allen Institute for AI},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/hastyle/olmOCR-arabic-lora}
}

License

Apache 2.0

Framework Versions

  • PEFT: 0.18.0
  • Transformers: 4.47+
  • PyTorch: 2.0+