deepseek_ocr2_arabic_jsonify

deepseek_ocr2_arabic_jsonify is a task-specific fine-tune of deepseek-ai/DeepSeek-OCR-2 for OCR-to-JSON extraction on building regulation pages. It is trained to read a single document-page image and return one strict JSON object containing the page header fields and regulation table fields, with no extra explanatory text.

The training workflow in the notebook loads the Unsloth-compatible unsloth/DeepSeek-OCR-2 checkpoint, which maps to the same DeepSeek-OCR-2 base model family, and then fine-tunes it with LoRA for structured extraction.

Intended use

  • Extract structured data from scanned or photographed building regulation pages.
  • Return JSON only.
  • Preserve the original document language and values exactly when possible, especially Arabic text, numbers, punctuation, and line breaks.
  • Use empty strings for missing or unreadable fields instead of hallucinating values.

Output schema

The model was trained to produce this exact JSON structure and key order:

{
  "header": {
    "municipality": "",
    "district_name": "",
    "plan_number": "",
    "plot_number": "",
    "block_number": "",
    "division_area": ""
  },
  "table": {
    "building_regulations": "",
    "building_usage": "",
    "setback": "",
    "heights": "",
    "building_factor": "",
    "building_ratio": "",
    "parking_requirements": "",
    "notes": ""
  }
}
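Downstream code can coerce model output onto this schema defensively. The sketch below is illustrative rather than part of the released code: a hypothetical `normalize` helper that parses a response, drops any unexpected keys, and falls back to empty strings for missing fields, mirroring the training convention.

```python
import json

# Expected keys, in the order the model was trained to emit them.
HEADER_KEYS = ["municipality", "district_name", "plan_number",
               "plot_number", "block_number", "division_area"]
TABLE_KEYS = ["building_regulations", "building_usage", "setback",
              "heights", "building_factor", "building_ratio",
              "parking_requirements", "notes"]

def normalize(raw: str) -> dict:
    """Parse model output and coerce it onto the fixed schema.

    Missing fields become empty strings (the training-time fallback);
    extra keys, if any, are dropped.
    """
    data = json.loads(raw)
    return {
        "header": {k: data.get("header", {}).get(k, "") for k in HEADER_KEYS},
        "table": {k: data.get("table", {}).get(k, "") for k in TABLE_KEYS},
    }

out = normalize('{"header": {"municipality": "الرياض"}, "table": {}}')
```

If `json.loads` raises (e.g. the model emitted stray text around the object), that is worth surfacing rather than silently recovering, since JSON-only output is part of the training contract.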

Prompt format

The notebook converts each sample into a 3-message conversation:

[
  {
    "role": "<|System|>",
    "content": "Extract only the header and table fields and return one valid JSON object."
  },
  {
    "role": "<|User|>",
    "content": "<image>\n.",
    "images": ["document-page-image"]
  },
  {
    "role": "<|Assistant|>",
    "content": "{...gold JSON...}"
  }
]

The system instruction also enforces JSON-only output, original-language preservation, no extra keys, and empty-string fallback for missing fields.
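When preparing your own samples, the same three-message layout can be reproduced programmatically. This is a minimal sketch (the `build_conversation` helper and `SYSTEM_PROMPT` constant are illustrative names; the role tokens and `<image>` placeholder follow the template shown above):

```python
SYSTEM_PROMPT = ("Extract only the header and table fields and "
                 "return one valid JSON object.")

def build_conversation(image_path: str, gold_json: str) -> list:
    # Mirrors the 3-message layout used in the notebook: system
    # instruction, user turn carrying the page image, gold JSON answer.
    return [
        {"role": "<|System|>", "content": SYSTEM_PROMPT},
        {"role": "<|User|>", "content": "<image>\n.", "images": [image_path]},
        {"role": "<|Assistant|>", "content": gold_json},
    ]

conv = build_conversation("page_001.png", '{"header": {}, "table": {}}')
```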

Training data

  • Custom dataset of 108 document-page images paired with gold JSON extraction targets.
  • Domain: Riyadh municipal building regulation pages.
  • Source format: local data.jsonl with fields image, text, transformed_text_to_json, and transformed_text_to_json_translated_to_English.
  • Training target: the text field, which contains the expected JSON output.

Training details

  • Base model: deepseek-ai/DeepSeek-OCR-2
  • Fine-tuning framework: Unsloth with Hugging Face Transformers/TRL
  • Hardware used for the recorded run: NVIDIA A100-SXM4-40GB
  • Image settings: image_size=1024, base_size=1024, crop_mode=True
  • LoRA target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • LoRA config: r=32, lora_alpha=64, lora_dropout=0
  • Precision: bf16 when supported
  • Per-device batch size: 2
  • Gradient accumulation steps: 4
  • Effective batch size: 8
  • Learning rate: 2e-4
  • Optimizer: adamw_8bit
  • LR scheduler: linear
  • Epochs in the recorded run: 8
  • Actual training steps in the recorded run: 112
  • Train on responses only: True
  • Trainable parameters: 172,615,680 / 3,561,735,040 (4.85%)
  • Training runtime: 1548.24 seconds (~25.8 minutes)
  • Peak reserved GPU memory: 39.686 GB
  • Peak reserved GPU memory attributed to training: 29.706 GB
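The derived figures above are internally consistent; a quick arithmetic check, assuming the sample count and hyperparameters as listed:

```python
import math

# Hyperparameters quoted in the run log above.
per_device_batch, grad_accum = 2, 4
effective_batch = per_device_batch * grad_accum      # 8

# 108 training samples over 8 epochs (last partial batch still steps).
steps_per_epoch = math.ceil(108 / effective_batch)   # 14 optimizer steps
total_steps = steps_per_epoch * 8                    # 112, as recorded

# Trainable-parameter fraction for the LoRA adapter.
trainable, total = 172_615_680, 3_561_735_040
trainable_pct = round(100 * trainable / total, 2)    # 4.85
```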

Evaluation notes

  • The notebook reports baseline DeepSeek-OCR-2 performance of 23% character error rate on one sample before fine-tuning.
  • The recorded notebook run does not include a held-out validation or test benchmark after fine-tuning.
  • Training loss decreased from 1.4462 at step 1 to 0.0281 at step 112.
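Character error rate (CER) is conventionally computed as Levenshtein edit distance divided by reference length; whether the notebook used exactly this formulation is not documented, but a self-contained implementation for validating on your own samples is:

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: Levenshtein distance / reference length."""
    m, n = len(reference), len(hypothesis)
    prev = list(range(n + 1))  # DP row for the empty-reference prefix
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            cur[j] = min(prev[j] + 1,       # deletion
                         cur[j - 1] + 1,    # insertion
                         prev[j - 1] + cost)  # substitution / match
        prev = cur
    return prev[n] / max(m, 1)
```

For example, `cer("abcd", "abed")` is 0.25 (one substitution over four reference characters).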

Limitations

  • This model is specialized for building regulation pages and may not transfer well to other document layouts or jurisdictions.
  • The model is optimized for a fixed JSON schema, not general-purpose OCR or document QA.
  • No separate evaluation split is documented in the notebook, so real-world accuracy should be validated on your own samples before deployment.
  • Errors are more likely on low-quality scans, heavily rotated pages, partially cropped pages, handwriting, or unseen form variants.

Repository notes

  • The notebook saved the model under AyoubChLin/deepseek_ocr2_arabic_jsonify.
  • The Hub repository currently contains both adapter artifacts and merged model weights produced by the notebook save workflow.

Acknowledgements

  • Base model: deepseek-ai/DeepSeek-OCR-2
  • Fine-tuning workflow: Unsloth