GLM-OCR Fine-tuned: Arabic & French Documents
GLM-OCR model fine-tuned with LoRA on ~5,466 annotated documents (mostly manual annotations, plus a smaller pseudo-labeled set).
Training data
- 4,704 Arabic documents (manual Supervisely annotations)
- 113 Latin-script documents (French/English)
- 800 scanned English documents (pseudo-labels)
- Types: administrative forms, invoices, receipts, newspapers, official documents, handwritten text...
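For illustration, here is a minimal sketch of turning one annotation record into a (prompt, target) training pair. The JSON field names below are assumptions for the sketch, not the real Supervisely export schema:

```python
# Hypothetical sketch: convert a simplified Supervisely-style annotation
# record into a (prompt, target) pair for OCR fine-tuning.
# Field names ("objects", "bbox", "text") are assumptions, not the real schema.

def annotation_to_pair(record):
    # Read text regions top-to-bottom, then left-to-right,
    # using the (y, x) coordinates of each box's top-left corner.
    regions = sorted(
        record["objects"],
        key=lambda o: (o["bbox"][1], o["bbox"][0]),
    )
    target = "\n".join(o["text"] for o in regions)
    return {
        "image": record["image_path"],
        "prompt": "Document Parsing:",
        "target": target,
    }

record = {
    "image_path": "doc_0001.jpg",
    "objects": [
        {"bbox": [40, 120, 400, 150], "text": "Invoice No. 2024-17"},
        {"bbox": [40, 20, 300, 60], "text": "ACME Corp."},
    ],
}
pair = annotation_to_pair(record)
print(pair["target"])  # "ACME Corp." first (smaller y coordinate)
```

The target string is the page text in reading order, which is what the model learns to emit for the "Document Parsing:" prompt used below.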
Training
- Base model: zai-org/GLM-OCR
- Method: continued LoRA fine-tuning (rank=16, alpha=32)
- Epochs: 3; learning rate: 1e-4
- GPU: NVIDIA RTX 4050 Laptop (6 GB VRAM)
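LoRA keeps the base weights frozen and trains only two small low-rank matrices per adapted layer, which is why this fits on a 6 GB laptop GPU. A quick back-of-the-envelope calculation (the 4096x4096 layer shape is illustrative, not GLM-OCR's actual dimensions):

```python
# A rank-r LoRA adapter on a (d_out x d_in) weight matrix W adds
# A: (r x d_in) and B: (d_out x r), i.e. r * (d_in + d_out) trainable params.
def lora_params(d_in, d_out, r=16):
    return r * (d_in + d_out)

# Illustrative only: a 4096x4096 attention projection with rank-16 LoRA
full = 4096 * 4096                     # frozen weights in the full matrix
added = lora_params(4096, 4096, r=16)  # trainable weights added by LoRA
print(f"LoRA adds {added:,} params ({added / full:.2%} of the full matrix)")
```

At rank 16 the adapter is well under 1% of each adapted matrix, so optimizer state and gradients stay small even though the full model is loaded for the forward pass.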
Usage
```python
from transformers import AutoProcessor, GlmOcrForConditionalGeneration
import torch
from PIL import Image

model_id = "maloukafer/GLM-OCR-finetuned-documents"

# Load the processor and the fine-tuned model in bfloat16
processor = AutoProcessor.from_pretrained(model_id)
model = GlmOcrForConditionalGeneration.from_pretrained(
    model_id, dtype=torch.bfloat16, device_map="auto"
)
model.eval()

# Load the document image
image = Image.open("document.jpg").convert("RGB")

# Build the chat-style prompt expected by the model
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Document Parsing:"},
]}]
text_input = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = processor(text=text_input, images=[image], return_tensors="pt").to("cuda")

# Greedy decoding; keep only the newly generated tokens
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=1024, do_sample=False)
n = inputs["input_ids"].shape[1]
text = processor.decode(out[0][n:], skip_special_tokens=True).strip()
print(text)
```