Kabyle OCR Model – PaddleOCR Checkpoint

This is a text recognition model for the Kabyle language (written in Latin script), trained using PaddleOCR (PP‑OCRv3 architecture). The model was trained on synthetic text generated from Kabyle news corpora.

Model Details

Property	Value
Architecture	PP‑OCRv3 (CRNN)
Character set size	109 Kabyle characters + 1 blank token
Image shape	3×48×480 (height=48, width=480)
Max text length	25 characters
Training data	18,000 synthetic images (mini‑test)
Evaluation accuracy	57% (on held‑out validation set)
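Normalised edit distance	0.96 (assuming this is PaddleOCR's norm_edit_dis metric, it is a similarity score: 1.0 means a perfect match, higher is better)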
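
The gap between the two evaluation figures is easier to read with a small example. The sketch below assumes that accuracy counts a prediction as correct only when the whole string matches, and that the edit-distance figure is reported as a similarity (1 minus the Levenshtein distance divided by the longer string's length). The function names and the sample strings are hypothetical and are not output from this model.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def score(predictions, references):
    """Return (exact-match accuracy, mean edit similarity) for paired strings."""
    exact = 0
    similarity = 0.0
    for pred, ref in zip(predictions, references):
        exact += int(pred == ref)
        longest = max(len(pred), len(ref), 1)  # avoid division by zero
        similarity += 1.0 - levenshtein(pred, ref) / longest
    n = len(references)
    return exact / n, similarity / n


# Hypothetical predictions for illustration only.
refs = ["aɣrum", "tamurt"]
preds = ["aɣrum", "tamutt"]
acc, sim = score(preds, refs)
```

A single wrong character makes the whole line count as an error for accuracy while barely lowering the similarity, which is one way a 57% accuracy can sit next to a 0.96 edit-distance score.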

The character set includes both basic Latin letters and Kabyle‑specific characters:
č, ḍ, ɛ, ǧ, ɣ, ḥ, ṛ, ṣ, ṭ, ẓ (and their uppercase variants).
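
Several of these letters can be encoded either as one precomposed code point or as a base letter plus a combining mark, so text may need normalising before it is compared with the dictionary. The following is a minimal sketch, assuming the dictionary stores precomposed (NFC) forms; this has not been checked against kab_dict.txt. The helper names are hypothetical, and the example character set is only a partial stand-in for the real 109-character set.

```python
import unicodedata

# The ten Kabyle-specific lowercase letters listed above.
KABYLE_SPECIFIC = "čḍɛǧɣḥṛṣṭẓ"


def normalise(text: str) -> str:
    """Compose base letter + combining mark into a single code point (NFC)."""
    return unicodedata.normalize("NFC", text)


def unsupported_chars(text: str, charset: set) -> list:
    """Return the characters of `text` that are missing from `charset`."""
    return sorted({ch for ch in normalise(text) if ch not in charset})


# Partial stand-in character set: Kabyle-specific letters in both cases
# plus basic lowercase Latin. The real set comes from kab_dict.txt.
charset = set(normalise(KABYLE_SPECIFIC + KABYLE_SPECIFIC.upper()))
charset.update("abcdefghijklmnopqrstuvwxyz")

# "d" followed by U+0323 (combining dot below) becomes the single letter ḍ.
composed = normalise("d\u0323")
```

Running `unsupported_chars` over a corpus before recognition or training is a quick way to spot characters the model cannot emit.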

Files in this repository

best_accuracy.pdparams – Trained model weights (PaddlePaddle format)
kab_dict.txt – Character dictionary (one character per line)
config.yml – Full training configuration (including image shape, transforms, etc.)
inference.yml – Inference settings (optional, used by some scripts)
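
As an illustration of how the dictionary relates to the model output, the sketch below reads a one-character-per-line file and performs greedy CTC decoding on a sequence of class indices. It assumes the blank token sits at index 0, ahead of the dictionary characters, which is the usual PaddleOCR convention but is not confirmed by this repository. The function names are hypothetical and this is not PaddleOCR's own decoder.

```python
BLANK_INDEX = 0  # assumption: blank token precedes the dictionary characters


def load_dictionary(path):
    """Read one character per line and prepend a placeholder for the blank."""
    with open(path, encoding="utf-8") as handle:
        chars = [line.rstrip("\r\n") for line in handle]
    # Skip empty lines; a line holding a single space is kept as a character.
    return ["<blank>"] + [ch for ch in chars if ch]


def ctc_greedy_decode(indices, table, blank=BLANK_INDEX):
    """Collapse repeated indices, then drop blanks, and map indices to text."""
    out = []
    previous = blank
    for idx in indices:
        if idx != blank and idx != previous:
            out.append(table[idx])
        previous = idx
    return "".join(out)


# Toy table for illustration; with the real file it would hold
# 109 characters + 1 blank = 110 entries.
toy_table = ["<blank>", "a", "z", "ɣ"]
decoded = ctc_greedy_decode([1, 1, 0, 1, 3, 3, 0, 2], toy_table)
```

The `indices` argument stands for the per-timestep argmax of the network output; a blank between two identical indices is what lets the decoder emit a doubled letter.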

How to Use the Model

This is a test model. Do not use it in a production environment.
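
For experimentation, the input geometry listed under Model Details (height 48, width 480) implies that each text-line image has to be brought to that size before recognition. The sketch below assumes the common recognition preprocessing of scaling to the target height while keeping the aspect ratio, capping the width, and padding the remainder on the right. Whether config.yml does exactly this has not been verified, and the function names are hypothetical.

```python
import math

IMAGE_HEIGHT = 48   # from the Model Details table
IMAGE_WIDTH = 480   # from the Model Details table


def resized_width(src_width, src_height,
                  target_height=IMAGE_HEIGHT, max_width=IMAGE_WIDTH):
    """Width after scaling to the target height, capped at the maximum width."""
    if src_width <= 0 or src_height <= 0:
        raise ValueError("image dimensions must be positive")
    scaled = math.ceil(target_height * src_width / src_height)
    return min(max_width, max(1, scaled))


def right_padding(src_width, src_height):
    """Number of padded columns needed to reach the full input width."""
    return IMAGE_WIDTH - resized_width(src_width, src_height)


# A 200x40 crop scales to 240x48 and needs 240 columns of padding.
narrow = (resized_width(200, 40), right_padding(200, 40))
# A very wide crop is squeezed to the 480-pixel cap with no padding.
wide = (resized_width(2000, 40), right_padding(2000, 40))
```

Under this assumption, very long lines get compressed horizontally, which is consistent with the model's 25-character maximum text length.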
