---
license: apache-2.0
language:
- fa
pipeline_tag: image-to-text
widget:
- src: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/papers/attention.png
  example_title: "Persian OCR"
---

# Persian-OCR

**Persian-OCR** is a deep learning model for **Optical Character Recognition (OCR)**, designed specifically for Persian text. The model employs a **CNN + Transformer architecture** trained with **CTC loss** to extract text from images.

The model was trained on a custom dataset of approximately **600,000 synthetic Persian text images**. These images were generated from **Wikipedia text** using **49 different Persian fonts**, with sequence lengths ranging from **0 to 150 characters**. On this dataset, the model achieves a **sequence accuracy of 96%**.

The model may benefit from **further fine-tuning on real-world data**, and contributions or collaborations are **warmly welcomed**.

## 🤝 Contributing

Contributions are welcome! If you have a dataset of real-world Persian text or improvements to the model, please open an issue or submit a pull request.
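If you want to check the model against your own real-world samples before opening an issue, a sequence-level score can be computed along the following lines. This is a minimal sketch that assumes "sequence accuracy" means the predicted string matches the reference exactly; the definition behind the reported 96% is not stated in this card, and `sequence_accuracy` is a hypothetical helper, not part of this repository.

```python
def sequence_accuracy(predictions, references):
    """Fraction of samples whose predicted string equals the reference exactly.

    Assumption: exact string match, with no whitespace or Unicode normalization.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    matches = sum(p == r for p, r in zip(predictions, references))
    return matches / len(references)


# Toy data: the third prediction differs from its reference in the last letter.
preds = ["سلام", "کتاب", "مدرسه"]
refs = ["سلام", "کتاب", "مدرسة"]
print(sequence_accuracy(preds, refs))
```

Exact match is strict: a single wrong character makes the whole sequence count as an error, so a character-level metric may be a useful complement on noisy real-world images.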

## 📬 Contact

For collaboration or inquiries, please reach out via farbodpya@gmail.com

## Files

- `pytorch_model.bin` : PyTorch model weights
- `vocab.json` : Character vocabulary
- `model.py` : Python script defining the CNN + Transformer OCR model
- `utils.py` : Utility functions for OCR, including `ocr_page` and `load_vocab`
- `config.json` : Model configuration

## Installation

```
pip install torch torchvision huggingface_hub
```

## Usage

```
import torch
import json
import sys
import importlib.util
from huggingface_hub import hf_hub_download

# 1️⃣ Load vocab
vocab_path = hf_hub_download("farbodpya/Persian-OCR", "vocab.json")
with open(vocab_path, "r", encoding="utf-8") as f:
    vocab = json.load(f)
idx_to_char = {int(k): v for k, v in vocab["idx_to_char"].items()}

# 2️⃣ Import model.py
model_file = hf_hub_download("farbodpya/Persian-OCR", "model.py")
spec_model = importlib.util.spec_from_file_location("model", model_file)
model_module = importlib.util.module_from_spec(spec_model)
sys.modules["model"] = model_module
spec_model.loader.exec_module(model_module)
from model import CNN_Transformer_OCR

# 3️⃣ Import utils.py
utils_file = hf_hub_download("farbodpya/Persian-OCR", "utils.py")
spec_utils = importlib.util.spec_from_file_location("utils", utils_file)
utils_module = importlib.util.module_from_spec(spec_utils)
sys.modules["utils"] = utils_module
spec_utils.loader.exec_module(utils_module)
from utils import ocr_page

# 4️⃣ Load model weights
weights_path = hf_hub_download("farbodpya/Persian-OCR", "pytorch_model.bin")
model = CNN_Transformer_OCR(num_classes=len(idx_to_char)+1)
model.load_state_dict(torch.load(weights_path, map_location="cpu"))
model.eval()

# 5️⃣ Run OCR on an image
img_path = "sample.png"  # replace with your own image
text = ocr_page(img_path, model, idx_to_char)
print("\n=== Final OCR Page ===\n", text)
```
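
The `+1` in `num_classes=len(idx_to_char)+1` is consistent with a model trained with CTC loss, which needs one extra "blank" class besides the characters. The sketch below illustrates how greedy CTC decoding typically turns per-frame class indices into text: collapse consecutive repeats, then drop blanks. It is only an illustration: the function name `greedy_ctc_decode`, the toy vocabulary, and the choice of index 0 for the blank are assumptions, and the actual decoding inside `ocr_page` in `utils.py` may differ.

```python
def greedy_ctc_decode(indices, idx_to_char, blank=0):
    """Collapse repeated indices, then remove blanks (greedy CTC decoding).

    indices: per-frame argmax class indices, e.g. taken from the model output.
    blank:   index of the CTC blank class (assumed to be 0 here).
    """
    chars = []
    previous = None
    for idx in indices:
        # Emit a character only when the index changes and is not the blank.
        if idx != previous and idx != blank:
            chars.append(idx_to_char[idx])
        previous = idx
    return "".join(chars)


# Hypothetical toy vocabulary and frame-level predictions.
toy_vocab = {1: "س", 2: "ل", 3: "ا", 4: "م"}
frame_predictions = [0, 1, 1, 0, 2, 3, 3, 0, 4, 4]
print(greedy_ctc_decode(frame_predictions, toy_vocab))  # prints سلام
```

The blank also lets the decoder keep genuinely doubled letters: the index sequence `1, 0, 1` decodes to two characters, whereas `1, 1` collapses to one.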