	---
	license: apache-2.0
	language:
	- fa
	pipeline_tag: image-to-text
	widget:
	- src: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/papers/attention.png
	  example_title: "Persian OCR"
	---

	# Persian-OCR

	Persian-OCR is a deep learning model for Optical Character Recognition (OCR), designed specifically for Persian text.
	The model employs a CNN + Transformer architecture trained with CTC loss to extract text from images.
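
To illustrate what CTC training implies at inference time, here is a minimal sketch of greedy CTC decoding: take the most likely class per time step, merge consecutive repeats, then drop the blank symbol. The blank index (`0` here), the function name `ctc_greedy_decode`, and the toy vocabulary are assumptions for illustration only; the repository's own decoding logic may differ.

```python
from typing import Dict, List, Optional, Sequence

BLANK = 0  # assumption: class 0 is the CTC blank; check the repository's code


def ctc_greedy_decode(frame_ids: Sequence[int], idx_to_char: Dict[int, str],
                      blank: int = BLANK) -> str:
    """Collapse per-frame class ids into text using the standard CTC rule."""
    chars: List[str] = []
    previous: Optional[int] = None
    for idx in frame_ids:
        # Emit a character only when the class changes and is not the blank.
        if idx != previous and idx != blank:
            chars.append(idx_to_char[idx])
        previous = idx
    return "".join(chars)


# Toy vocabulary (hypothetical, not the model's real vocab.json).
toy_vocab = {1: "س", 2: "ل", 3: "ا", 4: "م"}
# Per-frame argmax ids: س س <blank> ل ل ا <blank> م
frames = [1, 1, 0, 2, 2, 3, 0, 4]
decoded = ctc_greedy_decode(frames, toy_vocab)
print(decoded)
```

A blank between two identical ids keeps both characters, which is how CTC represents doubled letters.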

	The model was trained on a custom dataset of approximately 600,000 synthetic Persian text images.
	These images were generated from Wikipedia text using 49 different Persian fonts, with sequence lengths ranging from 0 to 150 characters.
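
As a rough illustration of the sampling described above (not the author's actual generation script), the sketch below pairs text snippets of up to 150 characters with a randomly chosen font. The function name `sample_snippets`, the font file names, and the uniform sampling are all assumptions, and rendering each snippet to an image is left out.

```python
import random
from typing import List, Tuple

MAX_LEN = 150  # upper bound on sequence length stated in the model card


def sample_snippets(corpus: str, fonts: List[str], count: int,
                    seed: int = 0) -> List[Tuple[str, str]]:
    """Return (snippet, font) pairs; snippet length is uniform in [0, MAX_LEN]."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(count):
        length = rng.randint(0, min(MAX_LEN, len(corpus)))
        start = rng.randint(0, len(corpus) - length)
        pairs.append((corpus[start:start + length], rng.choice(fonts)))
    return pairs


# Hypothetical inputs: a stand-in corpus and placeholder font names.
corpus = "این یک متن نمونه برای ساخت تصویرهای مصنوعی است. " * 10
fonts = ["font_a.ttf", "font_b.ttf"]
pairs = sample_snippets(corpus, fonts, count=5)
for snippet, font in pairs:
    print(len(snippet), font)
```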

	On this dataset, the model achieves a sequence accuracy of 96%.
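
The card does not define the metric. Assuming "sequence accuracy" means the fraction of images whose predicted text matches the reference exactly, it could be computed as in this sketch. `sequence_accuracy` is a hypothetical helper, and the sample strings are made up.

```python
from typing import Sequence


def sequence_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that equal their reference exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    exact = sum(p == r for p, r in zip(predictions, references))
    return exact / len(references)


# Made-up example: one of four predictions has a single wrong character.
refs = ["سلام", "کتاب", "مدرسه", "ایران"]
preds = ["سلام", "کتاب", "مدرسه", "ایرات"]
accuracy = sequence_accuracy(preds, refs)
print(accuracy)  # 0.75
```

Under this definition a single wrong character makes the whole sequence count as an error, so it is stricter than character-level accuracy.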

	Because the training data is entirely synthetic, the model may benefit from further fine-tuning on real-world data.

	## 🤝 Contributing
	Contributions are welcome! If you have a dataset of real-world Persian text or improvements to the model, please open an issue or submit a pull request.




	## 📬 Contact
	For collaboration or inquiries, please reach out via farbodpya@gmail.com




	## Files

	- `pytorch_model.bin` : PyTorch model weights
	- `vocab.json` : Character vocabulary
	- `model.py` : Python script defining the CNN + Transformer OCR model
	- `utils.py` : Utility functions for OCR, including `ocr_page` and `load_vocab`
	- `config.json` : Model configuration
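
Because `model.py` and `utils.py` are plain scripts rather than an installed package, they have to be imported from their downloaded file paths. A minimal sketch of a reusable helper for that, using only the standard library; the name `load_module_from_path` and the demo file are assumptions for illustration:

```python
import importlib.util
import os
import sys
import tempfile
from types import ModuleType


def load_module_from_path(name: str, path: str) -> ModuleType:
    """Import a Python file from an arbitrary path and register it under `name`."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {name!r} from {path!r}")
    module = importlib.util.module_from_spec(spec)
    # Register in sys.modules so later `import <name>` statements find this module.
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module


# Demo with a throwaway file (hypothetical stand-in for a downloaded script).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_mod.py")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("VALUE = 42\n")
    demo = load_module_from_path("demo_mod", demo_path)

print(demo.VALUE)
```

With a file fetched through `hf_hub_download`, the same helper would be called as `load_module_from_path("model", model_file)`.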

	## Installation
	```
	pip install torch torchvision huggingface_hub
	```


	## Usage
	```

	import torch
	import json
	import sys
	import importlib.util
	from huggingface_hub import hf_hub_download

	# 1️⃣ Load vocab
	vocab_path = hf_hub_download("farbodpya/Persian-OCR", "vocab.json")
	with open(vocab_path, "r", encoding="utf-8") as f:
	    vocab = json.load(f)
	idx_to_char = {int(k): v for k, v in vocab["idx_to_char"].items()}

	# 2️⃣ Import model.py
	model_file = hf_hub_download("farbodpya/Persian-OCR", "model.py")
	spec_model = importlib.util.spec_from_file_location("model", model_file)
	model_module = importlib.util.module_from_spec(spec_model)
	sys.modules["model"] = model_module
	spec_model.loader.exec_module(model_module)
	from model import CNN_Transformer_OCR

	# 3️⃣ Import utils.py
	utils_file = hf_hub_download("farbodpya/Persian-OCR", "utils.py")
	spec_utils = importlib.util.spec_from_file_location("utils", utils_file)
	utils_module = importlib.util.module_from_spec(spec_utils)
	sys.modules["utils"] = utils_module
	spec_utils.loader.exec_module(utils_module)
	from utils import ocr_page

	# 4️⃣ Load model weights
	weights_path = hf_hub_download("farbodpya/Persian-OCR", "pytorch_model.bin")
	model = CNN_Transformer_OCR(num_classes=len(idx_to_char)+1)
	model.load_state_dict(torch.load(weights_path, map_location="cpu"))
	model.eval()

	# 5️⃣ Run OCR on an image
	img_path = "sample.png" # replace with your own image
	text = ocr_page(img_path, model, idx_to_char)
	print("\n=== Final OCR Page ===\n", text)
	```