	---
	language:
	- th
	- en
	metrics:
	- cer
	tags:
	- easyocr
	- image-to-text
	pipeline_tag: image-to-text
	library_name: easyocr
	license: apache-2.0
	---
	# PalOCR Model

	## Introduction

	PalOCR is a CRNN (`None-VGG-BiLSTM-CTC`) model trained by following the EasyOCR guideline, with the sole purpose of getting a better score on the openthaigpt/thai-ocr-evaluation dataset. This narrow scope is due to the limitations of the author's hardware.
	![Model Comparisons](https://github.com/clovaai/deep-text-recognition-benchmark/raw/master/figures/trade-off.png)

	## Training Dataset

	The training set consists of images generated from the openthaigpt/thai-ocr-evaluation dataset using [TextRecognitionDataGenerator](https://github.com/Belval/TextRecognitionDataGenerator).
	It is available at [palocr-datasets](https://huggingface.co/datasets/vorkna/palocr-datasets).

	## How to Use

	To use this model with EasyOCR, download and extract the files, then place `palocr.py` and `palocr.yaml` in the `user_network_directory` (default: `~/.EasyOCR/user_network`) and `palocr.pth` in the model directory (default: `~/.EasyOCR/model`). Once all three files are in place, you can run the model with the following code:

	```python
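	import easyocr

	# recog_network="palocr" selects the custom recognition model placed in the directories above.
	reader = easyocr.Reader(["th", "en"], gpu=True, recog_network="palocr")
	# readtext returns a list of (bounding box, text, confidence) tuples.
	result = reader.readtext('text.jpg')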
	```
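
If EasyOCR cannot find the custom network, a quick check that the three files are where the instructions above put them may help. The sketch below uses only the standard library; the function name `missing_palocr_files` is hypothetical and not part of EasyOCR, and the default paths are the ones stated in this section.

```python
from pathlib import Path

# Default EasyOCR locations as stated in the instructions above.
DEFAULT_USER_NETWORK_DIR = Path.home() / ".EasyOCR" / "user_network"
DEFAULT_MODEL_DIR = Path.home() / ".EasyOCR" / "model"


def missing_palocr_files(user_network_dir=DEFAULT_USER_NETWORK_DIR,
                         model_dir=DEFAULT_MODEL_DIR):
    """Return the expected PalOCR file paths that do not exist yet.

    Hypothetical helper for illustration; it is not an EasyOCR function.
    """
    expected = [
        Path(user_network_dir) / "palocr.py",
        Path(user_network_dir) / "palocr.yaml",
        Path(model_dir) / "palocr.pth",
    ]
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    missing = missing_palocr_files()
    if missing:
        print("Missing files:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All PalOCR files are in place.")
```

If you keep the files somewhere other than the defaults, pass those directories as arguments instead.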

	## Model Performance Comparison

	This section compares PalOCR with other widely used OCR systems, namely EasyOCR and Tesseract. The table below reports the average Character Error Rate (CER) of each system for each document category.


	| Category      |    EasyOCR |     PalOCR |  Tesseract |
	|:--------------|-----------:|-----------:|-----------:|
	| real_document |   0.220217 |   0.960289 |   0.915707 |
	| scene_text    |    0.35865 |     1.0211 |   2.408704 |
	| handwritten   |   0.409302 |    1.01395 |   1.032375 |
	| document      |  0.0871795 |   0.946154 |   0.761595 |
	| document_enth |   0.275449 |   0.916168 |   1.061107 |

	Disclaimer: Although this model was trained on images generated from the evaluation dataset, it was trained on only about 1,000 generated images.

	## Key Insights

	* Character Error Rate (CER): This metric is the character-level edit distance between the predicted text and the reference text, divided by the length of the reference. A lower CER indicates better performance, and values above 1 are possible when the prediction contains many extra characters. In the table above, EasyOCR has the lowest CER in every category, while PalOCR has a lower CER than Tesseract on scene_text, handwritten, and document_enth and a higher CER on real_document and document.
	* Tesseract Limitation: In this comparison, Tesseract was run with a single language at a time. For this benchmark, it was tested using only the Thai language setting, which might have contributed to its higher CER values.
	* The evaluation dataset is the [openthaigpt/thai-ocr-evaluation](https://huggingface.co/datasets/openthaigpt/thai-ocr-evaluation) dataset.
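
To make the CER metric concrete, here is a minimal sketch of a common definition: Levenshtein edit distance divided by the reference length. It is not necessarily the script used to produce the table above, and it assumes no text normalization, so any whitespace or Unicode normalization in the actual evaluation would change the numbers. The function names are illustrative.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: minimum number of character insertions,
    deletions, and substitutions turning `reference` into `hypothesis`."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # delete a reference character
                current[j - 1] + 1,      # insert a hypothesis character
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def cer(reference, hypothesis):
    """Character Error Rate relative to the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    print(cer("abc", "abc"))   # identical strings give 0.0
    print(cer("ab", "abcde"))  # three insertions over two characters give 1.5
```

Python strings are compared per Unicode code point, so Thai vowel and tone marks each count as a separate character in this sketch.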

	## Authors

	- Vorakan Sumethsenee (vorkna@proton.me)