suchut/thaitrocr-base-handwritten-beta2

vision-encoder-decoder

image-text-to-text


suchut committed on Sep 30, 2024

Commit 9bc78f2 · verified · 1 Parent(s): 7bbdabe

Create README.md

Files changed (1)

README.md +61 -0

README.md ADDED

	@@ -0,0 +1,61 @@

+---
+language:
+- th
+- en
+metrics:
+- cer
+tags:
+- trocr
+- image-to-text
+pipeline_tag: image-to-text
+library_name: transformers
+license: apache-2.0
+---
+# Thai-TrOCR Model
+## 🚀 Final Model Available Now!
+**The final version of the Thai-TrOCR model is out!** Check it out here: [huggingface.co/openthaigpt/thai-trocr](https://huggingface.co/openthaigpt/thai-trocr)
+---
+## Introduction
+**Thai-TrOCR** is an Optical Character Recognition (OCR) model fine-tuned to recognize handwritten text in **Thai** and **English**. It follows the TrOCR architecture, pairing a Vision Transformer encoder with an Electra-based text decoder, and takes text-line images in either language as input.
+Thai-TrOCR is designed to be lightweight, which makes it suitable for deployment in resource-constrained environments.
+### Key Features:
+- **Encoder**: TrOCR Base Handwritten
+- **Decoder**: Electra Small (trained on a Thai corpus)
+---
+## Training Dataset
+Thai-TrOCR was trained using the following datasets:
+- `pythainlp/thai-wiki-dataset-v3`
+- `pythainlp/thaigov-corpus`
+- `Salesforce/wikitext`
+---
+## How to Use This Beta Model
+Here’s a quick guide to get started with the Thai-TrOCR model in **PyTorch**:
+```python
+from transformers import TrOCRProcessor, VisionEncoderDecoderModel
+from PIL import Image
+import requests
+# Load processor and model
+processor = TrOCRProcessor.from_pretrained('suchut/thaitrocr-base-handwritten-beta2')
+model = VisionEncoderDecoderModel.from_pretrained('suchut/thaitrocr-base-handwritten-beta2')
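+# Load an image containing a line of text (replace the placeholder with a real URL)
+url = 'your_image_url_here'
+response = requests.get(url, stream=True, timeout=30)
+response.raise_for_status()  # fail early on HTTP errors
+image = Image.open(response.raw).convert("RGB")
+# Preprocess the image, generate token IDs, and decode them to text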
+pixel_values = processor(images=image, return_tensors="pt").pixel_values
+generated_ids = model.generate(pixel_values)
+generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
+print(generated_text)
+```
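
The README metadata lists `cer` (character error rate) as the model's metric but does not show how it is computed. The following is a minimal sketch, not the author's evaluation code, assuming CER is the Levenshtein edit distance between the reference and the predicted text divided by the reference length. The function names `levenshtein` and `cer` are illustrative and do not come from the repository. The sketch counts Unicode code points, so Thai vowel signs and tone marks are treated as separate characters; an evaluation based on grapheme clusters would give different numbers.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions that turn `reference` into `hypothesis`."""
    # prev[j] holds the distance between the processed prefix of
    # `reference` and the first j characters of `hypothesis`.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # delete ref_char
                curr[j - 1] + 1,     # insert hyp_char
                prev[j - 1] + cost,  # substitute or match
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance divided by reference length.
    Assumption: characters are Unicode code points, not grapheme clusters."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)
```

To score a prediction, pass the ground-truth transcription as `reference` and the `generated_text` from the usage example as `hypothesis`. A value of 0.0 means an exact match, and the value can exceed 1.0 when the prediction is much longer than the reference.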