Dhivehi TrOCR Base V6

A fine-tuned TrOCR model for recognizing Dhivehi (Maldivian) text written in Thaana script.

Model Details

  • Base model: microsoft/trocr-base-handwritten
  • Parameters: ~334M
  • Training data: ~695K samples (315K dhivehi-image-text + 380K dhivehi-vrd)
  • Best character error rate (CER): 0.9% (checkpoint-20000)
  • Tokenizer: character-level (WordLevel model with one token per character), with an EOS token
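The card does not say how the CER above was computed. As a reference point, here is a minimal sketch of the usual definition: character-level edit distance divided by the number of reference characters, accumulated over the evaluation set. The function names are illustrative and not part of this repository.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in Unicode code points."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # delete a reference character
                            curr[j - 1] + 1,       # insert a hypothesis character
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]


def cer(references: list[str], hypotheses: list[str]) -> float:
    """Corpus-level CER: total edits divided by total reference characters."""
    edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    chars = sum(len(r) for r in references)
    return edits / chars if chars else 0.0
```

Because Thaana vowel signs (fili) are separate code points, a missing or wrong vowel sign counts as one character error under this definition.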

Usage

from transformers import TrOCRProcessor, VisionEncoderDecoderModel, PreTrainedTokenizerFast
from PIL import Image
import torch

repo_id = "Serialtechlab/dhivehi-trocr-base-handwritten"
processor = TrOCRProcessor.from_pretrained(repo_id)
model = VisionEncoderDecoderModel.from_pretrained(repo_id)
tokenizer = PreTrainedTokenizerFast.from_pretrained(repo_id)
model.eval()

# Load a single text-line image and convert it to RGB for the processor.
image = Image.open("dhivehi_text.png").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values

with torch.no_grad():
    generated_ids = model.generate(pixel_values, max_length=128, num_beams=4)

# Tokens are single characters, so join them directly and drop special tokens.
tokens = tokenizer.convert_ids_to_tokens(generated_ids[0].tolist())
special = {tokenizer.pad_token, tokenizer.bos_token, tokenizer.eos_token, tokenizer.unk_token}
text = "".join(t for t in tokens if t not in special)
print(text)

Training

Fine-tuned from microsoft/trocr-base-handwritten on Google Colab (A100) for 6 epochs with:

  • Learning rate: 4e-5
  • Batch size: 16
  • EOS token appended to every label sequence
  • PAD positions in the labels set to -100 so the loss ignores them
  • Character-level WordLevel tokenizer
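The label handling listed above (EOS appended, padding masked with -100) can be sketched in plain Python. This is an illustration under assumptions, not the training script: the vocabulary, the token ids, and the `build_labels` helper are hypothetical. The value -100 is the default `ignore_index` of PyTorch's cross-entropy loss.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_labels(text, vocab, eos_id, unk_id, max_length):
    """Encode text one character per token, append EOS, pad with IGNORE_INDEX."""
    # Reserve one slot for EOS so that truncated labels still end with it.
    ids = [vocab.get(ch, unk_id) for ch in text][: max_length - 1]
    ids.append(eos_id)
    # Masking by position (not by token id) stays correct even if PAD and EOS share an id.
    return ids + [IGNORE_INDEX] * (max_length - len(ids))


# Hypothetical toy vocabulary; the real one holds Thaana characters.
toy_vocab = {"<pad>": 0, "<eos>": 1, "<unk>": 2, "a": 3, "b": 4}
labels = build_labels("ab", toy_vocab, eos_id=1, unk_id=2, max_length=5)
```

With the EOS token present in every label, the decoder learns when to stop. The masked positions contribute nothing to the loss, so padding length does not affect training.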

Limitations

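  • Optimized for single-line text images; for full pages, run a text-line detector such as Surya first and pass each cropped line to the model
  • May truncate very long lines: the usage example generates at most 128 tokens (max_length=128), which is roughly 128 characters with the character-level tokenizer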
  • Best results on printed Dhivehi text; handwritten accuracy varies by style
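One way to notice the truncation mentioned above is to check whether a generated sequence reached the length limit without producing an EOS token. The helper below is a hypothetical sketch on plain lists of token ids. It assumes a single image per call, so the output carries no batch padding.

```python
def hit_length_limit(token_ids, eos_id, max_length):
    """Return True when generation stopped at max_length without emitting EOS."""
    ids = list(token_ids)
    return len(ids) >= max_length and eos_id not in ids


# Toy sequences with made-up ids: 0 = decoder start, 2 = EOS, 5 = some character.
cut_off = [0] + [5] * 127
finished = [0, 5, 5, 5, 2]
```

In the usage example this would be called with `generated_ids[0].tolist()`, the model's EOS id, and the same `max_length` passed to `generate`. A line flagged this way can be re-run with a larger `max_length` or split into shorter crops.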