larngear-antocr-crnn-th (v40)

Compact CRNN + CTC line recognizer for printed/typed Thai documents with Latin (English / European-accent) co-script. ~33 MB, 254-character charset. Reads one text-line crop → string. Tuned for clean documents (forms, government reports, financial statements, academic calendars/timetables).

This repo is artifacts only: no inference code. It pairs with the recognizer in larngear_AntOCR (the _CRNN architecture + input normalization + CTC decode that these weights are shape-locked to), used in production by larngear-docling.
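For readers unfamiliar with the CTC decode step mentioned above, here is a minimal sketch of greedy (best-path) decoding: take the top class index per time step, merge consecutive repeats, then drop blanks. It is not the larngear_AntOCR implementation. The function name `ctc_greedy_decode` is hypothetical, and placing the blank at index 0 (so class index i maps to chars[i - 1]) is an assumption that this card does not confirm.

```python
from itertools import groupby

BLANK = 0  # ASSUMPTION: blank is class 0; the real index is set by larngear_AntOCR


def ctc_greedy_decode(frame_ids, chars):
    """Best-path CTC decode for one line.

    frame_ids: per-time-step argmax class indices (ints).
    chars:     the charset list from classes.json.
    """
    # Step 1: merge runs of identical consecutive indices.
    collapsed = [idx for idx, _ in groupby(frame_ids)]
    # Step 2: drop blanks and map the remaining indices to characters.
    # With the blank at 0, class index i corresponds to chars[i - 1].
    return "".join(chars[idx - 1] for idx in collapsed if idx != BLANK)


demo_chars = ["a", "b", "c"]
# A blank between the two runs of 1 keeps the doubled "a".
demo_text = ctc_greedy_decode([0, 1, 1, 0, 1, 2, 2, 0, 3], demo_chars)
```

The blank between repeated labels is what lets CTC emit doubled characters; without it, the two runs of class 1 would merge into a single "a".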

Files

File | What
best.pt | CRNN weights: plain state_dict, load with torch.load(path, weights_only=True)
classes.json | charset, {"chars": [...]}: 254 chars (Thai + Latin + accents + digits + punctuation)

The two are a versioned pair: len(chars) + 1 (the +1 is the CTC blank) is the model's output dimension. A mismatched pair decodes to garbage, so always pull both from the same revision.
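The pairing rule above can be checked before loading anything into the recognizer. The sketch below uses only the standard library. `expected_output_dim` and `check_pair` are hypothetical helpers, not part of larngear_AntOCR. The caller supplies the checkpoint's output width, for example the first dimension of the final classifier weight in the state_dict. The key name for that tensor depends on the _CRNN definition and is not given in this card.

```python
import json


def expected_output_dim(classes_path):
    """Return len(chars) + 1, where the +1 is the CTC blank."""
    with open(classes_path, encoding="utf-8") as f:
        chars = json.load(f)["chars"]
    return len(chars) + 1


def check_pair(classes_path, head_out_features):
    """Raise if the charset and the checkpoint's output width disagree.

    head_out_features: output dimension read from the checkpoint
    (how to locate it in the state_dict is model-specific).
    """
    expected = expected_output_dim(classes_path)
    if head_out_features != expected:
        raise ValueError(
            f"charset implies {expected} outputs, checkpoint has "
            f"{head_out_features}; pull both files from the same revision"
        )
    return expected
```

For the files described here, a 254-character charset implies an output dimension of 255.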

Accuracy

Benchmark | CER
AntOCR real-PDF line bench (3502 lines) | ~4.3%
larngear-docling control corpus, region-aligned text CER (~20 pages, born-digital Thai PDFs) | 2.34%
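As a reminder of what the CER column measures, the sketch below computes character error rate in its usual form: total Levenshtein edit distance divided by total reference length. It rests on assumptions, since the card does not state the benchmarks' exact protocol. Characters are counted as Unicode code points, so Thai vowel and tone marks each count separately. No text normalization is applied, and the rate is pooled over the corpus rather than averaged per line. The function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (insert/delete/substitute = 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # delete r
                cur[j - 1] + 1,          # insert h
                prev[j - 1] + (r != h),  # substitute or match
            ))
        prev = cur
    return prev[-1]


def cer(refs, hyps):
    """Pooled CER: sum of edit distances over sum of reference lengths."""
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return errors / total_chars
```

Pooling weights long lines more heavily than short ones. A per-line average would give a different number on the same data, so the two are not directly comparable.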

Out of scope (not trained for it): scene text, handwriting.

Usage

from huggingface_hub import hf_hub_download
from antocr.core import CRNNLineRecognizer   # from larngear_AntOCR

repo = "jsaksrisuwan/larngear_antocr_weight"
weights = hf_hub_download(repo, "best.pt", revision="main")
classes = hf_hub_download(repo, "classes.json", revision="main")

rec = CRNNLineRecognizer(weights=weights, classes=classes)
texts = rec.recognize_batch([line_crop0, line_crop1, ...])  # grayscale np.uint8 crops

Line segmentation (finding the line crops) is the consumer's job: larngear-docling does layout detection and line cropping before calling this. Pin revision to a tag or commit SHA for reproducible deploys.
