larngear-antocr-crnn-th (v40)

Compact CRNN + CTC line recognizer for printed/typed Thai documents with Latin (English / European-accent) co-script. ~33 MB, 254-character charset. Reads one text-line crop → string. Tuned for clean documents (forms, government reports, financial statements, academic calendars/timetables).

This repo contains artifacts only, with no inference code. The weights are shape-locked to the recognizer in larngear_AntOCR (the _CRNN architecture, its input normalization, and its CTC decode), which larngear-docling uses in production.
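
For orientation, the CTC decode step can be illustrated with generic greedy decoding: collapse consecutive repeats, then drop the blank. This is a minimal sketch of the standard technique, not the code in larngear_AntOCR. The function name `ctc_greedy_decode`, the blank sitting at index 0, and class `i` mapping to `chars[i - 1]` are all assumptions made for illustration; the real index layout is defined by the recognizer.

```python
from itertools import groupby


def ctc_greedy_decode(indices, chars, blank=0):
    """Turn per-timestep argmax class indices into a string.

    ASSUMPTION: class 0 is the CTC blank and class i maps to chars[i - 1].
    The actual layout used by larngear_AntOCR may differ.
    """
    out = []
    # groupby merges runs of identical consecutive indices into one item.
    for idx, _run in groupby(indices):
        if idx != blank:
            out.append(chars[idx - 1])
    return "".join(out)


# A blank between two identical classes keeps both characters.
print(ctc_greedy_decode([0, 1, 1, 0, 1, 2, 2, 0], ["a", "b"]))  # aab
```

Under this layout, a charset that does not match the weights shifts or truncates the index-to-character mapping, which is why a mismatched pair produces unusable text.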

Files

File	What
`best.pt`	CRNN weights: a plain `state_dict`; load with `torch.load(path, weights_only=True)`
`classes.json`	charset, `{"chars": [...]}` — 254 chars (Thai + Latin + accents + digits + punctuation)

The two files are a versioned pair: `len(chars) + 1` (the +1 is the CTC blank) is the model's output dimension. A mismatched pair decodes to garbage, so always pull both from the same revision.
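
The pairing rule can be turned into a fail-fast check at load time. The sketch below is a hypothetical helper, not part of larngear_AntOCR: `load_charset` and `check_pair` are illustrative names, and `model_out_dim` is assumed to be read by the caller from the final layer of the loaded `state_dict` (the key name of that layer depends on the `_CRNN` definition and is not stated here).

```python
import json


def load_charset(classes_path):
    """Read the character list from a classes.json of the form {"chars": [...]}."""
    with open(classes_path, encoding="utf-8") as f:
        return json.load(f)["chars"]


def check_pair(chars, model_out_dim):
    """Raise if the charset and the model output dimension do not belong together.

    model_out_dim is supplied by the caller (ASSUMPTION: taken from the
    shape of the model's final layer).
    """
    expected = len(chars) + 1  # +1 for the CTC blank
    if model_out_dim != expected:
        raise ValueError(
            f"charset/weights mismatch: model outputs {model_out_dim} classes, "
            f"but classes.json implies {expected}"
        )
    return expected
```

For the files in this repo, a 254-character charset implies an output dimension of 255.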

Accuracy

Benchmark	CER
AntOCR real-PDF line bench (3502 lines)	~4.3%
`larngear-docling` control corpus — region-aligned text CER (~20 pages, born-digital Thai PDFs)	2.34%
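
For readers reproducing these numbers, character error rate is conventionally the edit distance between prediction and reference divided by the reference length. The sketch below shows that conventional definition, pooled over a set of lines. It is an assumption that the benchmarks above aggregate this way; the exact normalization and text preprocessing they use are not specified here, and the function names are illustrative.

```python
def levenshtein(ref, hyp):
    """Edit distance (insert, delete, substitute) between two strings."""
    prev = list(range(len(hyp) + 1))
    for i, rc in enumerate(ref, 1):
        cur = [i]
        for j, hc in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,               # delete rc
                cur[j - 1] + 1,            # insert hc
                prev[j - 1] + (rc != hc),  # substitute, or match at cost 0
            ))
        prev = cur
    return prev[-1]


def cer(pairs):
    """Pooled CER over (reference, hypothesis) pairs: total edits / total reference chars."""
    edits = sum(levenshtein(ref, hyp) for ref, hyp in pairs)
    total = sum(len(ref) for ref, _ in pairs)
    return edits / total if total else 0.0
```

Python strings are indexed by code point, so Thai vowel signs and tone marks each count as one character in this measure.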

Out of scope (not trained for it): scene text, handwriting.

Usage

from huggingface_hub import hf_hub_download
from antocr.core import CRNNLineRecognizer   # from larngear_AntOCR

repo = "jsaksrisuwan/larngear_antocr_weight"
weights = hf_hub_download(repo, "best.pt", revision="main")
classes = hf_hub_download(repo, "classes.json", revision="main")

rec = CRNNLineRecognizer(weights=weights, classes=classes)
texts = rec.recognize_batch([line_crop0, line_crop1, ...])  # grayscale np.uint8 crops

Line segmentation (finding the line crops) is the consumer's job — larngear-docling does layout detection + line-crop before calling this. Pin revision to a tag or commit SHA for reproducible deploys.
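
One way to keep the pair in sync is to route both downloads through a single helper that takes one revision. This is a sketch under stated assumptions: `download_pair` and its `fetch` parameter are hypothetical names introduced here, and the revision string is a placeholder to be replaced with a real tag or commit SHA from the repo. By default it calls `hf_hub_download` from `huggingface_hub`, as in the snippet above.

```python
PAIR_FILES = ("best.pt", "classes.json")


def download_pair(repo_id, revision, fetch=None):
    """Download weights and charset from the same revision.

    fetch is injectable for testing; by default it is
    huggingface_hub.hf_hub_download, imported lazily.
    """
    if fetch is None:
        from huggingface_hub import hf_hub_download as fetch
    # The same revision is passed for every file, so the pair cannot drift apart.
    return {
        name: fetch(repo_id=repo_id, filename=name, revision=revision)
        for name in PAIR_FILES
    }


# Usage (placeholder revision; substitute a real tag or commit SHA):
# paths = download_pair("jsaksrisuwan/larngear_antocr_weight", "<tag-or-commit-sha>")
# rec = CRNNLineRecognizer(weights=paths["best.pt"], classes=paths["classes.json"])
```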
