# Opla - Greek POS Tagger and Dependency Parser
GPU-optimized Greek POS tagger and dependency parser. 215x faster than gr-nlp-toolkit on real-world Greek text, with identical POS output and near-identical dependency parses.
Supports Modern Greek (el), Ancient Greek (grc), and Medieval Greek (med).
Source code: github.com/ciscoriordan/opla
## Weights
| File | Language | Size | Description |
|---|---|---|---|
| `weights/grc/opla_grc.pt` | Ancient Greek | 632 MB | PyTorch checkpoint (joint POS+DP on Ancient-Greek-BERT) |
| `weights/grc/onnx/opla_joint.onnx` | Ancient Greek | 535 MB | ONNX model for CPU deployment (with `.data` and `meta.json`) |
| `weights/med/opla_med.pt` | Medieval Greek | ~632 MB | PyTorch checkpoint (joint POS+DP on Ancient-Greek-BERT) |
Modern Greek weights are loaded directly from AUEB-NLP/gr-nlp-toolkit.
## Ancient Greek accuracy
Dev set accuracy on combined Perseus + PROIEL + Gorman treebanks (1.1M tokens):
| Metric | Accuracy |
|---|---|
| UPOS | 96.8% |
| DEPREL | 91.8% |
Training data:
- UD_Ancient_Greek-Perseus (203K tokens)
- UD_Ancient_Greek-PROIEL (214K tokens)
- Gorman Greek Dependency Trees (692K tokens)
- DiGreC (103K tokens, fine-tuning)
## Usage
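A minimal usage sketch. The `Opla()` entry point is referenced in the ONNX inference section below, but the `lang` argument and the token attributes shown here are illustrative assumptions, not a documented API.

```python
# Hypothetical usage sketch -- Opla() appears elsewhere in this README,
# but the "lang" argument and token attributes are assumptions.
from opla import Opla

nlp = Opla(lang="grc")                            # "el", "grc", or "med"
doc = nlp("μῆνιν ἄειδε θεὰ Πηληϊάδεω Ἀχιλῆος")    # tag and parse a sentence
for tok in doc:
    print(tok.text, tok.upos, tok.head, tok.deprel)
```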
## Architecture

The `grc` and `med` models use a single Ancient-Greek-BERT backbone with jointly trained POS and DP heads, so each batch requires only one BERT forward pass. The `el` model uses dual GreekBERT backbones (inherited from gr-nlp-toolkit).
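The joint-head design above can be sketched as follows: one shared encoder output feeds both a POS classifier and dependency-parsing heads. All dimensions, head designs, and names here are illustrative assumptions, not Opla's actual implementation.

```python
# Illustrative sketch of joint POS + DP heads over one shared encoder
# output. Dimensions and head designs are assumptions, not Opla's code.
import torch
import torch.nn as nn

class JointHeads(nn.Module):
    def __init__(self, hidden=768, n_upos=17, n_deprel=40):
        super().__init__()
        self.pos_head = nn.Linear(hidden, n_upos)     # per-token UPOS logits
        self.arc_proj = nn.Linear(hidden, hidden)     # simplified bilinear arc scorer
        self.rel_head = nn.Linear(hidden, n_deprel)   # per-token DEPREL logits

    def forward(self, enc):
        # enc: (batch, seq, hidden) -- output of a single BERT forward pass,
        # shared by all three heads.
        pos_logits = self.pos_head(enc)                        # (batch, seq, n_upos)
        arc_scores = enc @ self.arc_proj(enc).transpose(1, 2)  # (batch, seq, seq)
        rel_logits = self.rel_head(enc)                        # (batch, seq, n_deprel)
        return pos_logits, arc_scores, rel_logits

enc = torch.randn(2, 5, 768)            # stand-in for shared BERT output
pos, arc, rel = JointHeads()(enc)
print(pos.shape, arc.shape, rel.shape)
```

Because the heads share one encoder, the per-batch cost is dominated by a single BERT pass rather than one pass per task.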
## ONNX inference

The ONNX model provides CPU-only deployment without requiring PyTorch. Install `onnxruntime` and pass `checkpoint="onnx"` to `Opla()`.
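A hedged sketch of the deployment path just described: `checkpoint="onnx"` comes from the text above, while the rest of the call is an assumed API shape.

```python
# Hypothetical ONNX deployment sketch -- checkpoint="onnx" is stated in
# this README; the surrounding API shape is an assumption.
from opla import Opla

nlp = Opla(lang="grc", checkpoint="onnx")   # CPU inference via onnxruntime
doc = nlp("ἐν ἀρχῇ ἦν ὁ λόγος")
```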
## Citation
## License
MIT