---
language:
  - grc
  - el
license: mit
tags:
  - pos-tagging
  - dependency-parsing
  - ancient-greek
  - greek
  - nlp
  - onnx
library_name: transformers
pipeline_tag: token-classification
---

Opla - Greek POS Tagger and Dependency Parser

Opla is a GPU-optimized part-of-speech (POS) tagger and dependency parser for Greek. It runs 215x faster than gr-nlp-toolkit on real-world Greek text while producing identical POS output and near-identical dependency parses.

Supports Modern Greek (el), Ancient Greek (grc), and Medieval Greek (med).

Source code: github.com/ciscoriordan/opla

Weights

| File | Language | Size | Description |
|------|----------|------|-------------|
| weights/grc/opla_grc.pt | Ancient Greek | 632 MB | PyTorch checkpoint (joint POS+DP on Ancient-Greek-BERT) |
| weights/grc/onnx/opla_joint.onnx | Ancient Greek | 535 MB | ONNX model for CPU deployment (with .data and meta.json) |
| weights/med/opla_med.pt | Medieval Greek | ~632 MB | PyTorch checkpoint (joint POS+DP on Ancient-Greek-BERT) |

Modern Greek weights are loaded directly from AUEB-NLP/gr-nlp-toolkit.
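
To fetch a single file from the listing above, a minimal sketch using `hf_hub_download` from huggingface_hub might look like this. The repository id `ciscoriordan/opla` is an assumption inferred from the source-code link, and `WEIGHT_FILES`, `weight_file`, and `fetch_weights` are hypothetical helper names, not part of Opla.

```python
# File paths are copied from the weights listing.
# REPO_ID is an assumption, not confirmed by this README.
REPO_ID = "ciscoriordan/opla"

WEIGHT_FILES = {
    ("grc", "pt"): "weights/grc/opla_grc.pt",
    ("grc", "onnx"): "weights/grc/onnx/opla_joint.onnx",
    ("med", "pt"): "weights/med/opla_med.pt",
}


def weight_file(lang: str, fmt: str = "pt") -> str:
    """Return the in-repo path for a language/format pair."""
    try:
        return WEIGHT_FILES[(lang, fmt)]
    except KeyError:
        raise ValueError(f"no {fmt!r} weights listed for language {lang!r}") from None


def fetch_weights(lang: str, fmt: str = "pt") -> str:
    """Download one file into the local Hugging Face cache and return its path."""
    # Imported lazily so the lookup helper works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=weight_file(lang, fmt))
```

Calling `fetch_weights("grc")` would download the Ancient Greek PyTorch checkpoint. The ONNX model's companion `.data` and `meta.json` files would have to be fetched separately; their exact file names are not listed here.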

Ancient Greek accuracy

Dev set accuracy on combined Perseus + PROIEL + Gorman treebanks (1.1M tokens):

| Metric | Accuracy |
|--------|----------|
| UPOS | 96.8% |
| DEPREL | 91.8% |
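
As a rough illustration of what these figures measure, the sketch below computes plain token-level accuracy, assuming each metric is the share of dev-set tokens whose predicted label matches the gold label. The README does not state the exact evaluation procedure, `token_accuracy` is a hypothetical helper, and the labels are toy data rather than treebank output.

```python
def token_accuracy(gold, predicted):
    """Fraction of tokens whose predicted label equals the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot score an empty sequence")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Toy example: one of four UPOS tags is wrong.
gold_upos = ["DET", "NOUN", "VERB", "ADJ"]
pred_upos = ["DET", "NOUN", "VERB", "NOUN"]
print(token_accuracy(gold_upos, pred_upos))  # 0.75
```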

Training data:

Usage

Architecture

The grc and med models use a single Ancient-Greek-BERT backbone with jointly trained POS and dependency-parsing (DP) heads, so each batch needs only one BERT forward pass. The el model uses dual GreekBERT backbones (inherited from gr-nlp-toolkit).
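
The difference between the two layouts can be sketched in plain Python. This is a toy illustration, not Opla's code: `CountingEncoder` stands in for a BERT backbone and only counts how often it runs, the heads are trivial label pickers, and the dual layout assumes one backbone per task. All class and function names are hypothetical.

```python
class CountingEncoder:
    """Stand-in for a BERT backbone; records how many forward passes it ran."""

    def __init__(self):
        self.calls = 0

    def __call__(self, batch):
        self.calls += 1
        # One toy feature vector per token: (token length, bias term).
        return [[(float(len(tok)), 1.0) for tok in sent] for sent in batch]


def make_head(labels):
    """Toy classification head: maps each feature vector to one label."""
    def head(features):
        return [[labels[int(sum(vec)) % len(labels)] for vec in sent]
                for sent in features]
    return head


class JointTagger:
    """Shared backbone: encode once, then feed both heads."""

    def __init__(self, encoder, pos_head, dep_head):
        self.encoder, self.pos_head, self.dep_head = encoder, pos_head, dep_head

    def __call__(self, batch):
        features = self.encoder(batch)  # the only backbone pass for this batch
        return self.pos_head(features), self.dep_head(features)


class DualTagger:
    """Separate backbones (assumed one per task): two passes per batch."""

    def __init__(self, pos_encoder, dep_encoder, pos_head, dep_head):
        self.pos_encoder, self.dep_encoder = pos_encoder, dep_encoder
        self.pos_head, self.dep_head = pos_head, dep_head

    def __call__(self, batch):
        return (self.pos_head(self.pos_encoder(batch)),
                self.dep_head(self.dep_encoder(batch)))


pos_head = make_head(["NOUN", "VERB", "ADJ"])
dep_head = make_head(["nsubj", "obj", "root"])
batch = [["μῆνιν", "ἄειδε", "θεά"]]

shared = CountingEncoder()
joint = JointTagger(shared, pos_head, dep_head)
joint(batch)

enc_a, enc_b = CountingEncoder(), CountingEncoder()
dual = DualTagger(enc_a, enc_b, pos_head, dep_head)
dual(batch)

print(shared.calls, enc_a.calls + enc_b.calls)
```

In this sketch the joint layout runs the encoder once per batch and the dual layout runs two encoders once each, which is where the shared backbone saves work.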

ONNX inference

The ONNX model provides CPU-only deployment without requiring PyTorch. Install onnxruntime and pass checkpoint="onnx" to Opla().
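
To inspect the exported model outside of Opla, a minimal sketch using onnxruntime directly could look like the following. This is an alternative to the `Opla()` route described above, not a replacement for it. `open_cpu_session` is a hypothetical helper, and because the tensor names are not documented here, the sketch only lists the inputs and outputs the model declares rather than assuming them.

```python
from pathlib import Path


def open_cpu_session(model_path):
    """Open an ONNX model on CPU and report its declared inputs and outputs."""
    path = Path(model_path)
    if not path.is_file():
        raise FileNotFoundError(f"ONNX model not found: {path}")

    # Imported lazily so the path check works without onnxruntime installed.
    import onnxruntime as ort

    session = ort.InferenceSession(str(path), providers=["CPUExecutionProvider"])
    inputs = [(i.name, i.shape, i.type) for i in session.get_inputs()]
    outputs = [(o.name, o.shape, o.type) for o in session.get_outputs()]
    return session, inputs, outputs
```

Passing the local path of `opla_joint.onnx` returns the session together with the input and output signatures. The accompanying `.data` file presumably needs to sit next to the model so that onnxruntime can resolve the external weights.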

Citation

License

MIT