# Dilemma Model Weights

Character-level transformer for Greek lemmatization, used as the neural fallback in the Dilemma lemmatizer.
## Model Details
- Architecture: Encoder-decoder transformer (character-level)
- Parameters: 4.2M
- d_model: 256
- Attention heads: 4
- Layers: 3 (encoder and decoder)
- Feed-forward dim: 512
- Vocabulary: 381 characters (Greek polytonic + special tokens)
- Training data: 3.4M form-lemma pairs from Wiktionary inflection tables
- Multi-task heads: POS tagging (10 tags), nominal inflection (45 labels), verbal inflection (69 labels)
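As a sanity check, the listed hyperparameters roughly reproduce the stated parameter count. This is a back-of-envelope estimate assuming untied input/output embeddings and biased linear layers, not a breakdown taken from the checkpoint:

```python
# Estimate total parameters from the hyperparameters above.
# Assumptions (not from the checkpoint): untied embeddings, biased
# linear layers, standard pre/post-LayerNorm transformer layers.
d, ffn, vocab = 256, 512, 381
attn = 4 * (d * d + d)                    # Q, K, V, output projections
ff = (d * ffn + ffn) + (ffn * d + d)      # two-layer feed-forward
ln = 2 * d                                # one LayerNorm (gamma + beta)
enc_layer = attn + ff + 2 * ln
dec_layer = 2 * attn + ff + 3 * ln        # self-attention + cross-attention
heads = (d * 10 + 10) + (d * 45 + 45) + (d * 69 + 69)  # POS, nominal, verbal
total = 2 * (vocab * d) + vocab + 3 * enc_layer + 3 * dec_layer + heads
print(f"{total / 1e6:.1f}M")  # 4.2M, matching the stated size
```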
## Files

| File | Size | Description |
|---|---|---|
| `model.pt` | ~16 MB | PyTorch checkpoint (weights + config) |
| `encoder.onnx` | ~7 MB | ONNX encoder for lightweight inference |
| `decoder_step.onnx` | ~10 MB | ONNX decoder for lightweight inference |
| `vocab.json` | ~9 KB | Character vocabulary (char2id / id2char mappings) |
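The card does not specify the layout of `vocab.json` beyond the char2id / id2char mappings; a minimal round-trip sketch with a toy three-character stand-in (the real file covers 381 polytonic Greek characters, and the exact key layout is an assumption):

```python
# Toy stand-in for vocab.json: JSON object keys are strings,
# so id2char is assumed to be keyed by stringified ids.
vocab = {
    "char2id": {"π": 0, "ο": 1, "ι": 2},
    "id2char": {"0": "π", "1": "ο", "2": "ι"},
}

def encode(text, char2id):
    """Map each character to its integer id."""
    return [char2id[c] for c in text]

def decode(ids, id2char):
    """Map ids back to characters and join."""
    return "".join(id2char[str(i)] for i in ids)

ids = encode("ποι", vocab["char2id"])
print(ids)                           # [0, 1, 2]
print(decode(ids, vocab["id2char"])) # ποι
```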
## Usage
This model is used automatically by the Dilemma library as a fallback for forms not found in the 12.3M-entry lookup table or resolved by rule-based morphological analysis. Only about 5% of Greek words reach the transformer.
```shell
pip install dilemma
```
```python
from dilemma import Dilemma

d = Dilemma()
# The transformer is invoked automatically when needed
d.lemmatize("ἐποιήσαντο", lang="grc")  # -> ποιέω
```
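The three-tier fallback order described above (lookup table, then rules, then transformer) can be sketched as follows. The function and variable names here are illustrative stand-ins, not the library's internal API:

```python
def lemmatize(form, lookup, analyze_rules, transformer):
    # 1. The 12.3M-entry lookup table handles the vast majority of forms.
    if form in lookup:
        return lookup[form]
    # 2. Rule-based morphological analysis covers most of the rest.
    lemma = analyze_rules(form)
    if lemma is not None:
        return lemma
    # 3. Only ~5% of words fall through to the neural fallback.
    return transformer(form)

# Toy demonstration with stand-in components:
lookup = {"λόγου": "λόγος"}
rules = lambda f: None            # pretend the rules fail
net = lambda f: "ποιέω"           # pretend the transformer answers
print(lemmatize("λόγου", lookup, rules, net))       # λόγος (lookup hit)
print(lemmatize("ἐποιήσαντο", lookup, rules, net))  # ποιέω (neural fallback)
```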
## ONNX vs PyTorch
For inference, ONNX Runtime (~50 MB install) and PyTorch (~2 GB install) produce identical results. The ONNX files are provided for environments where a lighter dependency footprint is preferred; PyTorch is needed only for training.
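The encoder/decoder split into two ONNX files suggests the usual pattern: run the encoder once, then call the one-step decoder autoregressively. A minimal greedy-decoding sketch; `encode` and `decode_step` are stand-in callables (with ONNX Runtime they would wrap `InferenceSession.run` calls on the two exported graphs, whose exact input names are not documented here):

```python
import numpy as np

def greedy_decode(encode, decode_step, src_ids, bos_id, eos_id, max_len=64):
    """Greedy character-level decoding with a split encoder and
    one-step decoder (sketch, mirroring encoder.onnx + decoder_step.onnx)."""
    memory = encode(src_ids)                  # run the encoder once
    out = [bos_id]
    for _ in range(max_len):
        logits = decode_step(np.array(out), memory)  # next-char logits
        next_id = int(np.argmax(logits))
        if next_id == eos_id:
            break
        out.append(next_id)
    return out[1:]                            # strip the BOS token
```

In the real pipeline the returned ids would be mapped back to characters via `vocab.json`.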
## Training

Trained from scratch in minutes on a single GPU using the `train.py` script in the Dilemma repository:

```shell
python train.py
python export_onnx.py  # Export to ONNX format
```
## License
MIT