Talmud Punctuator - Model B (Steinsaltz)
Fine-tuned BEREL 3.0 for predicting punctuation in Talmudic Aramaic/Hebrew text.
Model B reflects the Steinsaltz/William Davidson Edition punctuation style, trained on 80,537 sentences across 36 masekhtot of the Babylonian Talmud.
Training details
- Base model: BEREL 3.0 (dicta-il/BEREL_3.0)
- Head: Linear classification (768 → 7 labels)
- Data: 80,537 sentences, 1,828,618 tokens
- Epochs: 5, Batch size: 32, LR: 2e-05
- Final loss: 0.2028
- Training time: 147 minutes
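
For a rough sense of scale, the figures above can be turned into a few derived quantities. This is a back-of-the-envelope sketch, not taken from the training script: it assumes one optimizer step per batch (no gradient accumulation), that the final partial batch is kept, and that all 80,537 sentences were used for training.

```python
import math

# Figures reported in the training details.
sentences = 80_537
tokens = 1_828_618
epochs = 5
batch_size = 32
training_minutes = 147

# Assumption: one optimizer step per batch, last partial batch kept.
steps_per_epoch = math.ceil(sentences / batch_size)
total_steps = steps_per_epoch * epochs

# Average sentence length in tokens, and wall-clock time per epoch.
avg_tokens_per_sentence = tokens / sentences
minutes_per_epoch = training_minutes / epochs

print(steps_per_epoch)                    # 2517
print(total_steps)                        # 12585
print(round(avg_tokens_per_sentence, 1))  # 22.7
print(minutes_per_epoch)                  # 29.4
```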
Labels
| Label | Meaning |
|---|---|
| `O` | No punctuation |
| `,` | Comma |
| `.` | Period |
| `:` | Colon |
| `;` | Semicolon |
| `?` | Question mark |
| `!` | Exclamation mark |
| `—` | Em-dash |
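
To illustrate how per-word labels map back to punctuated text, here is a minimal sketch. The helper `apply_labels` is hypothetical and is not part of punctuator.py. It assumes the model's predictions have already been aggregated to one label string per word, that `O` leaves the word unchanged, and that an em-dash is set off by a space while the other marks attach directly to the word. The words are placeholders, not real Talmudic text.

```python
# Hypothetical helper for illustration; not the punctuator.py implementation.
NO_PUNCT = "O"
EM_DASH = "\u2014"
MARKS = {",", ".", ":", ";", "?", "!", EM_DASH}


def apply_labels(words, labels):
    """Attach each predicted mark to the word it follows."""
    if len(words) != len(labels):
        raise ValueError("words and labels must have the same length")
    out = []
    for word, label in zip(words, labels):
        if label == NO_PUNCT:
            out.append(word)
        elif label in MARKS:
            # Assumption: the em-dash gets a leading space, other marks do not.
            sep = " " if label == EM_DASH else ""
            out.append(word + sep + label)
        else:
            raise ValueError(f"unknown label: {label!r}")
    return " ".join(out)


words = ["w1", "w2", "w3", "w4"]
labels = ["O", ",", "O", "."]
print(apply_labels(words, labels))  # w1 w2, w3 w4.
```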
Usage
Use with the punctuator.py script from mivami.