jbmurel committed
Commit bc75444 · verified · 1 Parent(s): e870209

Update bert-base card

Files changed (1): README.md (+3 -1)

README.md CHANGED
@@ -6,7 +6,9 @@ pipeline_tag: fill-mask
 ---
 # Logion base model
 
-BERT-based model pretrained on largest set of pre-modern Greek to-date (70+ million words). It was introduced in this [paper](https://aclanthology.org/2023.alp-1.20/). This model ignores cases and accents/diacritics.
+BERT-based model pretrained on largest set of pre-modern Greek to-date. It was introduced in this [paper](https://aclanthology.org/2023.alp-1.20/).
+
+The model uses a WordPiece tokenizer (vocab size of 50,000) on a corpus of over 70 million words (over 95 million tokens) of premodern Greek. This model ignores cases and accents/diacritics.
 
 ## How to use
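The updated card states that the model ignores case and accents/diacritics. A minimal sketch of that kind of normalization in Python, using only the standard library; this is an illustration of the general technique (Unicode decomposition, combining-mark removal, lowercasing), not the card's actual preprocessing code:

```python
import unicodedata

def normalize_greek(text: str) -> str:
    """Lowercase and strip accents/diacritics, mirroring the card's note
    that the model ignores case and diacritics. Illustrative only."""
    # Decompose characters so accents become separate combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks (accents, breathings, iota subscripts, ...).
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()

print(normalize_greek("Λόγος"))  # -> λογος
```

Text normalized this way collapses accented and unaccented forms of the same word onto one token, which is consistent with the case- and diacritic-insensitive behavior the card describes.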