DBBErt: A BERT-based Language Model for Byzantine Greek
Model Description
DBBErt is a transformer-based language model fine-tuned for Byzantine Greek, trained on data from the Database of Byzantine Book Epigrams (DBBE).
It supports tasks such as:
- Part-of-speech tagging
- Morphological analysis
- Lemmatization
The model is designed to process both Greek texts from critical editions and unedited medieval Greek texts, which are characterised by:
- Non-standard orthography
- Dialectal and diachronic variation
- Manuscript-based transcription conventions
How to Use
Example with 🤗 Transformers:
from transformers import AutoTokenizer, AutoModelForTokenClassification
tokenizer = AutoTokenizer.from_pretrained("colinswaelens/DBBErt")
model = AutoModelForTokenClassification.from_pretrained("colinswaelens/DBBErt")
text = "ἐν τοῖς βιβλίοις"
tokens = tokenizer(text, return_tensors="pt")
outputs = model(**tokens)
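The call above returns raw logits, one score vector per sub-token, so a decoding step is still needed to obtain one tag per word. The following is a minimal sketch of that step in plain Python, not the authors' own pipeline. The function name `decode_word_tags`, the toy logits, and the three labels are hypothetical and only illustrate the shape of the data. With the real model, `word_ids` would likely come from `tokens.word_ids(0)` (fast tokenizers only), `logits` from `outputs.logits[0].tolist()`, and `id2label` from `model.config.id2label`; the sketch also assumes the word list produced by whitespace splitting lines up with the tokenizer's word indices.

```python
from typing import Dict, List, Optional, Tuple


def decode_word_tags(
    words: List[str],
    word_ids: List[Optional[int]],
    logits: List[List[float]],
    id2label: Dict[int, str],
) -> List[Tuple[str, str]]:
    """Return one (word, label) pair per word, using each word's first sub-token."""
    tagged: List[Tuple[str, str]] = []
    seen = set()
    for position, word_id in enumerate(word_ids):
        # Special tokens have no word id; later sub-tokens of a word are skipped.
        if word_id is None or word_id in seen:
            continue
        seen.add(word_id)
        scores = logits[position]
        # Index of the highest score at this position (argmax).
        best = max(range(len(scores)), key=scores.__getitem__)
        tagged.append((words[word_id], id2label[best]))
    return tagged


# Toy inputs (hypothetical values, not real model output).
words = "ἐν τοῖς βιβλίοις".split()
word_ids = [None, 0, 1, 2, 2, None]  # special token, three words, special token
logits = [
    [0.0, 0.0, 0.0],
    [2.1, 0.3, -1.0],
    [0.2, 1.9, 0.1],
    [-0.5, 0.4, 3.2],
    [0.9, 0.1, 0.0],  # second sub-token of the last word, ignored
    [0.0, 0.0, 0.0],
]
id2label = {0: "ADP", 1: "DET", 2: "NOUN"}  # hypothetical tagset

result = decode_word_tags(words, word_ids, logits, id2label)
print(result)
```

Taking the first sub-token's prediction is one common convention for token classification; averaging the scores over all sub-tokens of a word is an alternative if the model was trained that way.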
Citation
If you use DBBErt in your research, please cite:
@article{swaelens_lre,
author = {Swaelens, Colin and De Vos, Ilse and Lefever, Els},
title = {Linguistic annotation of Byzantine book epigrams},
journal = {Language Resources and Evaluation},
year = {2025},
volume = {59},
number = {1},
pages = {109--134},
doi = {10.1007/s10579-023-09703-x},
url = {https://doi.org/10.1007/s10579-023-09703-x}
}