---
license: cc-by-4.0
language:
- el
pipeline_tag: token-classification
library_name: stanza
tags:
- Greek dialect
---

# 🏛️ UD Greek-Lesbian Dialect Treebank

The **Lesbos dialect** belongs to the **Northern Modern Greek** group, characterized by distinctive phonological features collectively known as **Northern vocalism**.
This dataset is the **first Universal Dependencies treebank** for a Northern Greek dialect and offers insight into the structure and vitality of Lesbian Greek.

---

## 🗣️ Linguistic Background

The **Lesbos dialect** exhibits the following hallmark phonological features:

- **Raising of unstressed mid vowels** /e/ → [i] and /o/ → [u], e.g.,
  - *πιδί* [piˈði] instead of SMG *παιδί* [peˈði] “child”
  - *κάτου* [ˈkatu] instead of SMG *κάτω* [ˈkato] “down”
- **Deletion of unstressed high vowels** /i/, /u/, e.g.,
  - *φίδ* [ˈfið] instead of SMG *φίδι* [ˈfiði] “snake”
  - *βνό* [ˈvno] instead of SMG *βουνό* [vuˈno] “mountain”
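
The two rules above can be illustrated with a toy rewrite over syllabified broad transcriptions. This is only a sketch under stated assumptions: the function name `northern_vocalism`, the `(syllable, stressed)` input format, and the way stranded consonants are reattached are all hypothetical choices made for illustration. It also assumes that deletion targets only underlying high vowels, so vowels produced by raising survive, as the *πιδί* example suggests.

```python
HIGH = {"i", "u"}              # vowels deleted when unstressed
RAISE = {"e": "i", "o": "u"}   # mid vowels raised when unstressed
VOWELS = set("aeiou")


def northern_vocalism(syllables):
    """Map an SMG-like form to a Northern-style form (toy model).

    `syllables` is a list of (text, stressed) pairs, e.g. [("pe", False), ("ði", True)].
    Returns a string with the stress mark placed before the stressed syllable.
    """
    # Step 1 (assumed ordering): delete underlying unstressed high vowels.
    step1 = []
    for text, stressed in syllables:
        if not stressed:
            text = "".join(ch for ch in text if ch not in HIGH)
        step1.append((text, stressed))

    # Step 2: raise unstressed mid vowels; the new [i]/[u] are not deleted.
    step2 = []
    for text, stressed in step1:
        if not stressed:
            text = "".join(RAISE.get(ch, ch) for ch in text)
        step2.append((text, stressed))

    # Step 3 (illustrative simplification): a syllable left without a vowel
    # joins the following syllable, or the preceding one at the end of the word.
    merged = []
    carry = ""
    for text, stressed in step2:
        if any(ch in VOWELS for ch in text):
            merged.append([carry + text, stressed])
            carry = ""
        else:
            carry += text
    if carry:
        if merged:
            merged[-1][0] += carry
        else:
            merged.append([carry, False])

    return "".join(("ˈ" if stressed else "") + text for text, stressed in merged)


if __name__ == "__main__":
    print(northern_vocalism([("pe", False), ("ði", True)]))  # παιδί
    print(northern_vocalism([("ka", True), ("to", False)]))  # κάτω
    print(northern_vocalism([("fi", True), ("ði", False)]))  # φίδι
    print(northern_vocalism([("vu", False), ("no", True)]))  # βουνό
```

Run on the four SMG forms listed above, the sketch reproduces the four dialectal transcriptions given in the examples. It is not a model of the dialect's full phonology.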

These features distinguish Lesbian Greek from southern dialects, including **Standard Modern Greek (SMG)**.
Historically, the dialect has been shaped by extensive contact with **Venetian** (1355–1462) and **Turkish** (1462–1912), resulting in numerous **loanwords** and **morphological borrowings**.
Unlike many Modern Greek dialects, **Lesbian Greek remains a living variety**, actively spoken across the island.

---

## 📚 Dataset Summary

This resource contains:

- **270 sentences**
- **3,603 tokens**
- Annotation following **Universal Dependencies v2**

The annotation covers lemmas, morphological features, and syntactic dependencies, in line with the **UD schema**.
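
UD treebanks are distributed in the CoNLL-U format, so the size figures above can be checked with a few lines of Python. The following is a minimal sketch, not the authors' tooling: the helper name `count_conllu` and the sample text are made up for illustration, and the sample's annotation columns are left as `_` placeholders rather than real treebank annotations.

```python
def count_conllu(text):
    """Return (sentences, words) for a CoNLL-U formatted string.

    Sentences are blocks separated by blank lines. Only lines whose ID is a
    plain integer are counted, so multiword-token ranges (e.g. "1-2") and
    empty nodes (e.g. "1.1") are skipped.
    """
    sentences = 0
    words = 0
    in_sentence = False
    for line in text.splitlines():
        if not line.strip():
            in_sentence = False      # a blank line ends the current sentence
            continue
        if line.startswith("#"):
            continue                 # sentence-level comment / metadata
        if not in_sentence:
            sentences += 1
            in_sentence = True
        token_id = line.split("\t", 1)[0]
        if token_id.isdigit():
            words += 1
    return sentences, words


# Illustrative sample only; the ten tab-separated columns are mostly "_".
SAMPLE = (
    "# text = πιδί κάτου\n"
    "1\tπιδί\t_\t_\t_\t_\t_\t_\t_\t_\n"
    "2\tκάτου\t_\t_\t_\t_\t_\t_\t_\t_\n"
    "\n"
    "# text = φίδ\n"
    "1\tφίδ\t_\t_\t_\t_\t_\t_\t_\t_\n"
    "\n"
)

if __name__ == "__main__":
    print(count_conllu(SAMPLE))
```

To apply it to the treebank, read the `.conllu` file as UTF-8 text and pass the contents to `count_conllu`. Note that the function counts syntactic words; if the treebank uses multiword tokens, that number can differ from a surface token count.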

---

## 📊 Model Performance

| **Metric** | **Accuracy (%)** |
|-------------|----------------:|
| UPOS | 89.62 |
| UFeats | 71.86 |
| AllTags | 70.22 |
| Lemmas | 68.03 |
| UAS | 82.51 |
| LAS | 68.85 |
| CLAS | 56.81 |
| MLAS | 39.44 |
| BLEX | 38.50 |
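
For readers unfamiliar with the dependency metrics in the table, UAS is the share of words that receive the correct head, and LAS is the share that receive both the correct head and the correct dependency label. The sketch below shows that definition under a simplifying assumption: gold and predicted analyses are tokenized identically. The function name `attachment_scores` and the toy analyses are illustrative, and this card does not say which evaluation script produced the reported figures.

```python
def attachment_scores(gold, pred):
    """Return (UAS, LAS) as percentages.

    `gold` and `pred` are equal-length lists of (head, deprel) tuples,
    one entry per word, in the same order.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must cover the same words")
    if not gold:
        raise ValueError("cannot score an empty analysis")
    n = len(gold)
    # UAS: the head index matches, whatever the label.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS: head index and dependency label both match.
    labeled_hits = sum(g == p for g, p in zip(gold, pred))
    return 100.0 * head_hits / n, 100.0 * labeled_hits / n


if __name__ == "__main__":
    # Toy three-word sentence: the third word has the right head, wrong label.
    gold = [(2, "nsubj"), (0, "root"), (2, "obj")]
    pred = [(2, "nsubj"), (0, "root"), (2, "nmod")]
    uas, las = attachment_scores(gold, pred)
    print(f"UAS={uas:.2f} LAS={las:.2f}")
```

The gap between UAS and LAS in the table has the same source as in the toy case: words attached to the right head but given the wrong relation label.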

---

## Citation

To cite this work or read more about the training pipeline, see:

Stavros Bompolas, Stella Markantonatou, Angela Ralli, and Antonios Anastasopoulos (2025).
Crossing Dialectal Boundaries: Building a Treebank for the Dialect of Lesbos through Knowledge Transfer from Standard Modern Greek.
In Proceedings of the Eighth Workshop on Universal Dependencies (UDW, SyntaxFest 2025), Ljubljana, Slovenia. Association for Computational Linguistics.