---
title: Lesbian Greek Morphosyntactic Parser
emoji: 🔍
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.35.0
app_file: app.py
pinned: false
license: cc-by-4.0
---

# Lesbian Greek Morphosyntactic Parser

A Hugging Face Space for parsing dialectal Greek text from the island of Lesbos using the Lesbian Greek Morphosyntactic Model developed by Bompolas et al. (2025).

## Overview

This interactive parser provides morphosyntactic analysis for the Lesbian dialect of Greek, offering:

- **Part-of-speech tagging**
- **Morphological analysis**
- **Dependency parsing**
- **Lemmatization**
- **CoNLL-U format output**

## Model Details

The underlying model is based on:

- **Stanza v1.7.0+** as the base pipeline
- **Greek BERT** ([nlpaueb/bert-base-greek-uncased-v1](https://huggingface.co/nlpaueb/bert-base-greek-uncased-v1)) for enhanced representations
- **UD_Greek-Lesbian treebank** for training (540 sentences)

### Training Data Sources

**Oral Data** (collected 2023-2024):

- Speakers from Agra, Chidira, Eressos, Pterounta, Mesotopos, and Parakoila villages on Lesbos

**Written Sources**:

- Papanis, D. & Papanis, G. D. (2004). *Lexiko tou Agiasotikou Glosikou Idiomatos*
- Tsokarou-Mitsioni, E. (1998). *Palies Istories ap' tn Agiasiou*
- Tsokarou-Mitsioni, E. (2019). *Prosfygiá*
- Anagnostopoulou, M. A. (2021). *Thematiko Lexiko tis Lesviakis Dialektou*
- Anagnostou, V. T. (2014). *Tsi sta th'ka mas: Komodia sta k'stariot'ka*

## Features

### 📊 **CoNLL-U Output**

Standard Universal Dependencies format for interoperability with linguistic tools

### 📈 **Interactive Data Table**

Browse parsed tokens with all linguistic features (POS, morphology, dependencies)

### 🔗 **Dependency Visualization**

Text-based visualization showing syntactic relationships between words

### 🏛️ **Dialectal Specialization**

Optimized specifically for the Lesbian dialect of Greek

## Usage

1. Enter your Lesbian Greek text in the input field
2. Click "Parse Lesbian Greek Text" or press Enter
3. View results in three formats:
   - Raw CoNLL-U output (copyable)
   - Interactive data table
   - Dependency structure visualization

## Example Texts

The interface includes example texts based on the dialectal sources used in training:

- `Το παιδί κάθεται στο σπίτι.`
- `Η μάνα μαγειρεύει στην κουζίνα.`
- `Το νερό τρέχει απ' τη βρύση.`
- `Οι παππούδες λένε παλιές ιστορίες.`

## Limitations

- **Experimental model**: The training data is limited (540 sentences)
- **Domain-specific**: Optimized for dialectal content similar to the training sources
- **Research purposes**: Further fine-tuning is needed for production use

## Citation

If you use this tool or the underlying model, please cite:

```bibtex
@inproceedings{bompolas2025crossing,
  title={Crossing Dialectal Boundaries: Building a Treebank for the Dialect of Lesbos through Knowledge Transfer from Standard Modern Greek},
  author={Bompolas, Stavros and Markantonatou, Stella and Ralli, Angela and Anastasopoulos, Antonios},
  booktitle={Proceedings of the 8th Universal Dependencies Workshop (UDW, SyntaxFest 2025)},
  year={2025},
  publisher={Association for Computational Linguistics}
}
```

## Related Resources

- 🤗 [Lesbian Greek Morphosyntactic Model](https://huggingface.co/sbompolas/Lesbian-Greek-Morphosyntactic-Model)
- 📚 [UD_Greek-Lesbian Treebank](https://github.com/UniversalDependencies/UD_Greek-Lesbian)
- 🔧 [Stanza Documentation](https://stanfordnlp.github.io/stanza/)
- 🇬🇷 [Greek BERT Model](https://huggingface.co/nlpaueb/bert-base-greek-uncased-v1)

## Technical Details

### Dependencies

- `gradio>=4.0.0` - Web interface
- `stanza>=1.7.0` - NLP pipeline
- `pandas>=1.5.0` - Data handling
- `torch>=1.9.0` - Neural network backend
- `transformers>=4.20.0` - BERT integration

### File Structure

```
├── app.py              # Main Gradio application
├── requirements.txt    # Python dependencies
└── README.md           # This documentation
```

## Development

To run locally:

```bash
# Replace the placeholders with this Space's repository URL and directory name
git clone <repository-url>
cd <repository-directory>
pip install -r requirements.txt
python app.py
```

## Support

For issues related to:

- **The model**: Contact the original authors or open an issue on the model repository
- **This Space**: Open an issue in the Space's discussion tab
- **Stanza**: Refer to the [Stanza documentation](https://stanfordnlp.github.io/stanza/)

## License

Please refer to the original model's license terms and the individual component licenses (Stanza, Greek BERT, etc.).
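
## Working with the CoNLL-U Output

The raw CoNLL-U output can be copied out of the Space and processed with other tools. The following is a minimal sketch of one way to read it using only the Python standard library; it is not the code used in `app.py`. The helper names `parse_conllu` and `dependency_lines` are hypothetical, the arrow format is an illustration rather than the Space's own visualization, and the sample annotation is written by hand for a shortened version of the first example text, not produced by the model.

```python
# The 10 tab-separated columns defined by the CoNLL-U format.
CONLLU_FIELDS = ["id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc"]


def parse_conllu(text):
    """Split CoNLL-U text into sentences, each a list of token dicts."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            # Sentence-level comments such as "# text = ..." are skipped.
            continue
        columns = line.split("\t")
        if len(columns) != len(CONLLU_FIELDS):
            raise ValueError(f"expected 10 columns, got {len(columns)}: {line!r}")
        token = dict(zip(CONLLU_FIELDS, columns))
        # Keep syntactic words only: skip multiword-token ranges ("4-5")
        # and empty nodes ("2.1"), whose IDs are not plain integers.
        if token["id"].isdigit():
            current.append(token)
    if current:
        sentences.append(current)
    return sentences


def dependency_lines(sentence):
    """Render each token as 'dependent <-relation- head' (hypothetical format)."""
    forms = {tok["id"]: tok["form"] for tok in sentence}
    forms["0"] = "ROOT"  # HEAD 0 marks the root of the sentence
    return [f'{tok["form"]} <-{tok["deprel"]}- {forms[tok["head"]]}'
            for tok in sentence]


# Hand-written illustrative annotation, NOT output from the model.
SAMPLE = "\n".join([
    "# text = Το παιδί κάθεται.",
    "1\tΤο\tο\tDET\t_\t_\t2\tdet\t_\t_",
    "2\tπαιδί\tπαιδί\tNOUN\t_\t_\t3\tnsubj\t_\t_",
    "3\tκάθεται\tκάθομαι\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No",
    "4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_",
]) + "\n"

if __name__ == "__main__":
    for sentence in parse_conllu(SAMPLE):
        for line in dependency_lines(sentence):
            print(line)
```

Skipping multiword-token ranges keeps one row per syntactic word, which matters for Greek contractions such as `στο` when they are split during tokenization. If the surface form of such tokens is needed, the range lines can be kept instead.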