eloukas committed on
Commit · 5ddcf57
1 Parent(s): f8e8efa
Update README with COLING 2025 paper and missing references
Add the primary COLING 2025 publication with BibTeX citation,
the Greeklish-to-Greek (LREC-COLING 2024) reference, and
improve formatting with proper markdown links and bold labels.
README.md
CHANGED
|
@@ -4,19 +4,61 @@ language:
|
|
| 4 |
- el
|
| 5 |
pipeline_tag: token-classification
|
| 6 |
---
|
| 7 |
-
This repository contains the models used in the gr-nlp-toolkit
|
| 8 |
-
|
| 9 |
The toolkit includes the following models, all designed specifically for processing the Greek language:
|
| 10 |
-
- Named Entity Recognition (NER): Identifies and classifies named entities in Greek text,
|
| 11 |
such as names of people, organizations, and locations.
|
| 12 |
-
- Dependency Parsing (DP): Analyzes the grammatical structure of Greek sentences by identifying relationships between words and their dependencies.
|
| 13 |
-
- Part Of Speech tagging (POS): Tags each word in Greek text with its corresponding part of speech (e.g., noun, verb, adjective), along with its morphological features.
|
| 14 |
|
| 15 |
**Note:**
|
| 16 |
These models cannot be used as standalone tools; they are integrated into the gr-nlp-toolkit and can only be utilized through it.
|
| 17 |
|
| 18 |
-
|
| 19 |
|
| 20 |
-
|
| 21 |
|
| 22 |
-
|
|
|
|
| 4 |
- el
|
| 5 |
pipeline_tag: token-classification
|
| 6 |
---
|
| 7 |
+
This repository contains the models used in the [gr-nlp-toolkit](https://github.com/nlpaueb/gr-nlp-toolkit) project.
|
| 8 |
+
|
| 9 |
The toolkit includes the following models, all designed specifically for processing the Greek language:
|
| 10 |
+
- **Named Entity Recognition (NER)**: Identifies and classifies named entities in Greek text,
|
| 11 |
such as names of people, organizations, and locations.
|
| 12 |
+
- **Dependency Parsing (DP)**: Analyzes the grammatical structure of Greek sentences by identifying relationships between words and their dependencies.
|
| 13 |
+
- **Part Of Speech tagging (POS)**: Tags each word in Greek text with its corresponding part of speech (e.g., noun, verb, adjective), along with its morphological features.
|
| 14 |
|
| 15 |
**Note:**
|
| 16 |
These models cannot be used as standalone tools; they are integrated into the gr-nlp-toolkit and can only be utilized through it.
|
| 17 |
|
| 18 |
+
## Paper
|
| 19 |
+
|
| 20 |
+
The software was presented as a paper at **COLING 2025**.
|
| 21 |
+
Read the full technical report/paper here: [https://aclanthology.org/2025.coling-demos.17/](https://aclanthology.org/2025.coling-demos.17/)
|
| 22 |
+
|
| 23 |
+
If you use our toolkit, please cite it:
|
| 24 |
+
```bibtex
|
| 25 |
+
@inproceedings{loukas-etal-coling2025-greek-nlp-toolkit,
|
| 26 |
+
title = "{GR}-{NLP}-{TOOLKIT}: An Open-Source {NLP} Toolkit for {M}odern {G}reek",
|
| 27 |
+
author = "Loukas, Lefteris and
|
| 28 |
+
Smyrnioudis, Nikolaos and
|
| 29 |
+
Dikonomaki, Chrysa and
|
| 30 |
+
Barbakos, Spiros and
|
| 31 |
+
Toumazatos, Anastasios and
|
| 32 |
+
Koutsikakis, John and
|
| 33 |
+
Kyriakakis, Manolis and
|
| 34 |
+
Georgiou, Mary and
|
| 35 |
+
Vassos, Stavros and
|
| 36 |
+
Pavlopoulos, John and
|
| 37 |
+
Androutsopoulos, Ion",
|
| 38 |
+
editor = "Rambow, Owen and
|
| 39 |
+
Wanner, Leo and
|
| 40 |
+
Apidianaki, Marianna and
|
| 41 |
+
Al-Khalifa, Hend and
|
| 42 |
+
Eugenio, Barbara Di and
|
| 43 |
+
Schockaert, Steven and
|
| 44 |
+
Mather, Brodie and
|
| 45 |
+
Dras, Mark",
|
| 46 |
+
booktitle = "Proceedings of the 31st International Conference on Computational Linguistics: System Demonstrations",
|
| 47 |
+
month = jan,
|
| 48 |
+
year = "2025",
|
| 49 |
+
address = "Abu Dhabi, UAE",
|
| 50 |
+
publisher = "Association for Computational Linguistics",
|
| 51 |
+
url = "https://aclanthology.org/2025.coling-demos.17/",
|
| 52 |
+
pages = "174--182",
|
| 53 |
+
}
|
| 54 |
+
```
|
| 55 |
+
|
| 56 |
+
## References
|
| 57 |
+
|
| 58 |
+
While many methodology details are shared in the [GR-NLP-TOOLKIT paper publication @ COLING 2025 (see above)](https://aclanthology.org/2025.coling-demos.17/), additional research details can be found here:
|
| 59 |
+
|
| 60 |
+
1. C. Dikonimaki, "A Transformer-based natural language processing toolkit for Greek -- Part of speech tagging and dependency parsing", BSc thesis, Department of Informatics, Athens University of Economics and Business, 2021. http://nlp.cs.aueb.gr/theses/dikonimaki_bsc_thesis.pdf *(POS/DP/Morphological tagging processor)*
|
| 61 |
|
| 62 |
+
2. N. Smyrnioudis, "A Transformer-based natural language processing toolkit for Greek -- Named entity recognition and multi-task learning", BSc thesis, Department of Informatics, Athens University of Economics and Business, 2021. http://nlp.cs.aueb.gr/theses/smyrnioudis_bsc_thesis.pdf *(NER processor)*
|
| 63 |
|
| 64 |
+
3. A. Toumazatos, J. Pavlopoulos, I. Androutsopoulos, & S. Vassos, "Still All Greeklish to Me: Greeklish to Greek Transliteration." In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) (pp. 15309-15319). https://aclanthology.org/2024.lrec-main.1330/ *(Greeklish-to-Greek processor)*
|