---
title: README
emoji: 👀
colorFrom: blue
colorTo: yellow
sdk: static
pinned: false
---

# **GlossAPI**

GlossAPI is a project by [GFOSS – Open Technologies Alliance](https://gfoss.eu), focused on building foundational infrastructure for Greek Natural Language Processing. Our work centers on the **creation of high-quality, open-access datasets** and the development of a robust, modular **processing pipeline** tailored for academic and domain-specific documents.

We aim to lay the groundwork for **open, collaborative, and reproducible NLP research** in the Greek language, supporting researchers, students, and developers in the digital humanities, computational linguistics, and AI communities.

Our pipeline covers every stage of document processing, from **automated downloading and text extraction** to **section segmentation, classification**, and **annotation**. It supports documents in multiple formats and includes dedicated tools for Greek-language content, preserving structure and metadata throughout.

GlossAPI contributes to the long-term vision of a sustainable, open ecosystem for Greek NLP by:
- Publishing open-source tools and datasets under permissive licenses
- Promoting interoperability and data transparency
- Encouraging community contributions and reuse

📂 All datasets are released under **Creative Commons licenses**, and our source code is publicly available on [GitHub](https://github.com/eellak/glossapi).

📬 Contact: glossapi.team@eellak.gr