MrBERT-nos-gl-POS: Part-of-Speech Tagging for Galician

MrBERT-nos-gl-POS is a fine-tuned version of MrBERT-nos-gl for morphosyntactic part-of-speech (POS) tagging in Galician. It was developed as part of Proxecto Nós, an initiative to build language technology for the Galician language.

Model Details

Property     Value
Base model   proxectonos/MrBERT-nos-gl
Task         Token classification (POS tagging)
Language     Galician (gl)
License      Apache 2.0
Tagset       EAGLES (FreeLing/CTAG-compatible morphosyntactic tags)

Tagset

The model uses the EAGLES morphosyntactic tagset, standard for Iberian Romance languages. Tags are positional strings where each character encodes a grammatical attribute. The main categories are:

Code  Category      Example tag  Decoded
A     Adjective     AQ0MS0       Qualitative, masculine, singular
C     Conjunction   CC / CS      Coordinating / Subordinating
D     Determiner    DA0MS0       Article, masculine, singular
F     Punctuation   Fp / Fc      Period / Comma
I     Interjection  I            Interjection
N     Noun          NCMS0        Common, masculine, singular
P     Pronoun       PP3MS        Personal, 3rd person, masc., sing.
R     Adverb        RG / RN      General / Negative
S     Adposition    SPS00        Preposition, simple
V     Verb          VMIP3S0      Main, indicative, present, 3rd sing.
W     Date/Time     W            Temporal expression
Z     Numeral       Z            Number or quantity

Reading a tag (example from the output):

VIS3S00 = V (verb) + I (indicative) + S (past/preterite) + 3 (3rd person) + S (singular) + 00 (unspecified gender/unused). Unlike VMIP3S0 in the table, this output tag carries no separate verb-type character.

Tags use 0 to mark attributes that are not applicable or unspecified for a given form.
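The positional reading described above can be sketched as a small lookup. This is an illustration only: `LAYOUTS`, `VALUES`, and `decode_tag` are hypothetical names, the position layouts are inferred from the example tags in the table, and only values that appear in this document are mapped. The labels the model actually emits may follow a different or reduced layout.

```python
# Position names per category, inferred from the example tags in the table.
# Assumption: the model's real label set may use a different layout.
LAYOUTS = {
    "V": ("type", "mood", "tense", "person", "number", "gender"),
    "N": ("type", "gender", "number"),
    "A": ("type", "degree", "gender", "number"),
}

# Only values that appear in this document are mapped; others pass through.
VALUES = {
    "type": {"M": "main", "C": "common", "Q": "qualitative"},
    "mood": {"I": "indicative"},
    "tense": {"P": "present", "S": "past"},
    "person": {"3": "3rd"},
    "number": {"S": "singular"},
    "gender": {"M": "masculine"},
}


def decode_tag(tag: str) -> dict:
    """Map each character of a positional tag to a named attribute."""
    layout = LAYOUTS.get(tag[:1])
    if layout is None:
        # Category without a known layout: return the tag unchanged.
        return {"category": tag[:1], "raw": tag}
    decoded = {"category": tag[0]}
    for name, char in zip(layout, tag[1:]):
        if char == "0":
            decoded[name] = None  # not applicable or unspecified
        else:
            decoded[name] = VALUES.get(name, {}).get(char, char)
    return decoded


print(decode_tag("VMIP3S0"))
print(decode_tag("NCMS0"))
```

Characters beyond the known layout are ignored, and unmapped values are returned as their raw character rather than guessed.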

Usage

Installation

pip install transformers torch

Quick start

from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
model_id = "proxectonos/MrBERT-nos-gl-POS"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

# "simple" aggregation groups adjacent tokens that share a tag into one result.
pos_tagger = pipeline(
    "token-classification",
    model=model,
    tokenizer=tokenizer,
    aggregation_strategy="simple",
)

text = "O gato negro durmiu tranquilamente sobre o sofá vermello."
results = pos_tagger(text)

# Print each group with its tag and confidence.
for result in results:
    print(
        f"{result['word']:<20} [{result['entity_group']:<10}] {result['score']*100:.1f}%"
    )

Example output

Enter text for POS tagging: O gato negro durmiu tranquilamente sobre o sofá vermello.

O                    [GMS       ]  99.4% 
gato                 [NCMS0     ]  99.9% 
negro                [A0MS      ]  88.7% 
durmiu               [VIS3S00   ]  95.3% 
tranquilamente       [R0        ]  99.8% 
sobre                [S         ] 100.0% 
o                    [GMS       ]  99.7% 
sofá                 [NCMS0     ]  92.1% 
vermello             [A0MS      ]  85.5% 
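
The confidence scores in this output can be used to flag tokens for manual review. The sketch below works on results in the shape the pipeline returns (`word`, `entity_group`, `score`), with values copied from the example output above. The helper name `low_confidence` and the 0.90 threshold are assumptions chosen for illustration, not recommendations from the model authors.

```python
# Results in the shape returned by the quick-start pipeline; values are
# copied from the example output above.
results = [
    {"word": "O", "entity_group": "GMS", "score": 0.994},
    {"word": "gato", "entity_group": "NCMS0", "score": 0.999},
    {"word": "negro", "entity_group": "A0MS", "score": 0.887},
    {"word": "durmiu", "entity_group": "VIS3S00", "score": 0.953},
    {"word": "tranquilamente", "entity_group": "R0", "score": 0.998},
    {"word": "sobre", "entity_group": "S", "score": 1.000},
    {"word": "o", "entity_group": "GMS", "score": 0.997},
    {"word": "sofá", "entity_group": "NCMS0", "score": 0.921},
    {"word": "vermello", "entity_group": "A0MS", "score": 0.855},
]


def low_confidence(results, threshold=0.90):
    """Return the entries whose score falls below the threshold."""
    # The default threshold is a hypothetical value, not a tuned one.
    return [r for r in results if r["score"] < threshold]


for r in low_confidence(results):
    print(f"{r['word']:<20} {r['entity_group']:<10} {r['score']*100:.1f}%")
```

With this threshold, only the two adjectives (negro and vermello) are flagged.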

Interactive CLI (optional)

For interactive exploration from the command line:

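# Assumes pos_tagger from the quick start is already defined.
while True:
    try:
        text = input("Enter text for POS tagging: ").strip()
    except (EOFError, KeyboardInterrupt):
        break  # exit cleanly on Ctrl-D / Ctrl-C
    if text.lower() in {"quit", "exit", "q"}:
        break
    if not text:
        continue  # ignore empty input
    results = pos_tagger(text)
    for r in results:
        # Draw a 20-character confidence bar next to each tagged group.
        bar = "█" * int(r["score"] * 20)
        print(f"  • {r['word']:<20} [{r['entity_group']:<10}] {r['score']*100:5.1f}%  {bar}")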
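
For non-interactive use, the same loop can be wrapped in a function that accepts any iterable of lines, such as an open file. This is a minimal sketch: `tag_lines` and `stub_tagger` are hypothetical names, and the stub only stands in for the pipeline so the example runs without downloading the model. In practice the `pos_tagger` object from the quick start would be passed instead.

```python
from typing import Callable, Iterable, Iterator, List, Tuple


def tag_lines(
    tagger: Callable[[str], List[dict]], lines: Iterable[str]
) -> Iterator[Tuple[str, List[dict]]]:
    """Yield (text, results) for each non-empty line."""
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines instead of sending them to the model
        yield text, tagger(text)


def stub_tagger(text: str) -> List[dict]:
    # Stand-in for pos_tagger: tags every whitespace-separated word as "X".
    return [{"word": w, "entity_group": "X", "score": 1.0} for w in text.split()]


for text, results in tag_lines(stub_tagger, ["O gato negro", "", "  durmiu  "]):
    print(text, "->", [(r["word"], r["entity_group"]) for r in results])
```

Because `tag_lines` is a generator, lines are tagged one at a time as the caller iterates, so a large file does not need to be loaded into memory first.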

Acknowledgements

This work is funded by the Ministerio para la Transformación Digital y de la Función Pública and by the European Union (NextGenerationEU), within the framework of the project Desarrollo de Modelos ALIA. (This publication of the Desarrollo de Modelos ALIA project is funded by the Ministerio para la Transformación Digital y de la Función Pública and by the Plan de Recuperación, Transformación y Resiliencia, funded by the European Union, NextGenerationEU.)

Citation

@misc{proxectenos2026MrBERT-nos-gl-pos,
  author       = {{Proxecto Nós}},
  title        = {{MrBERT-nos-gl-POS}: Part-of-Speech Tagging for Galician},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/proxectonos/MrBERT-nos-gl-POS}},
}