mmBERT NER model for Slavic languages

The train / eval / test splits were concatenated from all languages, in the order given on the command line:
sl, hr, sr, bs, mk, sq, cs, bg, pl, ru, sk, uk
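The concatenation above can be sketched as follows. This is a minimal, self-contained illustration: the `load_split` helper and its placeholder records are assumptions standing in for the actual per-language NER data loading, not the model's real pipeline.

```python
# Language codes in the order stated in the model card.
LANGS = ["sl", "hr", "sr", "bs", "mk", "sq", "cs", "bg", "pl", "ru", "sk", "uk"]

def load_split(lang, split):
    # Placeholder: the real pipeline would return token/BIO-tag examples
    # for one language and one split (train / eval / test).
    return [{"lang": lang, "split": split}]

def build_split(split):
    # Concatenate all languages in the fixed command-line order.
    examples = []
    for lang in LANGS:
        examples.extend(load_split(lang, split))
    return examples

train = build_split("train")
```

The same `build_split` call would be repeated for the eval and test splits, preserving the same language order in each.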

We used the following hyper-parameters:

  • PyTorch's AdamW optimizer with a learning rate of 2e-5
  • batch size of 32
  • 30 epochs (preliminary runs showed the best F1-scores between epochs 15 and 35)
  • F1-score as the criterion for best-model selection and for monitoring training progression.
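The hyper-parameters and the F1-based model selection above can be summarized in a small sketch. The `select_best` helper and the example F1 values are hypothetical, added here only to illustrate keeping the checkpoint with the highest dev F1 across epochs.

```python
# Training configuration as stated in the model card.
config = {
    "optimizer": "AdamW",        # torch.optim.AdamW
    "learning_rate": 2e-5,
    "batch_size": 32,
    "epochs": 30,
    "selection_metric": "f1",    # used for best-model selection
}

def select_best(epoch_f1):
    # epoch_f1: mapping from epoch number to dev-set F1-score.
    # Return the epoch with the highest F1 and its score.
    best_epoch = max(epoch_f1, key=epoch_f1.get)
    return best_epoch, epoch_f1[best_epoch]
```

In practice this selection is what frameworks such as Hugging Face `Trainer` do with `load_best_model_at_end=True` and `metric_for_best_model="f1"`, assuming that is how the model was trained.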

Based on "Analysis of Transfer Learning for Named Entity Recognition in South-Slavic Languages" (Ivačič et al., BSNLP 2023).

Model size: 0.3B params (Safetensors, F32 tensors)