Model Description

Meertje is a dialect classifier developed at the Meertens Institute to distinguish dialect material from Standard Dutch. The model was trained on a subcorpus of the Dialect Novel Corpus containing linguistic material from Drenthe.

Intended Use

Isolating dialect material from Dutch texts containing both dialect and Standard Dutch.
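A minimal sketch of how this use case might look in practice. It assumes the checkpoint loads as a standard Hugging Face text-classification pipeline under the repository id nikkibyr/Meertje; the helper names `load_classifier` and `split_by_label` are hypothetical, and the label strings the model emits are not documented here, so inspect `classifier.model.config.id2label` before choosing `dialect_label`.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Assumption: the repository id as it appears on the model page.
MODEL_ID = "nikkibyr/Meertje"


def load_classifier(model_id: str = MODEL_ID):
    """Load the checkpoint as a text-classification pipeline.

    Assumes the checkpoint has a sequence-classification head.
    transformers is imported lazily so the rest of this sketch
    runs without it installed.
    """
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def split_by_label(
    sentences: Sequence[str],
    classify: Callable[[List[str]], List[Dict]],
    dialect_label: str,
) -> Tuple[List[str], List[str]]:
    """Split sentences into (dialect, standard) using the classifier output.

    `classify` must return one dict per sentence with a "label" key,
    which is the shape a text-classification pipeline returns for a
    list of strings.
    """
    sentences = list(sentences)
    predictions = classify(sentences)
    dialect: List[str] = []
    standard: List[str] = []
    for sentence, prediction in zip(sentences, predictions):
        if prediction["label"] == dialect_label:
            dialect.append(sentence)
        else:
            standard.append(sentence)
    return dialect, standard
```

With the real model, this would be called as `split_by_label(sentences, load_classifier(), dialect_label=...)`, where the label value is read from the model configuration rather than guessed. Passing the classifier in as an argument keeps the splitting logic testable without downloading the checkpoint.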

Training Data

Sentences containing Drents versus Standard Dutch sentences. Balanced train/dev/test splits of 2122/730/730 sentences.

Evaluation

Material            F1 (weighted avg)   Support
Test set (Drents)   0.95                730
Drents              0.95                7362
Gronings            0.94                605
Twents              0.98                1496
Zeeuws-Vlaams       0.90                3231
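For readers unfamiliar with the metric, the sketch below shows how a weighted-average F1 is commonly computed: F1 is calculated per class and then averaged with each class's support as its weight. This follows the usual definition (the one scikit-learn uses for `average="weighted"`); whether the scores above were produced with exactly this procedure is an assumption, and the function name `weighted_f1` is hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def weighted_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Support-weighted average of per-class F1 scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    support = Counter(y_true)  # number of gold examples per class
    total = len(y_true)
    score = 0.0
    for label in set(y_true) | set(y_pred):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Classes that never occur in y_true have support 0 and add nothing.
        score += f1 * support[label] / total
    return score
```

Because the weights are the class supports, a class that dominates an evaluation set also dominates the reported score.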

Further Resources

Background article on the Meertje project (NL or EN)

Finetuning script (Colab Notebook)

Usage script (Colab Notebook)

