Model Description

Meertje is a dialect classifier developed at the Meertens Institute to distinguish dialect material from Standard Dutch. The model was trained on a subcorpus of the Dialect Novel Corpus containing linguistic material from Drenthe.

Intended Use

Isolating dialect material from Dutch texts containing both dialect and Standard Dutch.
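
The filtering step described above can be sketched roughly as follows. This is a minimal illustration, not the project's own usage script: the helper names `isolate_dialect` and `load_classifier` are hypothetical, and the label string `LABEL_1` for dialect is an assumption that should be checked against the model's configuration. Only the model ID `nikkibyr/Meertje` and the standard `transformers` text-classification pipeline are taken as given.

```python
from typing import Callable, Dict, List

# Assumption: the label string the model emits for dialect sentences.
# Check the model's config (id2label) for the actual label names.
DIALECT_LABEL = "LABEL_1"


def isolate_dialect(
    sentences: List[str],
    classify: Callable[[List[str]], List[Dict[str, object]]],
    dialect_label: str = DIALECT_LABEL,
    min_score: float = 0.5,
) -> List[str]:
    """Return the sentences that the classifier labels as dialect."""
    predictions = classify(sentences)
    return [
        sentence
        for sentence, pred in zip(sentences, predictions)
        if pred["label"] == dialect_label and float(pred["score"]) >= min_score
    ]


def load_classifier():
    """Load the model from the Hugging Face Hub.

    Requires the `transformers` package and network access.
    """
    from transformers import pipeline

    return pipeline("text-classification", model="nikkibyr/Meertje")
```

With the model downloaded, a call such as `isolate_dialect(sentences, load_classifier())` would return only the sentences predicted to be dialect. Passing the classifier in as an argument keeps the filter itself easy to test with a stub.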

Training Data

Sentences containing Drents versus Standard Dutch sentences. Balanced train/dev/test split: 2122/730/730.

Evaluation

Material	F1 (weighted avg)	support
Test set (Drents)	0.95	730
Drents	0.95	7362
Gronings	0.94	605
Twents	0.98	1496
Zeeuws-Vlaams	0.90	3231
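
For readers unfamiliar with the metric, a weighted-average F1 is the per-class F1 score averaged with weights proportional to each class's support (its number of gold examples). The sketch below shows that usual definition in plain Python. It is an illustration only; the card does not state which tool produced the scores above, and the function name `weighted_f1` is hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def weighted_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Per-class F1 averaged with weights proportional to each class's support."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    support = Counter(y_true)  # number of gold examples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1  # weight each class by its share of the data
    return score
```

Because each class is weighted by its support, a class with many examples influences the reported score more than a rare one.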

Further Resources

Background article on the Meertje project (in Dutch or English)

Finetuning script (Colab Notebook)

Usage script (Colab Notebook)

Safetensors

Model size: 0.1B params

Tensor type: F32

Model tree for nikkibyr/Meertje

Base model: GroNLP/bert-base-dutch-cased

Finetuned: this model