Model Description
Meertje is a dialect classifier developed at the Meertens Institute to distinguish dialect material from Standard Dutch. The model was trained on a subcorpus of the Dialect Novel Corpus containing linguistic material from Drenthe.
Intended Use
Isolating dialect material from Dutch texts containing both dialect and Standard Dutch.
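As an illustration of this use case, the sketch below filters a list of sentences down to those classified as dialect. It assumes the model loads through the `transformers` text-classification pipeline under the ID `nikkibyr/Meertje`. The helper names `load_meertje` and `isolate_dialect`, the label name `LABEL_1`, and the `min_score` threshold are hypothetical and not taken from the Meertje project; check the model's `config.id2label` for the actual label names.

```python
from typing import Callable, Dict, Iterable, List


def load_meertje(model_id: str = "nikkibyr/Meertje") -> Callable[[List[str]], List[Dict]]:
    """Return a text-classification pipeline; downloads the model on first use."""
    # Imported lazily so the filtering helper below works without transformers.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def isolate_dialect(
    sentences: Iterable[str],
    classify: Callable[[List[str]], List[Dict]],
    dialect_label: str = "LABEL_1",  # ASSUMPTION: the real label name may differ
    min_score: float = 0.5,          # ASSUMPTION: illustrative threshold only
) -> List[str]:
    """Keep the sentences that the classifier labels as dialect."""
    sentences = list(sentences)
    # The pipeline returns one {"label": ..., "score": ...} dict per input.
    predictions = classify(sentences)
    return [
        sentence
        for sentence, pred in zip(sentences, predictions)
        if pred["label"] == dialect_label and pred["score"] >= min_score
    ]
```

A call would look like `isolate_dialect(sentences, load_meertje())`. Passing the classifier in as an argument keeps the filtering logic testable without downloading the model.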
Training Data
Sentences containing Drents versus Standard Dutch sentences. Balanced train/dev/test splits of 2122/730/730 sentences.
Evaluation
| Material | F1 (weighted avg) | Support |
|---|---|---|
| Test set (Drents) | 0.95 | 730 |
| Drents | 0.95 | 7362 |
| Gronings | 0.94 | 605 |
| Twents | 0.98 | 1496 |
| Zeeuws-Vlaams | 0.90 | 3231 |
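For readers unfamiliar with the metric, the sketch below shows how a weighted-average F1 is usually computed: the F1 of each class is weighted by that class's support, meaning its number of true instances. This follows the common definition and is presumably what the table reports, though the evaluation code itself is not shown here. The function name `weighted_f1` and the example labels are hypothetical.

```python
from collections import Counter
from typing import Sequence


def weighted_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Per-class F1 averaged with weights equal to each class's support."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    support = Counter(y_true)  # number of true instances per class
    total = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class is never hit.
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(y_true)


# Hypothetical labels, for illustration only.
gold = ["dialect", "dialect", "dialect", "standard"]
pred = ["dialect", "dialect", "standard", "standard"]
score = weighted_f1(gold, pred)
```

In this toy example the dialect class has F1 0.8 with support 3 and the standard class has F1 2/3 with support 1, so the weighted average is roughly 0.77.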
Further Resources
Background article on the Meertje project (Dutch or English)
Finetuning script (Colab Notebook)
Usage script (Colab Notebook)
Model tree for nikkibyr/Meertje
Base model
GroNLP/bert-base-dutch-cased