metadata
language: da
license: mit
tags:
  - token-classification
  - ner
  - named-entity-recognition
  - danish
  - xlm-roberta
  - scandinavian
datasets:
  - alexandrainst/dane
  - wikiann
  - tollefj/nordic-ner
metrics:
  - f1
  - precision
  - recall
pipeline_tag: token-classification
model-index:
  - name: danish-ner-xlmr-base
    results:
      - task:
          type: token-classification
          name: Named Entity Recognition
        dataset:
          name: DaNE
          type: alexandrainst/dane
          split: validation
        metrics:
          - name: F1
            type: f1
            value: 0.9102

Danish NER XLM-RoBERTa (v8)

A Named Entity Recognition (NER) model for Danish, fine-tuned from XLM-RoBERTa. It reaches 91.02% F1 on the DaNE validation split.

Updated 2026-02-03: Now v8 with 91.02% F1 (previously 84.6%)

Performance

| Benchmark | F1 Score |
|---|---|
| DaNE (validation) | 91.02% |
| Previous version | 84.6% |
| nbailab baseline | 87.09% |
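The card does not say how the F1 figures were computed. NER results on DaNE are conventionally reported as entity-level (span-level) micro-F1, where a prediction counts only if both the span boundaries and the type match. The sketch below illustrates that convention in plain Python, assuming BIO-style tags; `bio_to_spans` and `span_f1` are hypothetical helper names, not part of this model or of any library.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start index, end index exclusive, entity type)


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one sentence of BIO tags.

    Assumption: a stray I- tag with no matching open span is ignored.
    """
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        inside = tag.startswith("I-") and label == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def span_f1(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 over exact span matches."""
    tp = n_gold = n_pred = 0
    for g_tags, p_tags in zip(gold, pred):
        g, p = bio_to_spans(g_tags), bio_to_spans(p_tags)
        tp += len(g & p)  # exact match on boundaries and type
        n_gold += len(g)
        n_pred += len(p)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example: the predicted ORG span is one token too short, so it does not count.
gold = [["B-PER", "I-PER", "O", "O", "B-ORG", "I-ORG", "O", "B-LOC"]]
pred = [["B-PER", "I-PER", "O", "O", "B-ORG", "O", "O", "B-LOC"]]
precision, recall, f1 = span_f1(gold, pred)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In the toy example two of three spans match exactly, so precision, recall and F1 are all 2/3. A partially correct span earns no credit under this metric, which is why entity-level F1 is usually lower than token-level accuracy.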

Quick Start

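from transformers import pipeline

# aggregation_strategy="simple" merges sub-word tokens into whole entity spans
ner = pipeline("ner", model="thomasbeste/danish-ner-xlmr-base", aggregation_strategy="simple")
result = ner("Anders Jensen arbejder hos Novo Nordisk i København.")

# Each item is a dict with the entity text, its label, and a confidence score
for entity in result:
    print(f"{entity['word']}: {entity['entity_group']} ({entity['score']:.2f})")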
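With `aggregation_strategy="simple"`, the pipeline returns a list of dicts with the keys `entity_group`, `word`, `score`, `start` and `end`. A common next step is to bucket entities by type and drop low-confidence hits. The sketch below shows one way to do that. `group_entities` and its `min_score` threshold are hypothetical, and the sample list is a hand-written stand-in with made-up scores, not real model output.

```python
from collections import defaultdict
from typing import Dict, List


def group_entities(entities: List[dict], min_score: float = 0.5) -> Dict[str, List[str]]:
    """Bucket entity strings by label, keeping only those scored at or above min_score."""
    grouped = defaultdict(list)
    for ent in entities:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)


# Hand-written stand-in for pipeline output; the scores are invented for illustration.
sample = [
    {"entity_group": "PER", "word": "Anders Jensen", "score": 0.99, "start": 0, "end": 13},
    {"entity_group": "ORG", "word": "Novo Nordisk", "score": 0.98, "start": 27, "end": 39},
    {"entity_group": "LOC", "word": "København", "score": 0.40, "start": 42, "end": 51},
]
print(group_entities(sample))
```

Here the LOC entry is filtered out only because its invented score is below the threshold. With real output, pass the `result` list from the pipeline call instead of `sample` and tune `min_score` to the precision your application needs.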

Entity Types

| Label | Description | Example |
|---|---|---|
| PER | Person names | Anders Jensen |
| ORG | Organizations | Novo Nordisk A/S |
| LOC | Locations | København |
| MISC | Miscellaneous | Dansk |

Training Data

  • DaNE (4.4k samples)
  • WikiANN Danish (20k samples)
  • NorNE Norwegian (30k samples)
  • High-quality synthetic data (60k samples)

License

MIT