---
language: da
license: mit
tags:
- token-classification
- ner
- named-entity-recognition
- danish
- xlm-roberta
- scandinavian
datasets:
- alexandrainst/dane
- wikiann
- tollefj/nordic-ner
metrics:
- f1
- precision
- recall
pipeline_tag: token-classification
model-index:
- name: danish-ner-xlmr-base
  results:
  - task:
      type: token-classification
      name: Named Entity Recognition
    dataset:
      name: DaNE
      type: alexandrainst/dane
      split: validation
    metrics:
    - name: F1
      type: f1
      value: 0.9102
---
# Danish NER XLM-RoBERTa (v8)
Named Entity Recognition (NER) model for Danish, fine-tuned from XLM-RoBERTa, with the highest F1 score of the models compared in the Performance section.
**Updated 2026-02-03**: Now v8 with 91.02% F1 on the DaNE validation split (previously 84.6%).
## Performance
| Model | F1 Score |
|-------|----------|
| **This model, v8 (DaNE validation)** | **91.02%** |
| Previous version | 84.6% |
| nbailab baseline | 87.09% |
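The card does not say how the F1 score was computed. As a rough illustration, NER results are commonly reported as entity-level micro F1, where a prediction counts only if both the span and the label match exactly. A minimal sketch under that assumption, using a hypothetical `entity_f1` helper and made-up spans:

```python
def entity_f1(gold, pred):
    """Entity-level micro F1 over sets of (start, end, label) spans.

    Assumption: exact-match scoring, which may differ from how the
    scores in this card were produced.
    """
    gold, pred = set(gold), set(pred)
    true_pos = len(gold & pred)
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up character spans, for illustration only
gold = [(0, 13, "PER"), (27, 39, "ORG"), (42, 51, "LOC")]
pred = [(0, 13, "PER"), (27, 39, "ORG"), (42, 51, "ORG")]  # wrong label on the last span
score = entity_f1(gold, pred)
print(f"{score:.4f}")
```

Here two of three predictions match, so precision and recall are both 2/3 and the F1 is 2/3 as well.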
## Quick Start
```python
from transformers import pipeline

# aggregation_strategy="simple" merges sub-word tokens into whole entity spans
ner = pipeline("ner", model="thomasbeste/danish-ner-xlmr-base", aggregation_strategy="simple")
result = ner("Anders Jensen arbejder hos Novo Nordisk i København.")

# Print each detected entity with its label and confidence score
for entity in result:
    print(f"{entity['word']}: {entity['entity_group']} ({entity['score']:.2f})")
```
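With `aggregation_strategy="simple"`, each result dict also carries `start` and `end` character offsets into the input string. The sketch below shows one way to filter by confidence and recover the exact surface text from those offsets. The `extract_spans` helper, the 0.80 threshold, and the sample scores are illustrative assumptions, not model output:

```python
def extract_spans(text, entities, min_score=0.80):
    """Return (surface_text, label) pairs for entities at or above min_score.

    Slicing the input with start/end preserves the original casing and
    spacing, which the tokenizer-decoded `word` field may not.
    """
    return [
        (text[ent["start"]:ent["end"]], ent["entity_group"])
        for ent in entities
        if ent["score"] >= min_score
    ]


text = "Anders Jensen arbejder hos Novo Nordisk i København."
# Hand-written stand-in for pipeline output (scores are made up)
entities = [
    {"entity_group": "PER", "score": 0.99, "word": "Anders Jensen", "start": 0, "end": 13},
    {"entity_group": "ORG", "score": 0.97, "word": "Novo Nordisk", "start": 27, "end": 39},
    {"entity_group": "LOC", "score": 0.55, "word": "København", "start": 42, "end": 51},
]
spans = extract_spans(text, entities)
print(spans)
```

In real use, `entities` would be the list returned by `ner(text)`. A suitable threshold depends on the application and is best chosen on held-out data.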
## Entity Types
| Label | Description | Example |
|-------|-------------|---------|
| `PER` | Person names | Anders Jensen |
| `ORG` | Organizations | Novo Nordisk A/S |
| `LOC` | Locations | København |
| `MISC` | Miscellaneous | Dansk |
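To collect predictions per entity type, the `entity_group` field of each pipeline result can serve as a key. A minimal sketch, assuming the aggregated labels are exactly the four in the table above; `group_by_label` and the sample input are hypothetical:

```python
from collections import defaultdict

LABELS = ("PER", "ORG", "LOC", "MISC")  # labels from the table above


def group_by_label(entities):
    """Map each known label to the list of entity strings predicted for it."""
    grouped = defaultdict(list)
    for ent in entities:
        if ent["entity_group"] in LABELS:  # ignore any unexpected labels
            grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)


# Hand-written stand-in for pipeline output
sample = [
    {"entity_group": "PER", "word": "Anders Jensen"},
    {"entity_group": "ORG", "word": "Novo Nordisk"},
    {"entity_group": "LOC", "word": "København"},
    {"entity_group": "ORG", "word": "Carlsberg"},
]
grouped = group_by_label(sample)
print(grouped)
```

Labels with no predictions are simply absent from the returned dict.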
## Training Data
- DaNE (4.4k samples)
- WikiANN Danish (20k samples)
- NorNE Norwegian (30k samples)
- High-quality synthetic data (60k samples)
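For a sense of the training mix, the shares implied by the approximate counts above can be worked out directly. This assumes every listed sample was used exactly once, which the card does not state:

```python
# Approximate sample counts as listed above
sources = {
    "DaNE": 4_400,
    "WikiANN Danish": 20_000,
    "NorNE Norwegian": 30_000,
    "Synthetic": 60_000,
}
total = sum(sources.values())
shares = {name: count / total for name, count in sources.items()}

# Print each source's share of the combined training set
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"Total: {total} samples")
```

Under that assumption, synthetic data makes up a little over half of the mix, while DaNE itself contributes under 5%.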
## License
MIT