---
language:
- da
license: apache-2.0
tags:
- token-classification
- ner
- danish
- modernbert
datasets:
- alexandrainst/dane
metrics:
- f1
- precision
- recall
pipeline_tag: token-classification
widget:
- text: "Jens Peter Hansen bor i København og arbejder hos Novo Nordisk."
---
# ModernBERT Danish NER (Base)
A Danish named entity recognition (NER) model, fine-tuned from [`AI-Sweden-Models/ModernBERT-base`](https://huggingface.co/AI-Sweden-Models/ModernBERT-base) on the [DaNE](https://huggingface.co/datasets/alexandrainst/dane) dataset.
## Benchmark: DaNE Test Set
| Entity | Precision | Recall | F1 | Support |
|--------|-----------|--------|----|---------|
| PER | 0.8962 | 0.9061 | 0.9011 | 181 |
| ORG | 0.6929 | 0.6299 | 0.6599 | 154 |
| LOC | 0.7500 | 0.8969 | 0.8169 | 97 |
| MISC | 0.4878 | 0.6316 | 0.5505 | 95 |
| **micro avg** | **0.7260** | **0.7742** | **0.7493** | |
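
The F1 column in the table above is the harmonic mean of precision and recall. The following minimal sketch recomputes it from the reported precision and recall values as a sanity check; the helper name `f1_score` is introduced here for illustration and is not part of the model's evaluation code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the benchmark table
rows = {
    "PER": (0.8962, 0.9061, 0.9011),
    "ORG": (0.6929, 0.6299, 0.6599),
    "LOC": (0.7500, 0.8969, 0.8169),
    "MISC": (0.4878, 0.6316, 0.5505),
    "micro avg": (0.7260, 0.7742, 0.7493),
}

for name, (p, r, reported) in rows.items():
    print(f"{name}: computed {f1_score(p, r):.4f}, reported {reported:.4f}")
```

Note that the micro average pools true positives, false positives, and false negatives across all entity types before computing precision and recall, so it is not the mean of the per-type F1 scores.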
## Entity Types
- **PER**: Person names
- **ORG**: Organizations
- **LOC**: Locations
- **MISC**: Miscellaneous entities
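
Token-classification models usually predict these types through a per-token tagging scheme. The sketch below assumes the common BIO scheme (`B-` for the first token of an entity, `I-` for continuation tokens, `O` for everything else); the label order is illustrative and may differ from the `id2label` mapping stored in this model's configuration.

```python
# Entity types from the list above; the BIO scheme and ordering are assumptions.
ENTITY_TYPES = ["PER", "ORG", "LOC", "MISC"]

# "O" marks tokens outside any entity, followed by a B-/I- pair per type.
labels = ["O"] + [
    f"{prefix}-{entity_type}"
    for entity_type in ENTITY_TYPES
    for prefix in ("B", "I")
]

id2label = dict(enumerate(labels))
label2id = {label: idx for idx, label in id2label.items()}

print(labels)
```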
## Usage
```python
from transformers import pipeline
ner = pipeline("ner", model="thomasbeste/modernbert-da-ner-base", aggregation_strategy="simple")
results = ner("Jens Peter Hansen bor i København og arbejder hos Novo Nordisk.")
for entity in results:
    print(f"{entity['word']}: {entity['entity_group']} ({entity['score']:.3f})")
```
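
With `aggregation_strategy="simple"`, the pipeline returns a list of dictionaries with the keys `entity_group`, `score`, `word`, `start`, and `end`. As a minimal sketch of post-processing, the snippet below groups entities by type and drops low-confidence predictions. The helper `group_entities`, its `min_score` parameter, and the scores in `sample` are hypothetical and chosen only for illustration; they are not actual model output.

```python
from collections import defaultdict


def group_entities(results, min_score=0.5):
    """Group entity strings by type, keeping only those scoring at least min_score."""
    grouped = defaultdict(list)
    for entity in results:
        if entity["score"] >= min_score:
            grouped[entity["entity_group"]].append(entity["word"])
    return dict(grouped)


# Hypothetical results in the shape the pipeline returns; the scores are made up.
sample = [
    {"entity_group": "PER", "score": 0.99, "word": "Jens Peter Hansen", "start": 0, "end": 17},
    {"entity_group": "LOC", "score": 0.98, "word": "København", "start": 24, "end": 33},
    {"entity_group": "ORG", "score": 0.42, "word": "Novo Nordisk", "start": 50, "end": 62},
]

print(group_entities(sample))
```

In practice, pass the `results` list from the pipeline call instead of `sample`, and tune the threshold on held-out data; ORG and MISC have the lowest precision in the benchmark, so they are the types most likely to benefit from filtering.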
## Training Details
- **Base model**: AI-Sweden-Models/ModernBERT-base
- **Dataset**: DaNE (alexandrainst/dane), split into 4,383 train / 564 validation / 565 test sentences
- **Epochs**: 10
- **Learning rate**: 2e-5
- **Batch size**: 16
- **Optimizer**: AdamW (weight decay 0.01, warmup ratio 0.1)
- **Precision**: bf16
- **Max sequence length**: 256
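
To give a feel for the training schedule these settings imply, the sketch below derives the approximate number of optimizer steps and warmup steps. It assumes a single device, no gradient accumulation, and that the final partial batch of each epoch is kept; none of these are stated in this card. The dictionary keys mirror the parameter names of `transformers.TrainingArguments`, which is also an assumption about how the model was trained.

```python
import math

# Hyperparameters from the list above, keyed by TrainingArguments-style names.
hparams = {
    "num_train_epochs": 10,
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "weight_decay": 0.01,
    "warmup_ratio": 0.1,
    "bf16": True,
}
TRAIN_SENTENCES = 4383  # DaNE training split size
MAX_SEQ_LENGTH = 256    # applied at tokenization time, not a training argument

# Assumes one device and no gradient accumulation.
steps_per_epoch = math.ceil(TRAIN_SENTENCES / hparams["per_device_train_batch_size"])
total_steps = steps_per_epoch * hparams["num_train_epochs"]
warmup_steps = round(total_steps * hparams["warmup_ratio"])

print(f"steps per epoch: {steps_per_epoch}")
print(f"total steps:     {total_steps}")
print(f"warmup steps:    {warmup_steps}")
```

Under these assumptions, the warmup phase covers about one epoch of the ten.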