---
language:
- da
license: apache-2.0
tags:
- token-classification
- ner
- danish
- modernbert
datasets:
- alexandrainst/dane
metrics:
- f1
- precision
- recall
pipeline_tag: token-classification
widget:
- text: "Jens Peter Hansen bor i København og arbejder hos Novo Nordisk."
---

# ModernBERT Danish NER (Base)

A Danish named entity recognition (NER) model, fine-tuned from [`AI-Sweden-Models/ModernBERT-base`](https://huggingface.co/AI-Sweden-Models/ModernBERT-base) on the [DaNE](https://huggingface.co/datasets/alexandrainst/dane) dataset.

## Benchmark: DaNE Test Set

| Entity | Precision | Recall | F1 | Support |
|--------|-----------|--------|----|---------|
| PER | 0.8962 | 0.9061 | 0.9011 | 181 |
| ORG | 0.6929 | 0.6299 | 0.6599 | 154 |
| LOC | 0.7500 | 0.8969 | 0.8169 | 97 |
| MISC | 0.4878 | 0.6316 | 0.5505 | 95 |
| **micro avg** | **0.7260** | **0.7742** | **0.7493** | |

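As a sanity check on the table, the F1 column can be recomputed from precision and recall as their harmonic mean, and the micro-averaged recall can be recovered as the support-weighted mean of the per-entity recalls. The sketch below uses only the numbers reported above; the names `REPORTED`, `MICRO`, `f1_score`, and `micro_recall` are illustrative and not part of the model's code.

```python
# Per-entity scores copied from the benchmark table: (precision, recall, f1, support).
REPORTED = {
    "PER": (0.8962, 0.9061, 0.9011, 181),
    "ORG": (0.6929, 0.6299, 0.6599, 154),
    "LOC": (0.7500, 0.8969, 0.8169, 97),
    "MISC": (0.4878, 0.6316, 0.5505, 95),
}
# Micro-averaged (precision, recall, f1) from the last table row.
MICRO = (0.7260, 0.7742, 0.7493)


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def micro_recall(rows) -> float:
    """Support-weighted mean of per-entity recall, which equals micro recall."""
    total_support = sum(support for _, _, _, support in rows.values())
    true_positives = sum(recall * support for _, recall, _, support in rows.values())
    return true_positives / total_support


for label, (p, r, f1, _) in REPORTED.items():
    print(f"{label}: reported F1 {f1:.4f}, recomputed {f1_score(p, r):.4f}")
print(f"micro F1 recomputed: {f1_score(MICRO[0], MICRO[1]):.4f}")
print(f"micro recall recomputed: {micro_recall(REPORTED):.4f}")
```

The recomputed values should match the reported ones up to rounding in the fourth decimal place, since the inputs are themselves rounded.
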
## Entity Types

- **PER**: Person names
- **ORG**: Organizations
- **LOC**: Locations
- **MISC**: Miscellaneous entities

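Token-classification models usually predict these types as per-token BIO tags (`B-PER`, `I-PER`, ..., `O`), which are then merged into entity spans. The following is a minimal sketch of that merging step, assuming the BIO scheme applies to this model's labels. `bio_to_spans` is a hypothetical helper, and the tags in the example are hand-written for illustration, not model output.

```python
def bio_to_spans(tokens, tags):
    """Merge per-token BIO tags into (entity_type, text) spans."""
    spans = []
    current_type, current_tokens = None, []

    def flush():
        # Close the span being built, if any.
        if current_tokens:
            spans.append((current_type, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-") or (tag.startswith("I-") and tag[2:] != current_type):
            # A B- tag, or an I- tag of a different type, starts a new span.
            flush()
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-"):
            # Continuation of the current span.
            current_tokens.append(token)
        else:
            # "O": outside any entity.
            flush()
            current_type, current_tokens = None, []
    flush()
    return spans


# Hand-written tags for the widget sentence (illustrative, not model predictions).
tokens = ["Jens", "Peter", "Hansen", "bor", "i", "København", "og",
          "arbejder", "hos", "Novo", "Nordisk", "."]
tags = ["B-PER", "I-PER", "I-PER", "O", "O", "B-LOC", "O",
        "O", "O", "B-ORG", "I-ORG", "O"]

for entity_type, text in bio_to_spans(tokens, tags):
    print(f"{entity_type}: {text}")
```

In practice, `aggregation_strategy="simple"` in the `transformers` pipeline performs this kind of grouping, so a helper like this is only needed when working with raw token-level predictions.
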
## Usage

```python
from transformers import pipeline

# Load the model as a token-classification pipeline; "simple" aggregation
# groups adjacent tokens that belong to the same entity.
ner = pipeline("ner", model="thomasbeste/modernbert-da-ner-base", aggregation_strategy="simple")
results = ner("Jens Peter Hansen bor i København og arbejder hos Novo Nordisk.")

# Print each detected entity with its type and confidence score.
for entity in results:
    print(f"{entity['word']}: {entity['entity_group']} ({entity['score']:.3f})")
```

## Training Details

- **Base model**: AI-Sweden-Models/ModernBERT-base
- **Dataset**: DaNE (alexandrainst/dane): 4,383 train / 564 val / 565 test sentences
- **Epochs**: 10
- **Learning rate**: 2e-5
- **Batch size**: 16
- **Optimizer**: AdamW (weight decay 0.01, warmup ratio 0.1)
- **Precision**: bf16
- **Max sequence length**: 256
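
The hyperparameters above map naturally onto Hugging Face `TrainingArguments`. The sketch below is one plausible reconstruction, not the author's training script. `output_dir` is a placeholder, treating the batch size as per-device is an assumption, and the parameter names follow the `transformers` 4.x API. The maximum sequence length is applied at tokenization time rather than through `TrainingArguments`.

```python
# Hyperparameters from the list above, keyed by TrainingArguments parameter names.
TRAINING_KWARGS = {
    "output_dir": "modernbert-da-ner-base",  # placeholder path (assumption)
    "num_train_epochs": 10,
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumes the reported batch size is per device
    "weight_decay": 0.01,
    "warmup_ratio": 0.1,
    "bf16": True,  # requires hardware with bfloat16 support
}

# Not a TrainingArguments field: pass to the tokenizer as max_length with truncation=True.
MAX_SEQ_LENGTH = 256


def build_training_args():
    """Create TrainingArguments from the kwargs above; Trainer defaults to AdamW."""
    # Imported lazily so the configuration can be inspected without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(**TRAINING_KWARGS)
```

Keeping the values in a plain dictionary makes them easy to log or compare against the list above before constructing the arguments object.
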