---
language: en
license: apache-2.0
base_model: bert-base-cased
tags:
- bert
- token-classification
- ner
- conll2003
datasets:
- conll2003
metrics:
- seqeval
pipeline_tag: token-classification
---
# BERT fine-tuned on CoNLL-2003 (NER)
`bert-base-cased` fine-tuned for Named Entity Recognition on the CoNLL-2003 dataset.
Recognizes four entity types: PER (person), ORG (organization), LOC (location), and MISC (miscellaneous).
## Evaluation results
| Metric | Score |
|---|---|
| Precision | 0.7058 |
| Recall | 0.5080 |
| F1 | 0.5908 |
| Accuracy | 0.9015 |
Evaluated with `seqeval` on the CoNLL-2003 test split.
## Usage
```python
from transformers import pipeline

ner = pipeline("ner", model="ZaharHR/bert-conll2003-ner", aggregation_strategy="simple")
ner("Elon Musk founded SpaceX in California.")
```
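With `aggregation_strategy="simple"`, the pipeline returns one dict per merged entity (`entity_group`, `score`, `word`, `start`, `end`). A minimal post-processing sketch — the sample output below is hard-coded and illustrative, not an actual model prediction:

```python
# Illustrative sample of the pipeline's output shape; scores and offsets
# are made up for demonstration, not real predictions from this model.
sample = [
    {"entity_group": "PER", "score": 0.998, "word": "Elon Musk", "start": 0, "end": 9},
    {"entity_group": "ORG", "score": 0.995, "word": "SpaceX", "start": 18, "end": 24},
    {"entity_group": "LOC", "score": 0.41, "word": "California", "start": 28, "end": 38},
]

def filter_entities(entities, min_score=0.5):
    """Keep only predictions at or above a confidence threshold."""
    return [e for e in entities if e["score"] >= min_score]

kept = filter_entities(sample)
print([(e["entity_group"], e["word"]) for e in kept])
# [('PER', 'Elon Musk'), ('ORG', 'SpaceX')]
```

Thresholding like this is a common way to trade recall for precision at inference time.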
## Training details
- Base model: `bert-base-cased`
- Dataset: CoNLL-2003
- Epochs: 1
- Effective batch size: 16 (gradient accumulation)
- Optimizer: AdamW, weight decay 0.01
- Warmup steps: 500
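"Effective batch size 16 (gradient accumulation)" means gradients from several smaller micro-batches are summed before one optimizer step. A toy numeric sketch of why this is equivalent (assuming, for illustration only, two micro-batches of 8; the actual split is not stated above):

```python
# Toy illustration: the gradient of a mean loss over one batch of 16
# equals the average of the gradients of 2 micro-batches of 8 --
# exactly what gradient accumulation computes before the optimizer step.
def grad_of_mean_square(xs):
    # Stand-in scalar "gradient": d/dw mean((w*x)^2) at w=1 is mean(2*x^2)
    return sum(2 * x * x for x in xs) / len(xs)

batch = list(range(16))              # one full batch of 16 examples
micro1, micro2 = batch[:8], batch[8:]

full = grad_of_mean_square(batch)
accumulated = (grad_of_mean_square(micro1) + grad_of_mean_square(micro2)) / 2

print(abs(full - accumulated) < 1e-9)  # True: the two updates are identical
```

This is why accumulation lets you train with a large effective batch on hardware that only fits small micro-batches.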
## Label scheme
`O`, `B-PER`, `I-PER`, `B-ORG`, `I-ORG`, `B-LOC`, `I-LOC`, `B-MISC`, `I-MISC`
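These tags follow the BIO scheme: `B-` marks the first token of an entity, `I-` a continuation, and `O` a non-entity token. A minimal sketch (not part of the model's own code) of decoding a BIO tag sequence into entity spans:

```python
def bio_to_spans(tokens, tags):
    """Group parallel token/BIO-tag lists into (entity_type, text) spans.
    An I- tag that does not continue a matching run starts a new span."""
    spans, current = [], None
    for token, tag in zip(tokens, tags):
        if tag == "O":
            if current:
                spans.append(current)
                current = None
            continue
        prefix, etype = tag.split("-", 1)
        if prefix == "B" or current is None or current[0] != etype:
            if current:
                spans.append(current)
            current = (etype, [token])
        else:
            current[1].append(token)
    if current:
        spans.append(current)
    return [(etype, " ".join(words)) for etype, words in spans]

tokens = ["Elon", "Musk", "founded", "SpaceX", "in", "California", "."]
tags   = ["B-PER", "I-PER", "O", "B-ORG", "O", "B-LOC", "O"]
print(bio_to_spans(tokens, tags))
# [('PER', 'Elon Musk'), ('ORG', 'SpaceX'), ('LOC', 'California')]
```

The `aggregation_strategy="simple"` option in the Usage section performs this kind of grouping for you.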