---
language: en
license: apache-2.0
base_model: bert-base-cased
tags:
- bert
- token-classification
- ner
- conll2003
datasets:
- conll2003
metrics:
- seqeval
pipeline_tag: token-classification
---

# BERT fine-tuned on CoNLL-2003 (NER)

`bert-base-cased` fine-tuned for Named Entity Recognition on [CoNLL-2003](https://huggingface.co/datasets/conll2003). Recognizes 4 entity types: **PER**, **ORG**, **LOC**, **MISC**.

## Evaluation results

| Metric    | Score  |
|-----------|--------|
| Precision | 0.7058 |
| Recall    | 0.5080 |
| F1        | 0.5908 |
| Accuracy  | 0.9015 |

Evaluated with [seqeval](https://github.com/chakki-works/seqeval) on the CoNLL-2003 test split.

## Usage

```python
from transformers import pipeline

ner = pipeline("ner", model="ZaharHR/bert-conll2003-ner", aggregation_strategy="simple")
ner("Elon Musk founded SpaceX in California.")
```

## Training details

- **Base model:** `bert-base-cased`
- **Dataset:** CoNLL-2003
- **Epochs:** 1
- **Effective batch size:** 16 (gradient accumulation)
- **Optimizer:** AdamW, weight decay 0.01
- **Warmup steps:** 500

## Label scheme

```
O, B-PER, I-PER, B-ORG, I-ORG, B-LOC, I-LOC, B-MISC, I-MISC
```
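
To illustrate how the label scheme above maps onto entities, here is a minimal sketch that collapses a sequence of BIO tags into typed spans. The helper name `bio_to_spans` and the example tags are hypothetical: the tags are written by hand for illustration and are not output from this model. The sketch is also lenient by assumption, treating an `I-` tag with no matching open entity as the start of a new one.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Collapse BIO tags into (entity_type, start, end) spans; end is exclusive."""
    spans = []
    start, current = None, None
    # A trailing "O" sentinel flushes an entity that runs to the end of the sequence.
    for i, tag in enumerate(tags + ["O"]):
        prefix, _, label = tag.partition("-")
        continues = prefix == "I" and label == current
        # Close the open entity when the current tag does not continue it.
        if current is not None and not continues:
            spans.append((current, start, i))
            start, current = None, None
        # Open a new entity on "B-", or on a stray "I-" (lenient handling, an assumption).
        if prefix == "B" or (prefix == "I" and current is None):
            start, current = i, label
    return spans


# Hand-written example tags (hypothetical, not model predictions).
tokens = ["Elon", "Musk", "founded", "SpaceX", "in", "California", "."]
tags = ["B-PER", "I-PER", "O", "B-ORG", "O", "B-LOC", "O"]

for entity_type, start, end in bio_to_spans(tags):
    print(entity_type, " ".join(tokens[start:end]))
```

With `aggregation_strategy="simple"`, the `transformers` pipeline performs a comparable grouping internally, so this helper is only useful when working with raw per-token labels.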