---
tags:
- transformers
- token-classification
- ner
- bert
- conll2003
license: apache-2.0
datasets:
- conll2003
language:
- en
pipeline_tag: token-classification
authors:
- Karan D Vasa (https://huggingface.co/starkdv123)
---

# BERT (base-cased) for CoNLL-2003 NER — Full Fine-Tune

This repository contains a **BERT base cased** model fine-tuned on **CoNLL-2003** (parquet version), evaluated with **seqeval** (entity-level F1).

## 📊 Result (this run)

- **Entity Macro F1**: 0.9192

## Usage

```python
from transformers import pipeline

clf = pipeline(
    "token-classification",
    model="starkdv123/conll2003-bert-ner-full",
    aggregation_strategy="simple",
)
clf("Chris Hoiles hit his 22nd homer for Baltimore.")
```

## Training summary

* Base: `bert-base-cased`
* Epochs: 3, learning rate: 3e-5, batch size: 16/32, max length: 256, weight decay: 0.01, fp16
* Label alignment: `-100` for subword continuations (only the first subword of each word is labeled)
* Metric: seqeval F1 (entity-level)

## Confusion Matrix

Token-level counts; rows are reference labels, columns are predicted labels (the usual convention).

```
        LOC   MISC      O    ORG   PER
LOC     411      6     21     32     3
MISC      9   2213     51     76    14
O        67    110  38063     58    17
ORG      31     77     32   2353    10
PER       3     42     15     24  2689
```
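The `-100` label-alignment rule from the training summary can be sketched as below. This follows the common Hugging Face token-classification recipe, not necessarily this repo's exact training script; `word_ids` (the word index per subword token, `None` for special tokens, as returned by a fast tokenizer's `word_ids()`) is mocked with a literal list so the sketch stays self-contained:

```python
# Sketch of word-to-subword label alignment (assumption: the standard HF
# token-classification recipe). -100 is ignored by PyTorch's CrossEntropyLoss,
# so only the first subword of each word contributes to the loss.

def align_labels(word_ids, word_labels, ignore_index=-100):
    """Map word-level labels onto subword tokens."""
    aligned = []
    previous = None
    for wid in word_ids:
        if wid is None:            # special tokens ([CLS], [SEP], padding)
            aligned.append(ignore_index)
        elif wid != previous:      # first subword of a word: keep its label
            aligned.append(word_labels[wid])
        else:                      # subword continuation: ignore
            aligned.append(ignore_index)
        previous = wid
    return aligned

# Hypothetical tokenization where "Hoiles" splits into two subwords:
word_ids = [None, 0, 1, 1, 2, None]   # [CLS] Chris Ho ##iles hit [SEP]
word_labels = [1, 2, 0]               # e.g. B-PER, I-PER, O as label ids
print(align_labels(word_ids, word_labels))
# → [-100, 1, 2, -100, 0, -100]
```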
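"Entity-level" F1 means predictions are scored on whole spans rather than individual tokens: an entity counts only if its type and exact boundaries both match. A minimal pure-Python sketch of what seqeval computes, assuming IOB2 tags (not seqeval's actual implementation):

```python
# Minimal sketch of entity-level F1 in the spirit of seqeval (IOB2 assumed).
# An entity is correct only if its type AND exact span match the reference.

def extract_entities(tags):
    """Return a set of (type, start, end) spans from an IOB2 tag sequence."""
    entities, start, etype = set(), None, None
    for i, tag in enumerate(tags + ["O"]):   # sentinel flushes the last span
        if tag.startswith("B-") or tag == "O" or (
            tag.startswith("I-") and tag[2:] != etype
        ):
            if etype is not None:            # close the open entity
                entities.add((etype, start, i))
            etype, start = (tag[2:], i) if tag.startswith(("B-", "I-")) else (None, None)
        # otherwise: I- continuation of the current entity, keep scanning
    return entities

def entity_f1(true_tags, pred_tags):
    gold, pred = extract_entities(true_tags), extract_entities(pred_tags)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    return 2 * precision * recall / (precision + recall) if tp else 0.0

true_tags = ["B-PER", "I-PER", "O", "O", "O", "B-LOC"]
pred_tags = ["B-PER", "I-PER", "O", "O", "O", "B-ORG"]
print(entity_f1(true_tags, pred_tags))  # 1 of 2 entities exact-matched → 0.5
```

Note how the mislabeled `B-ORG` span scores zero even though its boundaries are right; that strictness is what separates entity-level from token-level scores.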
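As a sanity check, per-class F1 can be recomputed from the token-level confusion matrix above (rows taken as true labels, columns as predicted, per the usual convention). Token-level scores will not exactly reproduce the entity-level seqeval number reported above, since they ignore span boundaries:

```python
# Per-class token-level F1 from the confusion matrix in this card.
# Assumption: rows = true label, columns = predicted label.

labels = ["LOC", "MISC", "O", "ORG", "PER"]
matrix = [
    [411,     6,    21,    32,    3],
    [9,    2213,    51,    76,   14],
    [67,    110, 38063,    58,   17],
    [31,     77,    32,  2353,   10],
    [3,      42,    15,    24, 2689],
]

def per_class_f1(matrix):
    f1 = {}
    for i, name in enumerate(labels):
        tp = matrix[i][i]
        precision = tp / sum(row[i] for row in matrix)  # column sum: predicted as i
        recall = tp / sum(matrix[i])                    # row sum: truly i
        f1[name] = 2 * precision * recall / (precision + recall)
    return f1

scores = per_class_f1(matrix)
for name, score in scores.items():
    print(f"{name}: {score:.4f}")
```

`LOC` stands out as the weakest class here, mostly from confusion with `ORG` and `O`.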