---
language: en
license: apache-2.0
base_model: bert-base-cased
tags:
- bert
- token-classification
- ner
- conll2003
datasets:
- conll2003
metrics:
- seqeval
pipeline_tag: token-classification
---
# BERT fine-tuned on CoNLL-2003 (NER)

`bert-base-cased` fine-tuned for Named Entity Recognition on [CoNLL-2003](https://huggingface.co/datasets/conll2003).

Recognizes four entity types: **PER**, **ORG**, **LOC**, **MISC**.
## Evaluation results

| Metric    | Score  |
|-----------|--------|
| Precision | 0.7058 |
| Recall    | 0.5080 |
| F1        | 0.5908 |
| Accuracy  | 0.9015 |

Evaluated with [seqeval](https://github.com/chakki-works/seqeval) on the CoNLL-2003 test split.
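As a sanity check, the reported F1 is consistent with the precision and recall above, since seqeval's F1 is the harmonic mean of the two:

```python
# Verify that the reported F1 is the harmonic mean of precision and recall.
precision = 0.7058
recall = 0.5080

f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.5908, matching the table
```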
## Usage

```python
from transformers import pipeline

ner = pipeline("ner", model="ZaharHR/bert-conll2003-ner", aggregation_strategy="simple")
ner("Elon Musk founded SpaceX in California.")
```
## Training details

- **Base model:** `bert-base-cased`
- **Dataset:** CoNLL-2003
- **Epochs:** 1
- **Effective batch size:** 16 (via gradient accumulation)
- **Optimizer:** AdamW, weight decay 0.01
- **Warmup steps:** 500
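The effective batch size is the per-device batch size multiplied by the number of gradient accumulation steps. The card only states the product (16); the split below is an illustrative assumption, not the actual training configuration:

```python
# Hypothetical split of the effective batch size of 16; the card does not
# state the actual per-device batch size or accumulation steps.
per_device_batch_size = 4        # assumption
gradient_accumulation_steps = 4  # assumption

effective_batch_size = per_device_batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 16
```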
## Label scheme

```
O, B-PER, I-PER, B-ORG, I-ORG, B-LOC, I-LOC, B-MISC, I-MISC
```
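The labels follow the BIO scheme: `B-` marks the first token of an entity, `I-` continues it, and `O` marks non-entity tokens. A minimal sketch of how per-token BIO tags can be grouped into entity spans, roughly what `aggregation_strategy="simple"` does in the pipeline (the tokens and tags below are illustrative, not real model output):

```python
# Group per-token BIO tags into (entity_type, text) spans.
def group_bio(tokens, tags):
    spans, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity, closing any open one.
            if current:
                spans.append(current)
            current = (tag[2:], [token])
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            # An I- tag of the same type extends the open entity.
            current[1].append(token)
        else:
            # O (or a mismatched I-) closes any open entity.
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(etype, " ".join(words)) for etype, words in spans]

tokens = ["Elon", "Musk", "founded", "SpaceX", "in", "California", "."]
tags = ["B-PER", "I-PER", "O", "B-ORG", "O", "B-LOC", "O"]
print(group_bio(tokens, tags))
# [('PER', 'Elon Musk'), ('ORG', 'SpaceX'), ('LOC', 'California')]
```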