library_name: transformers
tags:

  • ner
  • bert
  • token-classification
  • huggingface
  • news-article
  • named-entity-recognition

NER BERT Fine-tuned on Persons, Organizations, and Locations

This model is a fine-tuned version of bert-base-cased on a custom dataset of news articles annotated with named entities of type Person, Organization, and Location. It is designed to identify these entities from raw article text.


Model Details

Model Description

  • Developed by: Saud Shakeel
  • Model type: Transformer-based BERT model
  • Language(s): English
  • License: Apache 2.0
  • Finetuned from model: bert-base-cased
  • Task: Named Entity Recognition (NER) for PER, ORG, and LOC

Model Sources


Uses

Direct Use

  • Extracts Person, Organization, and Location entities from English news text.
  • Suitable for downstream NLP pipelines in journalism, content moderation, fact-checking, and search indexing.

How to Use

from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load the fine-tuned model and its tokenizer from the Hub
model = AutoModelForTokenClassification.from_pretrained("Saud-Shakeel/ner-bert-finetuned-person-org-loc")
tokenizer = AutoTokenizer.from_pretrained("Saud-Shakeel/ner-bert-finetuned-person-org-loc")

# "simple" aggregation merges word-piece tokens back into whole-word entities
nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

text = "President Joe Biden met with officials from the United Nations in New York."
entities = nlp(text)
print(entities)
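With `aggregation_strategy="simple"`, the pipeline returns one dict per detected entity with `entity_group`, `score`, `word`, `start`, and `end` keys. A sketch of typical post-processing, using a mocked `entities` list in that shape (the scores and offsets below are illustrative, not actual model output):

```python
from collections import defaultdict

# Mocked pipeline output in the aggregated format described above
entities = [
    {"entity_group": "PER", "score": 0.998, "word": "Joe Biden", "start": 10, "end": 19},
    {"entity_group": "ORG", "score": 0.991, "word": "United Nations", "start": 48, "end": 62},
    {"entity_group": "LOC", "score": 0.995, "word": "New York", "start": 66, "end": 74},
]

def group_by_type(entities, min_score=0.5):
    """Bucket entity mentions by type, dropping low-confidence spans."""
    grouped = defaultdict(list)
    for ent in entities:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)

print(group_by_type(entities))
# {'PER': ['Joe Biden'], 'ORG': ['United Nations'], 'LOC': ['New York']}
```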

Out-of-Scope Use

  • Not suitable for non-English text
  • Not designed to extract entities beyond PER, ORG, and LOC

Training Details

Training Data

  • Custom-labeled dataset of 643 news articles annotated with person, organization, and location fields
  • Multi-word entities preserved for exact span matching

Preprocessing

  • Articles were cleaned to extract main body text
  • Labels converted to BIO format per token using spaCy token boundaries
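The BIO conversion above can be sketched as follows. This is an illustration only: a plain whitespace split stands in for the spaCy tokenizer the card mentions, and the span format is an assumption.

```python
def spans_to_bio(tokens, spans):
    """Convert entity spans into per-token BIO tags.

    spans: list of (start_token, end_token_exclusive, label) tuples.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # continuation tokens
    return tags

tokens = "Joe Biden met United Nations officials".split()
print(spans_to_bio(tokens, [(0, 2, "PER"), (3, 5, "ORG")]))
# ['B-PER', 'I-PER', 'O', 'B-ORG', 'I-ORG', 'O']
```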

Training Procedure

  • Model: bert-base-cased
  • Optimizer: AdamW
  • Epochs: 5
  • Batch size: 8
  • Max length: 512 tokens
  • Trained using Hugging Face Trainer API with seqeval for evaluation
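The hyperparameters above map onto the Trainer API roughly as follows. This is a hedged sketch, not the author's actual training script: `output_dir` is a placeholder, and the model/dataset objects are elided.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="ner-bert-finetuned-person-org-loc",  # placeholder path
    num_train_epochs=5,             # Epochs: 5
    per_device_train_batch_size=8,  # Batch size: 8
    optim="adamw_torch",            # Optimizer: AdamW (Trainer's default)
)
# Trainer(model=model, args=args, train_dataset=..., eval_dataset=...,
#         compute_metrics=...)  # compute_metrics wraps seqeval for evaluation
```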

Evaluation

Testing Data & Metrics

  • Held-out portion of the labeled dataset
  • Evaluation metrics: Precision, Recall, and F1 score (computed with seqeval)
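seqeval scores at the entity level: spans are extracted from the BIO tags, and a prediction counts as correct only on an exact span-and-label match. A minimal pure-Python illustration of that scoring rule (not the seqeval library itself):

```python
def extract_entities(tags):
    """Collect (label, start, end) spans from a BIO tag sequence."""
    spans, i = [], 0
    while i < len(tags):
        if tags[i].startswith("B-"):
            label = tags[i][2:]
            j = i + 1
            while j < len(tags) and tags[j] == f"I-{label}":
                j += 1
            spans.append((label, i, j))
            i = j
        else:
            i += 1
    return spans

def entity_f1(true_tags, pred_tags):
    """Entity-level F1: exact span + label matches only."""
    gold = set(extract_entities(true_tags))
    pred = set(extract_entities(pred_tags))
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

true = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "I-PER", "O", "B-ORG"]
print(entity_f1(true, pred))  # 0.5: the PER span matches, the mislabeled span does not
```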

Results (Sample)

| Entity Type | Precision | Recall | F1 Score |
|-------------|-----------|--------|----------|
| PER         | 0.89      | 0.84   | 0.86     |
| ORG         | 0.85      | 0.79   | 0.82     |
| LOC         | 0.91      | 0.87   | 0.89     |

Environmental Impact

  • Hardware Type: NVIDIA Tesla T4
  • Training Time: ~10 minutes
  • Compute Region: Google Colab (us-central1)
  • Carbon Emitted: Low (< 0.05 kgCO2e, estimated via ML CO2 calculator)

Citation

@misc{saud2025ner,
  author = {Shakeel, Saud},
  title = {NER-BERT fine-tuned for Person, Organization, and Location Extraction},
  year = {2025},
  url = {https://huggingface.co/Saud-Shakeel/ner-bert-finetuned-person-org-loc},
  note = {Fine-tuned with Hugging Face Transformers}
}

Contact
