---
library_name: transformers
tags:
- ner
- bert
- token-classification
- huggingface
- news-article
- named-entity-recognition
---
# NER BERT Fine-tuned on Persons, Organizations, and Locations
This model is a fine-tuned version of bert-base-cased on a custom dataset of news articles annotated with named entities of type Person, Organization, and Location. It is designed to identify these entities from raw article text.
## Model Details

### Model Description
- Developed by: Saud Shakeel
- Model type: Transformer-based BERT model
- Language(s): English
- License: Apache 2.0
- Finetuned from model: bert-base-cased
- Task: Named Entity Recognition (NER) for `PER`, `ORG`, and `LOC`
### Model Sources
## Uses

### Direct Use
- Extracts `Person`, `Organization`, and `Location` entities from English news text.
- Suitable for downstream NLP pipelines in journalism, content moderation, fact-checking, and search indexing.
### How to Use
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

model = AutoModelForTokenClassification.from_pretrained("Saud-Shakeel/ner-bert-finetuned-person-org-loc")
tokenizer = AutoTokenizer.from_pretrained("Saud-Shakeel/ner-bert-finetuned-person-org-loc")

nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

text = "President Joe Biden met with officials from the United Nations in New York."
entities = nlp(text)
print(entities)
```
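With `aggregation_strategy="simple"`, the pipeline merges word pieces and returns one dictionary per entity. A common follow-up step is grouping the results by entity type; the sketch below uses a hand-written `entities` list that mimics the pipeline's aggregated output format, so it runs without downloading the model.

```python
from collections import defaultdict

# Sample output in the shape returned by the NER pipeline with
# aggregation_strategy="simple" (scores shown are illustrative).
entities = [
    {"entity_group": "PER", "word": "Joe Biden", "score": 0.99},
    {"entity_group": "ORG", "word": "United Nations", "score": 0.98},
    {"entity_group": "LOC", "word": "New York", "score": 0.99},
]

# Collect entity mentions under their predicted type.
by_type = defaultdict(list)
for ent in entities:
    by_type[ent["entity_group"]].append(ent["word"])

print(dict(by_type))
# {'PER': ['Joe Biden'], 'ORG': ['United Nations'], 'LOC': ['New York']}
```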
### Out-of-Scope Use
- Not suitable for non-English text
- Not designed to extract entities beyond `PER`, `ORG`, and `LOC`
## Training Details

### Training Data
- Custom-labeled dataset of 643 news articles with `persons`, `organizations`, and `locations` fields
- Multi-word entities preserved for exact span matching
### Preprocessing
- Articles were cleaned to extract main body text
- Labels converted to BIO format per token using spaCy token boundaries
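The BIO conversion can be illustrated with a short sketch. The actual pipeline aligns labels to spaCy token boundaries; the example below uses whitespace tokenization and a hypothetical `bio_tags` helper purely to show the tagging scheme, so it stays self-contained.

```python
def bio_tags(tokens, entities):
    """Assign B-/I-/O tags given entity spans as (start, end, label),
    where indices are token positions and `end` is exclusive."""
    tags = ["O"] * len(tokens)
    for start, end, label in entities:
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):     # remaining tokens, if any
            tags[i] = f"I-{label}"
    return tags

tokens = "Joe Biden met officials in New York".split()
# "Joe Biden" -> PER (tokens 0-1), "New York" -> LOC (tokens 5-6)
entities = [(0, 2, "PER"), (5, 7, "LOC")]
print(bio_tags(tokens, entities))
# ['B-PER', 'I-PER', 'O', 'O', 'O', 'B-LOC', 'I-LOC']
```

Keeping multi-word entities intact (as noted under Training Data) matters here: only the first token gets a `B-` tag, so a split entity would produce spurious entity boundaries.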
### Training Procedure
- Model: `bert-base-cased`
- Optimizer: AdamW
- Epochs: 5
- Batch size: 8
- Max length: 512 tokens
- Trained using the Hugging Face `Trainer` API with `seqeval` for evaluation
## Evaluation

### Testing Data & Metrics
- Held-out portion of the labeled dataset
- Evaluation metrics: Precision, Recall, and F1 score (computed with `seqeval`)
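`seqeval` scores at the entity level: a prediction counts as correct only when both the span and the type match exactly. The sketch below is not seqeval itself (which works from BIO tag sequences); it is a minimal illustration of that matching rule, operating directly on sets of spans.

```python
def span_f1(true_spans, pred_spans):
    """Entity-level precision/recall/F1 over sets of (start, end, label) spans.
    A span is a true positive only if boundaries AND label both match."""
    tp = len(true_spans & pred_spans)
    precision = tp / len(pred_spans) if pred_spans else 0.0
    recall = tp / len(true_spans) if true_spans else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

gold = {(0, 2, "PER"), (5, 7, "LOC")}
pred = {(0, 2, "PER"), (5, 6, "LOC")}  # LOC boundary wrong -> not counted
print(span_f1(gold, pred))
# (0.5, 0.5, 0.5)
```

This strictness is why multi-word entities were preserved in the training data: a partially matched span earns no credit.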
### Results (Sample)
| Entity Type | Precision | Recall | F1 Score |
|---|---|---|---|
| PER | 0.89 | 0.84 | 0.86 |
| ORG | 0.85 | 0.79 | 0.82 |
| LOC | 0.91 | 0.87 | 0.89 |
## Environmental Impact
- Hardware Type: NVIDIA Tesla T4
- Training Time: ~10 minutes
- Compute Region: Google Colab (us-central1)
- Carbon Emitted: Low (< 0.05 kgCO2e, estimated via ML CO2 calculator)
## Citation
```bibtex
@misc{saud2025ner,
  author = {Shakeel, Saud},
  title  = {NER-BERT fine-tuned for Person, Organization, and Location Extraction},
  year   = {2025},
  url    = {https://huggingface.co/Saud-Shakeel/ner-bert-finetuned-person-org-loc},
  note   = {Fine-tuned with Hugging Face Transformers}
}
```
## Contact
- Author: Muhammad Saud Shakeel
- Email: saudshakeel.ml@gmail.com
- HuggingFace Profile: https://huggingface.co/Saud-Shakeel