library_name: transformers
tags:

  • ner
  • bert
  • token-classification
  • huggingface
  • news-article
  • named-entity-recognition

NER BERT Fine-tuned on Persons, Organizations, and Locations

This model is a fine-tuned version of bert-base-cased on a custom dataset of news articles annotated with named entities of type Person, Organization, and Location. It is designed to identify these entities from raw article text.


Model Details

Model Description

  • Developed by: Saud Shakeel
  • Model type: Transformer-based BERT model
  • Language(s): English
  • License: Apache 2.0
  • Finetuned from model: bert-base-cased
  • Task: Named Entity Recognition (NER) for PER, ORG, and LOC

Model Sources


Uses

Direct Use

  • Extracts Person, Organization, and Location entities from English news text.
  • Suitable for downstream NLP pipelines in journalism, content moderation, fact-checking, and search indexing.

How to Use

from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Load the fine-tuned model and its tokenizer from the Hub
model = AutoModelForTokenClassification.from_pretrained("Saud-Shakeel/ner-bert-finetuned-person-org-loc")
tokenizer = AutoTokenizer.from_pretrained("Saud-Shakeel/ner-bert-finetuned-person-org-loc")

# "simple" aggregation merges word-piece tokens back into whole-word entities
nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

text = "President Joe Biden met with officials from the United Nations in New York."
entities = nlp(text)
print(entities)
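With `aggregation_strategy="simple"`, the pipeline returns one dict per detected entity with `entity_group`, `score`, `word`, `start`, and `end` keys. A sketch of typical post-processing, using a mocked `entities` list in that shape (the scores and offsets below are illustrative, not actual model output):

```python
from collections import defaultdict

# Mocked pipeline output in the aggregated format described above
entities = [
    {"entity_group": "PER", "score": 0.998, "word": "Joe Biden", "start": 10, "end": 19},
    {"entity_group": "ORG", "score": 0.991, "word": "United Nations", "start": 48, "end": 62},
    {"entity_group": "LOC", "score": 0.995, "word": "New York", "start": 66, "end": 74},
]

def group_by_type(entities, min_score=0.5):
    """Bucket entity mentions by type, dropping low-confidence spans."""
    grouped = defaultdict(list)
    for ent in entities:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)

print(group_by_type(entities))
# {'PER': ['Joe Biden'], 'ORG': ['United Nations'], 'LOC': ['New York']}
```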

Out-of-Scope Use

  • Not suitable for non-English text
  • Not designed to extract entities beyond PER, ORG, and LOC

Training Details

Training Data

  • Custom-labeled dataset of 643 news articles annotated with person, organization, and location fields
  • Multi-word entities preserved for exact span matching

Preprocessing

  • Articles were cleaned to extract main body text
  • Labels converted to BIO format per token using spaCy token boundaries
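The BIO conversion above can be sketched as follows. This is an illustration only: a plain whitespace split stands in for the spaCy tokenizer the card mentions, and the span format is an assumption.

```python
def spans_to_bio(tokens, spans):
    """Convert entity spans into per-token BIO tags.

    spans: list of (start_token, end_token_exclusive, label) tuples.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # continuation tokens
    return tags

tokens = "Joe Biden met United Nations officials".split()
print(spans_to_bio(tokens, [(0, 2, "PER"), (3, 5, "ORG")]))
# ['B-PER', 'I-PER', 'O', 'B-ORG', 'I-ORG', 'O']
```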

Training Procedure

  • Model: bert-base-cased
  • Optimizer: AdamW
  • Epochs: 5
  • Batch size: 8
  • Max length: 512 tokens
  • Trained using Hugging Face Trainer API with seqeval for evaluation
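The hyperparameters above map onto the Trainer API roughly as follows. This is a hedged sketch, not the author's actual training script: `output_dir` is a placeholder, and the model/dataset objects are elided.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="ner-bert-finetuned-person-org-loc",  # placeholder path
    num_train_epochs=5,             # Epochs: 5
    per_device_train_batch_size=8,  # Batch size: 8
    optim="adamw_torch",            # Optimizer: AdamW (Trainer's default)
)
# Trainer(model=model, args=args, train_dataset=..., eval_dataset=...,
#         compute_metrics=...)  # compute_metrics wraps seqeval for evaluation
```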

Evaluation

Testing Data & Metrics

  • Held-out portion of the labeled dataset
  • Evaluation metrics: Precision, Recall, and F1 score (computed with seqeval)
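seqeval scores at the entity level: spans are extracted from the BIO tags, and a prediction counts as correct only on an exact span-and-label match. A minimal pure-Python illustration of that scoring rule (not the seqeval library itself):

```python
def extract_entities(tags):
    """Collect (label, start, end) spans from a BIO tag sequence."""
    spans, i = [], 0
    while i < len(tags):
        if tags[i].startswith("B-"):
            label = tags[i][2:]
            j = i + 1
            while j < len(tags) and tags[j] == f"I-{label}":
                j += 1
            spans.append((label, i, j))
            i = j
        else:
            i += 1
    return spans

def entity_f1(true_tags, pred_tags):
    """Entity-level F1: exact span + label matches only."""
    gold = set(extract_entities(true_tags))
    pred = set(extract_entities(pred_tags))
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

true = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "I-PER", "O", "B-ORG"]
print(entity_f1(true, pred))  # 0.5: the PER span matches, the mislabeled span does not
```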

Results (Sample)

| Entity Type | Precision | Recall | F1 Score |
|-------------|-----------|--------|----------|
| PER         | 0.89      | 0.84   | 0.86     |
| ORG         | 0.85      | 0.79   | 0.82     |
| LOC         | 0.91      | 0.87   | 0.89     |

Environmental Impact

  • Hardware Type: NVIDIA Tesla T4
  • Training Time: ~10 minutes
  • Compute Region: Google Colab (us-central1)
  • Carbon Emitted: Low (< 0.05 kgCO2e, estimated via ML CO2 calculator)

Citation

@misc{saud2025ner,
  author = {Shakeel, Saud},
  title = {NER-BERT fine-tuned for Person, Organization, and Location Extraction},
  year = {2025},
  url = {https://huggingface.co/Saud-Shakeel/ner-bert-finetuned-person-org-loc},
  note = {Fine-tuned with Hugging Face Transformers}
}

Contact
