# Custom BERT NER Model

This repository contains a BERT-based Named Entity Recognition (NER) model fine-tuned on the CoNLL-2003 dataset. The model identifies common named entity types such as persons, organizations, locations, and miscellaneous entities.
---
## Model Details

- **Model architecture:** BERT (bert-base-cased)
- **Task:** Token classification / Named Entity Recognition (NER)
- **Training data:** CoNLL-2003 dataset (~14,000 training samples)
- **Number of epochs:** 5
- **Framework:** Hugging Face Transformers + Datasets
- **Device:** CUDA-enabled GPU for training and inference
- **WandB:** Disabled during training
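CoNLL-2003 uses BIO tagging over four entity types, so the model's classification head covers nine labels. The sketch below enumerates that tag set; note the id-to-label order is illustrative, and the authoritative mapping for this checkpoint is `model.config.id2label`.

```python
# CoNLL-2003 BIO tag set: "O" plus B-/I- tags for the four entity types.
# The order here is illustrative; read model.config.id2label for the
# checkpoint's actual id -> label mapping.
ENTITY_TYPES = ["PER", "ORG", "LOC", "MISC"]

CONLL_LABELS = ["O"] + [
    f"{prefix}-{etype}" for etype in ENTITY_TYPES for prefix in ("B", "I")
]

print(CONLL_LABELS)
# 9 labels: O, B-PER, I-PER, B-ORG, I-ORG, B-LOC, I-LOC, B-MISC, I-MISC
```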
---
## Usage

You can use this model for token classification to identify named entities in your text.

### Installation

```bash
pip install transformers datasets torch
```
### Load the model and tokenizer

```python
from transformers import BertTokenizerFast, BertForTokenClassification
import torch

model_name_or_path = "AventIQ-AI/Custom-BERT-NER-Model"
tokenizer = BertTokenizerFast.from_pretrained(model_name_or_path)
model = BertForTokenClassification.from_pretrained(model_name_or_path)
model.to("cuda" if torch.cuda.is_available() else "cpu")
model.eval()
```
### Example inference

```python
text = "Hi, I am Deepak and I am living in Delhi."
tokens = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model(**tokens)

predictions = torch.argmax(outputs.logits, dim=2)
labels = [model.config.id2label[p.item()] for p in predictions[0]]

# Use the tokenizer's own input_ids so tokens and labels stay aligned,
# including the special [CLS] and [SEP] tokens the model also predicts over.
for token, label in zip(
    tokenizer.convert_ids_to_tokens(tokens["input_ids"][0]), labels
):
    print(f"{token}: {label}")
```
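The per-token output above still splits entities across WordPiece subwords and BIO tags. A minimal, self-contained sketch of merging BIO-tagged tokens into entity spans (the token/label sequence below is illustrative, not verified model output):

```python
def detok(pieces):
    """Rejoin WordPiece tokens: '##'-prefixed pieces attach to the previous one."""
    text = ""
    for p in pieces:
        text += p[2:] if p.startswith("##") else ((" " + p) if text else p)
    return text

def merge_bio(tokens, labels):
    """Merge BIO-tagged tokens into (entity_text, entity_type) spans."""
    entities, current, current_type = [], [], None
    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            if current:
                entities.append((detok(current), current_type))
            current, current_type = [token], label[2:]
        elif label.startswith("I-") and current:
            current.append(token)
        else:  # "O" (or a stray I- tag) closes any open entity
            if current:
                entities.append((detok(current), current_type))
            current, current_type = [], None
    if current:
        entities.append((detok(current), current_type))
    return entities

# Illustrative tokens/labels (not actual model output):
tokens = ["Hi", ",", "I", "am", "Dee", "##pak", "and", "live", "in", "Delhi", "."]
labels = ["O", "O", "O", "O", "B-PER", "I-PER", "O", "O", "O", "B-LOC", "O"]
print(merge_bio(tokens, labels))  # → [('Deepak', 'PER'), ('Delhi', 'LOC')]
```

Alternatively, the Transformers `pipeline("ner", model=..., tokenizer=..., aggregation_strategy="simple")` helper performs this grouping automatically.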
## Training Details

- Dataset: CoNLL-2003, loaded via the Hugging Face `datasets` library
- Optimizer: AdamW
- Learning rate: 5e-5
- Batch size: 16
- Max sequence length: 128
- Epochs: 5
- Evaluation: performed on the validation split (if applicable)
- Quantization: optionally applied post-training to reduce model size
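The optional post-training quantization step can be sketched with PyTorch dynamic quantization. The `toy_model` below is an illustrative stand-in so the example runs without downloading the checkpoint; quantizing the real model works the same way by passing it in place of `toy_model`.

```python
import torch
import torch.nn as nn

# Stand-in for the fine-tuned BERT model; dynamic quantization targets
# the nn.Linear layers, which dominate BERT's parameter count.
toy_model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 9))
toy_model.eval()

quantized = torch.quantization.quantize_dynamic(
    toy_model, {nn.Linear}, dtype=torch.qint8  # int8 weights, float activations
)

# The quantized model is a drop-in replacement for inference.
out = quantized(torch.randn(1, 128))
print(out.shape)  # torch.Size([1, 9])
```

Dynamic quantization trades a small accuracy loss for a roughly 4x reduction in weight storage for the quantized layers, with no retraining required.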
## Limitations

- The model may not generalize well to unseen entity types or domains outside CoNLL-2003.
- It can occasionally mislabel entities, especially rare or new names.
- A CUDA-enabled GPU is required for efficient training and inference.