---
license: apache-2.0
tags:
- generated_from_trainer
- token-classification
- ner
- nlp
datasets:
- conll2003
language:
- en
pipeline_tag: token-classification
library_name: transformers
base_model: bert-base-uncased
model-index:
- name: token-classification-ai-fine-tune
  results:
  - task:
      type: token-classification
      name: Named Entity Recognition (NER)
    dataset:
      name: CoNLL-2003
      type: conll2003
    metrics:
    - name: Validation Loss
      type: loss
      value: 0.0474
widget:
- text: "Apple is buying a U.K. startup for $1 billion"
---
# token-classification-ai-fine-tune
[![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-blue)](https://huggingface.co/bniladridas/token-classification-ai-fine-tune)
This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the [CoNLL-2003](https://huggingface.co/datasets/conll2003) dataset, achieving a validation loss of **0.0474**.
## Model Description
This is a token-classification model fine-tuned for **Named Entity Recognition (NER)** on top of the `bert-base-uncased` architecture. It identifies entities such as persons, organizations, and locations in English text, and this upload is tuned for CPU-friendly deployment. Uploaded by [bniladridas](https://huggingface.co/bniladridas), it performs well on the CoNLL-2003 benchmark. For a GPU-accelerated version with CUDA support, see the [GitHub repository](https://github.com/bniladridas/token-classification-ai-fine-tune).
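A quick way to try the model is through the `transformers` pipeline API. A minimal sketch (the example sentence is the widget example from the metadata above):
```python
from transformers import pipeline

# Load the fine-tuned NER model from the Hub; runs on CPU by default.
ner = pipeline(
    "token-classification",
    model="bniladridas/token-classification-ai-fine-tune",
    aggregation_strategy="simple",  # merge word-piece tokens into whole entities
)

# Returns a list of dicts with entity_group, score, word, start, and end.
print(ner("Apple is buying a U.K. startup for $1 billion"))
```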
## Intended Uses & Limitations
### Intended Uses
- Extracting named entities from unstructured text (e.g., news articles, reports)
- Powering NLP pipelines on CPU-based systems
- Research or lightweight production use
### Limitations
- Trained on English text from CoNLL-2003, so it may not generalize well to other languages or domains
- Uses `bert-base-uncased` tokenization (lowercase-only), so casing cues such as capitalized proper nouns are lost (see the example after this list)
- Optimized for NER; additional tuning needed for other token-classification tasks
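To see the casing limitation concretely: `bert-base-uncased` lowercases text before tokenizing, so capitalization never reaches the model. A minimal sketch:
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Both spellings produce identical tokens, so the case cue is lost.
print(tokenizer.tokenize("Apple unveiled a new phone"))   # ['apple', 'unveiled', 'a', 'new', 'phone']
print(tokenizer.tokenize("apple unveiled a new phone"))   # ['apple', 'unveiled', 'a', 'new', 'phone']
```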
## Training and Evaluation Data
The model was trained and evaluated on the [CoNLL-2003](https://huggingface.co/datasets/conll2003) dataset, a standard NER benchmark of annotated English news articles covering persons, organizations, locations, and miscellaneous names, split into training, validation, and test sets. The metrics reported here come from the validation split.
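The dataset loads directly with the `datasets` library; a minimal sketch:
```python
from datasets import load_dataset

dataset = load_dataset("conll2003")

# Standard splits: train / validation / test.
print(dataset)

# NER labels use the IOB2 scheme: O, B-PER, I-PER, B-ORG, I-ORG, B-LOC, I-LOC, B-MISC, I-MISC.
print(dataset["train"].features["ner_tags"].feature.names)
```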
## Training Procedure
### Training Hyperparameters
The following hyperparameters were used during training (see the `TrainingArguments` sketch after the list):
- **learning_rate**: 2e-05
- **train_batch_size**: 8
- **eval_batch_size**: 8
- **seed**: 42
- **optimizer**: Adam with betas=(0.9,0.999) and epsilon=1e-08
- **lr_scheduler_type**: linear
- **lr_scheduler_warmup_steps**: 500
- **num_epochs**: 3
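These settings map directly onto `transformers.TrainingArguments`. A minimal sketch, with the listed Adam betas and epsilon spelled out (they are also the library defaults) and per-epoch evaluation assumed from the results table below:
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="token-classification-ai-fine-tune",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=3,
    adam_beta1=0.9,      # Adam betas/epsilon as listed above (also the defaults)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",  # assumption: inferred from the per-epoch losses below
)
```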
### Training Results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.048 | 1.0 | 1756 | 0.0531 |
| 0.0251 | 2.0 | 3512 | 0.0473 |
| 0.016 | 3.0 | 5268 | 0.0474 |
### Framework Versions
- **Transformers**: 4.28.1
- **PyTorch**: 2.0.1
- **Datasets**: 1.18.3
- **Tokenizers**: 0.13.3
### Additional Notes
This version is optimized for CPU use with these intentional adjustments (sketched in code after the list):
1. **Full-precision training**: fp16 disabled for broader CPU compatibility
2. **Streamlined batch sizes**: Set to 8 for efficient CPU processing
3. **Simplified workflow**: Gradient accumulation skipped for smoother CPU runs
4. **Full feature set**: All monitoring (e.g., TensorBoard) and checkpoint-saving capabilities retained
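In `TrainingArguments` terms, those adjustments amount to roughly the following. This is a hedged sketch, not the original training script, which is not published here:
```python
from transformers import TrainingArguments

cpu_args = TrainingArguments(
    output_dir="token-classification-ai-fine-tune",
    fp16=False,                     # 1. full-precision training, no mixed precision
    per_device_train_batch_size=8,  # 2. streamlined batch size for CPU
    gradient_accumulation_steps=1,  # 3. no gradient accumulation
    report_to=["tensorboard"],      # 4. keep TensorBoard monitoring
    save_strategy="epoch",          # 4. keep checkpoint saving (per-epoch is an assumption)
)
```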
For the GPU version with CUDA, mixed precision, and gradient accumulation, check out the [GitHub repository](https://github.com/bniladridas/token-classification-ai-fine-tune). To clone it, run:
```bash
git clone https://github.com/bniladridas/token-classification-ai-fine-tune.git
```
This model was pushed to the Hugging Face Hub for easy CPU-based deployment.