---
license: apache-2.0
tags:
- generated_from_trainer
- token-classification
- ner
- nlp
datasets:
- conll2003
language:
- en
pipeline_tag: token-classification
library_name: transformers
base_model: bert-base-uncased
model-index:
- name: token-classification-ai-fine-tune
  results:
  - task:
      type: token-classification
      name: Named Entity Recognition (NER)
    dataset:
      name: CoNLL-2003
      type: conll2003
    metrics:
    - name: Validation Loss
      type: loss
      value: 0.0474
widget:
- text: "Apple is buying a U.K. startup for $1 billion"
---

# token-classification-ai-fine-tune

[![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-blue)](https://huggingface.co/bniladridas/token-classification-ai-fine-tune)

This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the [CoNLL-2003](https://huggingface.co/datasets/conll2003) dataset. It achieves a validation loss of **0.0474** on the evaluation set.

## Model Description

This is a token classification model fine-tuned for **Named Entity Recognition (NER)**, built on the `bert-base-uncased` architecture. It identifies entities such as people, organizations, and locations in text, and this release is optimized for CPU accessibility. Uploaded by [bniladridas](https://huggingface.co/bniladridas), it delivers strong NER performance on the CoNLL-2003 benchmark.

For a GPU-accelerated version with CUDA support, see the [GitHub repository](https://github.com/bniladridas/token-classification-ai-fine-tune).

## Intended Uses & Limitations

### Intended Uses

- Extracting named entities from unstructured text (e.g., news articles, reports)
- Powering NLP pipelines on CPU-based systems
- Research or lightweight production use

### Limitations

- Trained on English news text from CoNLL-2003, so it may not generalize well to other languages or domains
- Uses `bert-base-uncased` tokenization (lowercase only), so capitalization cues that often mark named entities are lost
- Optimized for NER; other token-classification tasks (e.g., POS tagging or chunking) would require additional fine-tuning

## Training and Evaluation Data

The model was trained and evaluated on the [CoNLL-2003](https://huggingface.co/datasets/conll2003) dataset, a standard NER benchmark of annotated English news articles labeled with entities such as persons, organizations, and locations, split into training, validation, and test sets. The metrics reported here are computed on the validation split.

## Training Procedure

### Training Hyperparameters

The following hyperparameters were used during training:

- **learning_rate**: 2e-05
- **train_batch_size**: 8
- **eval_batch_size**: 8
- **seed**: 42
- **optimizer**: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- **lr_scheduler_type**: linear
- **lr_scheduler_warmup_steps**: 500
- **num_epochs**: 3

### Training Results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.048         | 1.0   | 1756 | 0.0531          |
| 0.0251        | 2.0   | 3512 | 0.0473          |
| 0.016         | 3.0   | 5268 | 0.0474          |

### Framework Versions

- **Transformers**: 4.28.1
- **PyTorch**: 2.0.1
- **Datasets**: 1.18.3
- **Tokenizers**: 0.13.3

### Additional Notes

This version is optimized for CPU use with these intentional adjustments (sketched as a `TrainingArguments` configuration after this list):

1. **Full-precision training**: fp16 was disabled for broader hardware compatibility
2. **Streamlined batch sizes**: Set to 8 for efficient CPU processing
3. **Simplified workflow**: Gradient accumulation was skipped for smoother CPU runs
4. **Full feature set**: All monitoring (e.g., TensorBoard) and checkpoint-saving capabilities were retained
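A minimal sketch of what the setup above looks like as a Hugging Face `TrainingArguments` object, using the Transformers 4.28 API pinned above. The `output_dir` value and the per-epoch `evaluation_strategy` are assumptions for illustration, not taken from the actual training script (which lives in the GitHub repository):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="token-classification-ai-fine-tune",  # assumption: placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    lr_scheduler_type="linear",
    warmup_steps=500,
    seed=42,
    # Adam betas (0.9, 0.999) and epsilon 1e-08 are the library defaults.
    fp16=False,                     # full precision for CPU compatibility
    gradient_accumulation_steps=1,  # no gradient accumulation, as noted above
    evaluation_strategy="epoch",    # assumption: matches the per-epoch results table
    report_to="tensorboard",        # monitoring retained, as noted above
)
```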
For the GPU version with CUDA, mixed precision, and gradient accumulation, check out the [GitHub repository](https://github.com/bniladridas/token-classification-ai-fine-tune). To clone it, run:

```bash
git clone https://github.com/bniladridas/token-classification-ai-fine-tune.git
```

This model was pushed to the Hugging Face Hub for easy CPU-based deployment.
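## Usage

The model can be loaded from the Hub with the standard `transformers` pipeline API for CPU inference. This is a minimal sketch; `aggregation_strategy="simple"` is one reasonable choice for merging sub-word tokens into whole entity spans:

```python
from transformers import pipeline

# Pipelines run on CPU by default when no device is specified.
ner = pipeline(
    "token-classification",
    model="bniladridas/token-classification-ai-fine-tune",
    aggregation_strategy="simple",  # merge sub-word tokens into entity spans
)

print(ner("Apple is buying a U.K. startup for $1 billion"))
```

Each returned dictionary includes the entity group, confidence score, matched text, and character offsets.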