---
license: apache-2.0
tags:
- generated_from_trainer
- token-classification
- ner
- nlp
datasets:
- conll2003
language:
- en
pipeline_tag: token-classification
library_name: transformers
base_model: bert-base-uncased
model-index:
- name: token-classification-ai-fine-tune
  results:
  - task:
      type: token-classification
      name: Named Entity Recognition (NER)
    dataset:
      name: CoNLL-2003
      type: conll2003
    metrics:
    - name: Validation Loss
      type: loss
      value: 0.0474
widget:
- text: "Apple is buying a U.K. startup for $1 billion"
---

# token-classification-ai-fine-tune

[![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-blue)](https://huggingface.co/bniladridas/token-classification-ai-fine-tune)

This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the [CoNLL-2003](https://huggingface.co/datasets/conll2003) dataset. It achieves a validation loss of **0.0474** on the evaluation set.

## Model Description

This is a token classification model fine-tuned for **Named Entity Recognition (NER)**, built on the `bert-base-uncased` architecture. It identifies entities such as people, organizations, and locations in text, and this release is optimized for CPU accessibility. Uploaded by [bniladridas](https://huggingface.co/bniladridas), it delivers strong NER performance on the CoNLL-2003 benchmark.

For a GPU-accelerated version with CUDA support, see the [GitHub repository](https://github.com/bniladridas/token-classification-ai-fine-tune).

## Intended Uses & Limitations

### Intended Uses

- Extracting named entities from unstructured text (e.g., news articles, reports)
- Powering NLP pipelines on CPU-based systems
- Research or lightweight production use

### Limitations

- Trained on English news text from CoNLL-2003, so it may not generalize well to other languages or domains
- Uses `bert-base-uncased` tokenization (lowercase only), so capitalization cues that often mark named entities are lost
- Optimized for NER; other token-classification tasks (e.g., POS tagging or chunking) would require additional fine-tuning

## Training and Evaluation Data

The model was trained and evaluated on the [CoNLL-2003](https://huggingface.co/datasets/conll2003) dataset, a standard NER benchmark of annotated English news articles labeled with entities such as persons, organizations, and locations, split into training, validation, and test sets. The metrics reported here are computed on the validation split.

## Training Procedure

### Training Hyperparameters

The following hyperparameters were used during training:

- **learning_rate**: 2e-05
- **train_batch_size**: 8
- **eval_batch_size**: 8
- **seed**: 42
- **optimizer**: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- **lr_scheduler_type**: linear
- **lr_scheduler_warmup_steps**: 500
- **num_epochs**: 3

### Training Results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.048         | 1.0   | 1756 | 0.0531          |
| 0.0251        | 2.0   | 3512 | 0.0473          |
| 0.016         | 3.0   | 5268 | 0.0474          |

### Framework Versions

- **Transformers**: 4.28.1
- **PyTorch**: 2.0.1
- **Datasets**: 1.18.3
- **Tokenizers**: 0.13.3

### Additional Notes

This version is optimized for CPU use with these intentional adjustments (sketched as a `TrainingArguments` configuration after this list):

1. **Full-precision training**: fp16 was disabled for broader hardware compatibility
2. **Streamlined batch sizes**: Set to 8 for efficient CPU processing
3. **Simplified workflow**: Gradient accumulation was skipped for smoother CPU runs
4. **Full feature set**: All monitoring (e.g., TensorBoard) and checkpoint-saving capabilities were retained
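A minimal sketch of what the setup above looks like as a Hugging Face `TrainingArguments` object, using the Transformers 4.28 API pinned above. The `output_dir` value and the per-epoch `evaluation_strategy` are assumptions for illustration, not taken from the actual training script (which lives in the GitHub repository):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="token-classification-ai-fine-tune",  # assumption: placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    lr_scheduler_type="linear",
    warmup_steps=500,
    seed=42,
    # Adam betas (0.9, 0.999) and epsilon 1e-08 are the library defaults.
    fp16=False,                     # full precision for CPU compatibility
    gradient_accumulation_steps=1,  # no gradient accumulation, as noted above
    evaluation_strategy="epoch",    # assumption: matches the per-epoch results table
    report_to="tensorboard",        # monitoring retained, as noted above
)
```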
For the GPU version with CUDA, mixed precision, and gradient accumulation, check out the [GitHub repository](https://github.com/bniladridas/token-classification-ai-fine-tune). To clone it, run:

```bash
git clone https://github.com/bniladridas/token-classification-ai-fine-tune.git
```

This model was pushed to the Hugging Face Hub for easy CPU-based deployment.
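## Usage

The model can be loaded from the Hub with the standard `transformers` pipeline API for CPU inference. This is a minimal sketch; `aggregation_strategy="simple"` is one reasonable choice for merging sub-word tokens into whole entity spans:

```python
from transformers import pipeline

# Pipelines run on CPU by default when no device is specified.
ner = pipeline(
    "token-classification",
    model="bniladridas/token-classification-ai-fine-tune",
    aggregation_strategy="simple",  # merge sub-word tokens into entity spans
)

print(ner("Apple is buying a U.K. startup for $1 billion"))
```

Each returned dictionary includes the entity group, confidence score, matched text, and character offsets.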