---
license: apache-2.0
tags:
- generated_from_trainer
- token-classification
- ner
- nlp
datasets:
- conll2003
language:
- en
pipeline_tag: token-classification
library_name: transformers
base_model: bert-base-uncased
model-index:
- name: harpertokenNER
results:
- task:
type: token-classification
name: Named Entity Recognition (NER)
dataset:
name: CoNLL-2003
type: conll2003
metrics:
- name: Validation Loss
type: loss
value: 0.0474
widget:
- text: "Apple is buying a U.K. startup for $1 billion"
---
# harpertokenNER
This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the [CoNLL-2003](https://huggingface.co/datasets/eriktks/conll2003) dataset. It achieves a final validation loss of **0.0474**.
## Model Description
This is a token classification model fine-tuned for **Named Entity Recognition (NER)** on the CoNLL-2003 dataset, built on the `bert-base-uncased` architecture. It identifies entities such as people, organizations, and locations in text, and is optimized for CPU use. Uploaded by [harpertoken](https://huggingface.co/harpertoken).
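A minimal usage sketch with the `transformers` pipeline API. The hub ID `harpertoken/harpertokenNER` is an assumption inferred from the uploader and model name; adjust it if your copy lives elsewhere.
```python
from transformers import pipeline

# Hub ID is an assumption based on the uploader and model name.
ner = pipeline(
    "token-classification",
    model="harpertoken/harpertokenNER",
    aggregation_strategy="simple",  # merge sub-word pieces into whole entities
)

print(ner("Apple is buying a U.K. startup for $1 billion"))
# Returns a list of dicts with entity_group, score, word, start, and end.
```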
## Intended Uses & Limitations
### Intended Uses
- Extracting named entities from unstructured text (e.g., news articles, reports)
- Powering NLP pipelines on CPU-based systems (see the CPU-only sketch after this list)
- Research or lightweight production use
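For the CPU use case, here is a sketch of explicit loading and inference without a GPU, again assuming the hypothetical hub ID above:
```python
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

model_id = "harpertoken/harpertokenNER"  # assumed hub ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id).to("cpu").eval()

inputs = tokenizer("Apple is buying a U.K. startup for $1 billion", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Map each token's highest-scoring class index back to its label string.
for token, pred in zip(
    tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]),
    logits.argmax(dim=-1)[0],
):
    print(token, model.config.id2label[pred.item()])
```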
### Limitations
- Trained on English news text from CoNLL-2003, so it may not generalize well to other languages or domains
- Uses `bert-base-uncased` tokenization, which lowercases all input, so capitalization cues (often strong signals for NER) are discarded (demonstrated below)
- Optimized for NER; other token-classification tasks would require additional fine-tuning
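A quick illustration of the lowercase limitation: the `bert-base-uncased` tokenizer collapses differently cased inputs to identical tokens.
```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
# Both forms yield the same tokens, so casing information is lost.
print(tok.tokenize("Apple is buying"))   # ['apple', 'is', 'buying']
print(tok.tokenize("APPLE IS BUYING"))   # ['apple', 'is', 'buying']
```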
## Training and Evaluation Data
The model was trained and evaluated on the *CoNLL-2003 dataset*, a standard NER benchmark of annotated English news articles with entities such as persons, organizations, and locations, split into training, validation, and test sets. The metrics reported here reflect the validation split.
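A sketch of loading the dataset with the `datasets` library, using the Hub ID linked above (with older releases such as the Datasets 1.18.3 listed below, the un-namespaced ID `conll2003` was used instead):
```python
from datasets import load_dataset

dataset = load_dataset("eriktks/conll2003")
print(dataset)  # DatasetDict with train / validation / test splits

example = dataset["train"][0]
print(example["tokens"])    # word-level tokens
print(example["ner_tags"])  # integer NER labels (O, B-PER, I-PER, B-ORG, ...)
```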
## Training Procedure
### Training Hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` reconstruction follows the list):
- **learning_rate**: 2e-05
- **train_batch_size**: 8
- **eval_batch_size**: 8
- **seed**: 42
- **optimizer**: Adam with betas=(0.9,0.999) and epsilon=1e-08
- **lr_scheduler_type**: linear
- **lr_scheduler_warmup_steps**: 500
- **num_epochs**: 3
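The training script is not included with this card, but the hyperparameters above map onto `transformers.TrainingArguments` roughly as follows (a sketch, not the author's actual configuration):
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="harpertokenNER",  # placeholder output path
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=3,
    evaluation_strategy="epoch",  # assumed: the table below reports per-epoch validation loss
    adam_beta1=0.9,               # Adam settings listed above (the transformers defaults)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```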
### Training Results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.048 | 1.0 | 1756 | 0.0531 |
| 0.0251 | 2.0 | 3512 | 0.0473 |
| 0.016 | 3.0 | 5268 | 0.0474 |
### Framework Versions
- **Transformers**: 4.28.1
- **PyTorch**: 2.0.1
- **Datasets**: 1.18.3
- **Tokenizers**: 0.13.3