---
language: en
license: mit
base_model: roberta-base
tags:
- token-classification
- ner
- named-entity-recognition
datasets:
- conll2003
metrics:
- f1
- precision
- recall
- accuracy
model-index:
- name: RoBERTa-base-NER-CoNLL2003
  results:
  - task:
      type: token-classification
      name: Named Entity Recognition
    dataset:
      type: conll2003
      name: CoNLL-2003 (English)
    metrics:
    - type: f1
      value: 95.99
---

## Model description

This model is a fine-tuned version of `roberta-base` for Named Entity Recognition (NER), trained on the CoNLL-2003 dataset. It identifies four entity types: Persons (PER), Organizations (ORG), Locations (LOC), and Miscellaneous (MISC).

## Training procedure

* **Hardware:** NVIDIA V100 GPU
* **Optimizer:** AdamW
* **Learning Rate:** 2e-5
* **Batch Size:** 16
* **Weight Decay:** 0.01
* **Epochs:** 5
* **Mixed Precision Training:** FP16 enabled

## Evaluation Results

| Metric | Value |
| :--- | :--- |
| **F1 Score** | **95.99%** |
| **Precision** | **95.61%** |
| **Recall** | **96.38%** |
| **Accuracy** | **99.29%** |
| **Eval Loss** | **0.0464** |

## How to use

```python
from transformers import pipeline

model_id = "learnrr/roberta-NER-conll2003"

# Build the NER pipeline; aggregation_strategy="simple" merges sub-word tokens
# into whole entities and provides the "entity_group" key used below.
nlp = pipeline("ner", model=model_id, aggregation_strategy="simple")

text = "Apple is looking at buying U.K. startup for $1 billion"
results = nlp(text)

for entity in results:
    print(f"entity: {entity['word']} | class: {entity['entity_group']} | confidence: {entity['score']:.4f}")
```
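
The usage example reads the `entity_group`, `word`, and `score` keys from each result. As a minimal sketch of one way to post-process such output, the helper below groups entities by type and drops low-confidence spans. The function name `group_entities`, the `min_score` parameter, and the sample records are illustrative assumptions, not part of this model or of `transformers`; the sample records are hand-written placeholders, not real model output.

```python
from collections import defaultdict


def group_entities(results, min_score=0.0):
    """Group entity dicts by type, keeping only spans with score >= min_score.

    Hypothetical helper: assumes each item has "entity_group", "word",
    and "score" keys, as read by the usage example.
    """
    grouped = defaultdict(list)
    for entity in results:
        if entity["score"] >= min_score:
            grouped[entity["entity_group"]].append(entity["word"])
    return dict(grouped)


# Hand-written placeholder records (NOT real model output), used only to
# show the expected input shape.
sample = [
    {"entity_group": "ORG", "word": "Apple", "score": 0.99},
    {"entity_group": "LOC", "word": "U.K.", "score": 0.40},
]

print(group_entities(sample, min_score=0.5))
```

With real pipeline output, `group_entities(results, min_score=0.5)` could be called directly on the `results` list; the threshold is a judgment call that depends on how costly false positives are in the target application.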