Create README.md
## Model description
This model is a fine-tuned version of roberta-base for the Named Entity Recognition (NER) task using the CoNLL-2003 dataset. It can identify four types of entities: Persons (PER), Organizations (ORG), Locations (LOC), and Miscellaneous (MISC).
## Training procedure
* **Hardware:** NVIDIA V100 GPU
* **Optimizer:** AdamW
* **Learning Rate:** 2e-5
* **Batch Size:** 16
* **Weight Decay:** 0.01
* **Epochs:** 5
* **Mixed Precision Training:** FP16 enabled
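
The settings above could be expressed with the Hugging Face `Trainer` API roughly as follows. This is a minimal sketch, not the original training script: the use of `TrainingArguments`, the reading of the batch size as per-device (one V100), and the `output_dir` name are all assumptions.

```python
# Hyperparameters copied from the list above; key names follow
# transformers.TrainingArguments (an assumption about the training setup).
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumes one GPU, so 16 is the full batch
    "weight_decay": 0.01,
    "num_train_epochs": 5,
    "fp16": True,  # mixed precision; requires a CUDA device at training time
}


def build_training_args(output_dir="roberta-ner-conll2003"):
    """Return TrainingArguments for the settings above (output_dir is hypothetical)."""
    # Imported lazily so the settings can be inspected without transformers installed.
    from transformers import TrainingArguments

    # Trainer's default optimizer is AdamW, so no explicit optimizer setting is needed.
    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

Calling `build_training_args()` on a machine with a CUDA GPU returns an object that can be passed to `Trainer(args=...)`.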
## Evaluation results
| Metric | Value |
| :--- | :--- |
| **F1 Score** | **95.99%** |
| **Precision** | **95.61%** |
| **Recall** | **96.38%** |
| **Accuracy** | **99.29%** |
| **Eval Loss** | **0.0464** |
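
As a quick consistency check on the table, F1 is the harmonic mean of precision and recall, so the reported F1 can be recomputed from the other two rows. The snippet below is only arithmetic on the reported values; it does not re-run any evaluation.

```python
# Reported values from the table above, in percent.
precision = 95.61
recall = 96.38

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"F1 = {f1:.2f}%")  # prints "F1 = 95.99%"
```

The recomputed value agrees with the reported F1 score to two decimal places.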
## How to use
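```python
from transformers import pipeline

model_id = "learnrr/roberta-NER-conll2003"

# Build a token-classification pipeline. aggregation_strategy="simple" merges
# sub-word tokens into whole entities and adds the 'entity_group' key used below.
nlp = pipeline("token-classification", model=model_id, aggregation_strategy="simple")

text = "Apple is looking at buying U.K. startup for $1 billion"
results = nlp(text)

for entity in results:
    print(f"entity: {entity['word']} | class: {entity['entity_group']} | confidence: {entity['score']:.4f}")
```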
@@ -0,0 +1,28 @@
+---
+language: en
+license: mit
+base_model: roberta-base
+tags:
+- token-classification
+- ner
+- named-entity-recognition
+datasets:
+- conll2003
+metrics:
+- f1
+- precision
+- recall
+- accuracy
+model-index:
+- name: RoBERTa-base-NER-CoNLL2003
+  results:
+  - task:
+      type: token-classification
+      name: Named Entity Recognition
+    dataset:
+      type: conll2003
+      name: CoNLL-2003 (English)
+    metrics:
+    - type: f1
+      value: 95.99
+---