# Named_Entity_Recognition

### Custom Named Entity Recognition (NER) Model for Azerbaijani Language
- **Dataset**: [Azerbaijani NER Dataset](https://huggingface.co/datasets/LocalDoc/azerbaijani-ner-dataset)
- **mBERT Model**: [mBERT Azerbaijani NER](https://huggingface.co/IsmatS/mbert-az-ner)
- **XLM-RoBERTa Base Model**: [XLM-RoBERTa Azerbaijani NER](https://huggingface.co/IsmatS/xlm-roberta-az-ner)
- **XLM-RoBERTa Large Model**: [XLM-RoBERTa Large Azerbaijani NER](https://huggingface.co/IsmatS/xlm-roberta-large-az-ner)
All three models were fine-tuned on a premium A100 GPU in Google Colab for optimized training performance.

**Note**: Due to its superior performance, the XLM-RoBERTa Large model was selected for deployment.
## Model Performance Metrics

### mBERT Model

| Epoch | Training Loss | Validation Loss | Precision | Recall | F1 | Accuracy |
|-------|---------------|-----------------|-----------|--------|----------|----------|
| 2 | 0.248600 | 0.252083 | 0.721036 | 0.637979 | 0.676970 | 0.921439 |
| 3 | 0.206800 | 0.253372 | 0.704872 | 0.650684 | 0.676695 | 0.920898 |
### XLM-RoBERTa Base Model

| Epoch | Training Loss | Validation Loss | Precision | Recall | F1 |
|-------|---------------|-----------------|-----------|--------|----------|
| 6 | 0.218600 | 0.249887 | 0.756352 | 0.741646 | 0.748927 |
| 7 | 0.209700 | 0.250748 | 0.760696 | 0.739438 | 0.749916 |
### XLM-RoBERTa Large Model

| Epoch | Training Loss | Validation Loss | Precision | Recall | F1 |
|-------|---------------|-----------------|-----------|--------|----------|
| 1 | 0.407500 | 0.253823 | 0.768923 | 0.721350 | 0.744377 |
| 2 | 0.255600 | 0.249694 | 0.783549 | 0.724464 | 0.752849 |
| 3 | 0.214400 | 0.248773 | 0.750857 | 0.748900 | 0.749877 |
| 4 | 0.193400 | 0.257051 | 0.768623 | 0.740371 | 0.754232 |
| 5 | 0.169800 | 0.275679 | 0.745789 | 0.753740 | 0.749743 |
| 6 | 0.152600 | 0.288074 | 0.783131 | 0.728423 | 0.754787 |
| 7 | 0.144300 | 0.303378 | 0.758504 | 0.738069 | 0.748147 |
| 8 | 0.126800 | 0.311300 | 0.745589 | 0.750863 | 0.748217 |
| 9 | 0.119400 | 0.331631 | 0.739316 | 0.749475 | 0.744361 |
| 10 | 0.109400 | 0.344823 | 0.754268 | 0.737189 | 0.745631 |
| 11 | 0.102900 | 0.354887 | 0.751948 | 0.741285 | 0.746578 |
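As a quick sanity check on these tables, the F1 column is the harmonic mean of the precision and recall columns. For example, taking epoch 4 of the Large model run:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Epoch 4 of the XLM-RoBERTa Large run, from the table above
p, r = 0.768623, 0.740371
print(f1_score(p, r))  # close to the tabulated 0.754232
```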
### Detailed Metrics for XLM-RoBERTa Large Model

| Entity | Precision | Recall | F1-score | Support |
|--------------|-----------|--------|----------|---------|
| ART | 0.41 | 0.19 | 0.26 | 1828 |
| DATE | 0.53 | 0.49 | 0.51 | 834 |
| EVENT | 0.67 | 0.51 | 0.58 | 63 |
| FACILITY | 0.74 | 0.68 | 0.71 | 1134 |
| LAW | 0.62 | 0.58 | 0.60 | 1066 |
| LOCATION | 0.81 | 0.79 | 0.80 | 8795 |
| MONEY | 0.59 | 0.56 | 0.58 | 555 |
| ORGANISATION | 0.70 | 0.69 | 0.70 | 554 |
| PERCENTAGE | 0.80 | 0.82 | 0.81 | 3502 |
| PERSON | 0.90 | 0.82 | 0.86 | 7007 |
| PRODUCT | 0.83 | 0.84 | 0.84 | 2624 |
| TIME | 0.60 | 0.53 | 0.57 | 1584 |

---
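Because entity frequencies are highly imbalanced (LOCATION has 8795 support items, EVENT only 63), a single overall score is best computed as a support-weighted average of the per-entity F1 values. A small sketch using the rounded numbers from the table above:

```python
# (entity, f1, support) rows copied from the detailed-metrics table
rows = [
    ("ART", 0.26, 1828), ("DATE", 0.51, 834), ("EVENT", 0.58, 63),
    ("FACILITY", 0.71, 1134), ("LAW", 0.60, 1066), ("LOCATION", 0.80, 8795),
    ("MONEY", 0.58, 555), ("ORGANISATION", 0.70, 554),
    ("PERCENTAGE", 0.81, 3502), ("PERSON", 0.86, 7007),
    ("PRODUCT", 0.84, 2624), ("TIME", 0.57, 1584),
]

total_support = sum(s for _, _, s in rows)
# Each entity's F1 contributes in proportion to how often it occurs
weighted_f1 = sum(f * s for _, f, s in rows) / total_support
print(round(weighted_f1, 3))  # ≈ 0.748, in line with the epoch-level F1 above
```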
## Setup and Usage

1. **Clone the repository**:
2. **Create and activate a virtual environment**:
   ```bash
   python3 -m venv .venv
   source .venv/bin/activate
   # On Windows use: .venv\Scripts\activate
   ```
3. **Install dependencies**:
Access the web interface through the Fly.io URL or `http://localhost:8080` (if running locally) to test the NER model and view recognized entities.

This application leverages the XLM-RoBERTa Large model fine-tuned on Azerbaijani language data for high-accuracy named entity recognition.
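Under the hood, a token-classification model like this emits one BIO-style tag per token, which must be merged into entity spans before display. A minimal sketch of that grouping step, assuming standard `B-`/`I-` prefixes over the entity types listed above (the deployed app's actual post-processing may differ):

```python
def group_entities(tokens, tags):
    """Merge BIO-tagged tokens into (entity_type, text) spans."""
    spans, current_type, current_tokens = [], None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new span, closing any open one
            if current_type:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # "O" or an inconsistent I- tag closes the open span
            if current_type:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_type:
        spans.append((current_type, " ".join(current_tokens)))
    return spans

# Hypothetical tagged sentence: "Ilham Aliyev Bakıda çıxış etdi."
tokens = ["Ilham", "Aliyev", "Bakıda", "çıxış", "etdi", "."]
tags = ["B-PERSON", "I-PERSON", "B-LOCATION", "O", "O", "O"]
print(group_entities(tokens, tags))
# [('PERSON', 'Ilham Aliyev'), ('LOCATION', 'Bakıda')]
```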