IsmatS committed on
Commit
599147d
·
1 Parent(s): 7a008a0
Files changed (1)
  1. README.md +46 -4
README.md CHANGED
@@ -1,3 +1,7 @@
1
  # Named_Entity_Recognition
2
 
3
  ### Custom Named Entity Recognition (NER) Model for Azerbaijani Language
@@ -36,10 +40,11 @@ You can try out the deployed model here: [Named Entity Recognition Demo](https:/
36
  - **Dataset**: [Azerbaijani NER Dataset](https://huggingface.co/datasets/LocalDoc/azerbaijani-ner-dataset)
37
  - **mBERT Model**: [mBERT Azerbaijani NER](https://huggingface.co/IsmatS/mbert-az-ner)
38
  - **XLM-RoBERTa Model**: [XLM-RoBERTa Azerbaijani NER](https://huggingface.co/IsmatS/xlm-roberta-az-ner)
 
39
 
40
  Both models were fine-tuned on a premium A100 GPU in Google Colab for optimized training performance.
41
 
42
- **Note**: Due to its superior performance, the XLM-RoBERTa model was selected for deployment.
43
 
44
  ## Model Performance Metrics
45
 
@@ -51,7 +56,7 @@ Both models were fine-tuned on a premium A100 GPU in Google Colab for optimized
51
  | 2 | 0.248600 | 0.252083 | 0.721036 | 0.637979 | 0.676970 | 0.921439 |
52
  | 3 | 0.206800 | 0.253372 | 0.704872 | 0.650684 | 0.676695 | 0.920898 |
53
 
54
- ### XLM-RoBERTa Model
55
 
56
  | Epoch | Training Loss | Validation Loss | Precision | Recall | F1 |
57
  |-------|---------------|----------------|-----------|----------|----------|
@@ -63,6 +68,41 @@ Both models were fine-tuned on a premium A100 GPU in Google Colab for optimized
63
  | 6 | 0.218600 | 0.249887 | 0.756352 | 0.741646 | 0.748927 |
64
  | 7 | 0.209700 | 0.250748 | 0.760696 | 0.739438 | 0.749916 |
65
 
66
  ## Setup and Usage
67
 
68
  1. **Clone the repository**:
@@ -74,7 +114,9 @@ Both models were fine-tuned on a premium A100 GPU in Google Colab for optimized
74
  2. **Create and activate a virtual environment**:
75
  ```bash
76
  python3 -m venv .venv
77
- source .venv/bin/activate # On Windows use: .venv\Scripts\activate
 
 
78
  ```
79
 
80
  3. **Install dependencies**:
@@ -140,4 +182,4 @@ Access your deployed app at the Fly.io-provided URL (e.g., `https://your-app-nam
140
 
141
  Access the web interface through the Fly.io URL or `http://localhost:8080` (if running locally) to test the NER model and view recognized entities.
142
 
143
- This application leverages the XLM-RoBERTa model fine-tuned on Azerbaijani language data for high-accuracy named entity recognition.
 
1
+ Here’s the updated README with the additional **XLM-RoBERTa Large Model** metrics section added.
2
+
3
+ ---
4
+
5
  # Named_Entity_Recognition
6
 
7
  ### Custom Named Entity Recognition (NER) Model for Azerbaijani Language
 
40
  - **Dataset**: [Azerbaijani NER Dataset](https://huggingface.co/datasets/LocalDoc/azerbaijani-ner-dataset)
41
  - **mBERT Model**: [mBERT Azerbaijani NER](https://huggingface.co/IsmatS/mbert-az-ner)
42
  - **XLM-RoBERTa Model**: [XLM-RoBERTa Azerbaijani NER](https://huggingface.co/IsmatS/xlm-roberta-az-ner)
43
+ - **XLM-RoBERTa Large Model**: [XLM-RoBERTa Large Azerbaijani NER](https://huggingface.co/IsmatS/xlm-roberta-large-az-ner)
44
 
45
  Both models were fine-tuned on a premium A100 GPU in Google Colab for optimized training performance.
46
 
47
+ **Note**: Due to its superior performance, the XLM-RoBERTa Large model was selected for deployment.
48
 
49
  ## Model Performance Metrics
50
 
 
56
  | 2 | 0.248600 | 0.252083 | 0.721036 | 0.637979 | 0.676970 | 0.921439 |
57
  | 3 | 0.206800 | 0.253372 | 0.704872 | 0.650684 | 0.676695 | 0.920898 |
58
 
59
+ ### XLM-RoBERTa Base Model
60
 
61
  | Epoch | Training Loss | Validation Loss | Precision | Recall | F1 |
62
  |-------|---------------|----------------|-----------|----------|----------|
 
68
  | 6 | 0.218600 | 0.249887 | 0.756352 | 0.741646 | 0.748927 |
69
  | 7 | 0.209700 | 0.250748 | 0.760696 | 0.739438 | 0.749916 |
70
 
71
+ ### XLM-RoBERTa Large Model
72
+
73
+ | Epoch | Training Loss | Validation Loss | Precision | Recall | F1 |
74
+ |-------|---------------|----------------|-----------|----------|----------|
75
+ | 1 | 0.407500 | 0.253823 | 0.768923 | 0.721350 | 0.744377 |
76
+ | 2 | 0.255600 | 0.249694 | 0.783549 | 0.724464 | 0.752849 |
77
+ | 3 | 0.214400 | 0.248773 | 0.750857 | 0.748900 | 0.749877 |
78
+ | 4 | 0.193400 | 0.257051 | 0.768623 | 0.740371 | 0.754232 |
79
+ | 5 | 0.169800 | 0.275679 | 0.745789 | 0.753740 | 0.749743 |
80
+ | 6 | 0.152600 | 0.288074 | 0.783131 | 0.728423 | 0.754787 |
81
+ | 7 | 0.144300 | 0.303378 | 0.758504 | 0.738069 | 0.748147 |
82
+ | 8 | 0.126800 | 0.311300 | 0.745589 | 0.750863 | 0.748217 |
83
+ | 9 | 0.119400 | 0.331631 | 0.739316 | 0.749475 | 0.744361 |
84
+ | 10 | 0.109400 | 0.344823 | 0.754268 | 0.737189 | 0.745631 |
85
+ | 11 | 0.102900 | 0.354887 | 0.751948 | 0.741285 | 0.746578 |
86
+
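The F1 column in the XLM-RoBERTa Large table can be cross-checked against its precision and recall columns, since F1 is their harmonic mean. The sketch below copies the rows from that table and recomputes F1. The names `ROWS` and `f1_from_pr` are illustrative and not part of the repository, and picking an epoch by highest F1 or lowest validation loss is only one possible selection rule, not necessarily the one used for the deployed checkpoint.

```python
# Rows copied from the XLM-RoBERTa Large table:
# (epoch, validation loss, precision, recall, reported F1).
ROWS = [
    (1, 0.253823, 0.768923, 0.721350, 0.744377),
    (2, 0.249694, 0.783549, 0.724464, 0.752849),
    (3, 0.248773, 0.750857, 0.748900, 0.749877),
    (4, 0.257051, 0.768623, 0.740371, 0.754232),
    (5, 0.275679, 0.745789, 0.753740, 0.749743),
    (6, 0.288074, 0.783131, 0.728423, 0.754787),
    (7, 0.303378, 0.758504, 0.738069, 0.748147),
    (8, 0.311300, 0.745589, 0.750863, 0.748217),
    (9, 0.331631, 0.739316, 0.749475, 0.744361),
    (10, 0.344823, 0.754268, 0.737189, 0.745631),
    (11, 0.354887, 0.751948, 0.741285, 0.746578),
]


def f1_from_pr(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Largest gap between the recomputed and the reported F1 across all epochs.
max_abs_diff = max(abs(f1_from_pr(p, r) - f1) for _, _, p, r, f1 in ROWS)

# Two illustrative ways to pick an epoch from the table.
best_by_f1 = max(ROWS, key=lambda row: row[4])[0]
best_by_val_loss = min(ROWS, key=lambda row: row[1])[0]

print(f"max |recomputed F1 - reported F1| = {max_abs_diff:.6f}")
print(f"epoch with highest F1: {best_by_f1}")
print(f"epoch with lowest validation loss: {best_by_val_loss}")
```

Comparing the two selection rules is a quick way to see where validation loss starts rising while F1 stays roughly flat.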
87
+ ### Detailed Metrics for XLM-RoBERTa Large Model
88
+
89
+ | Entity | Precision | Recall | F1-score | Support |
90
+ |--------------|-----------|--------|----------|---------|
91
+ | ART | 0.41 | 0.19 | 0.26 | 1828 |
92
+ | DATE | 0.53 | 0.49 | 0.51 | 834 |
93
+ | EVENT | 0.67 | 0.51 | 0.58 | 63 |
94
+ | FACILITY | 0.74 | 0.68 | 0.71 | 1134 |
95
+ | LAW | 0.62 | 0.58 | 0.60 | 1066 |
96
+ | LOCATION | 0.81 | 0.79 | 0.80 | 8795 |
97
+ | MONEY | 0.59 | 0.56 | 0.58 | 555 |
98
+ | ORGANISATION | 0.70 | 0.69 | 0.70 | 554 |
99
+ | PERCENTAGE | 0.80 | 0.82 | 0.81 | 3502 |
100
+ | PERSON | 0.90 | 0.82 | 0.86 | 7007 |
101
+ | PRODUCT | 0.83 | 0.84 | 0.84 | 2624 |
102
+ | TIME | 0.60 | 0.53 | 0.57 | 1584 |
103
+
104
+ ---
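The per-entity table can be summarized with macro and support-weighted averages. The sketch below derives both from the rounded values in the table, so the results are approximations computed here and not figures reported by the README. The names `ENTITY_METRICS`, `macro_f1`, and `weighted_f1` are illustrative.

```python
# Values copied from the "Detailed Metrics for XLM-RoBERTa Large Model" table:
# entity -> (precision, recall, F1-score, support).
ENTITY_METRICS = {
    "ART": (0.41, 0.19, 0.26, 1828),
    "DATE": (0.53, 0.49, 0.51, 834),
    "EVENT": (0.67, 0.51, 0.58, 63),
    "FACILITY": (0.74, 0.68, 0.71, 1134),
    "LAW": (0.62, 0.58, 0.60, 1066),
    "LOCATION": (0.81, 0.79, 0.80, 8795),
    "MONEY": (0.59, 0.56, 0.58, 555),
    "ORGANISATION": (0.70, 0.69, 0.70, 554),
    "PERCENTAGE": (0.80, 0.82, 0.81, 3502),
    "PERSON": (0.90, 0.82, 0.86, 7007),
    "PRODUCT": (0.83, 0.84, 0.84, 2624),
    "TIME": (0.60, 0.53, 0.57, 1584),
}

total_support = sum(support for _, _, _, support in ENTITY_METRICS.values())

# Macro average: every entity type counts equally, regardless of frequency.
macro_f1 = sum(f1 for _, _, f1, _ in ENTITY_METRICS.values()) / len(ENTITY_METRICS)

# Weighted average: each entity type counts in proportion to its support.
weighted_f1 = (
    sum(f1 * support for _, _, f1, support in ENTITY_METRICS.values()) / total_support
)

# Entity type with the lowest F1-score in the table.
weakest_entity = min(ENTITY_METRICS, key=lambda name: ENTITY_METRICS[name][2])

print(f"total support: {total_support}")
print(f"macro F1 (from rounded values): {macro_f1:.4f}")
print(f"support-weighted F1 (from rounded values): {weighted_f1:.4f}")
print(f"lowest-F1 entity type: {weakest_entity}")
```

The gap between the macro and weighted averages reflects how much the frequent entity types (such as LOCATION and PERSON) dominate the overall score.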
105
+
106
  ## Setup and Usage
107
 
108
  1. **Clone the repository**:
 
114
  2. **Create and activate a virtual environment**:
115
  ```bash
116
  python3 -m venv .venv
117
+ source .venv/bin/activate
118
+
119
+ # On Windows use: .venv\Scripts\activate
120
  ```
121
 
122
  3. **Install dependencies**:
 
182
 
183
  Access the web interface through the Fly.io URL or `http://localhost:8080` (if running locally) to test the NER model and view recognized entities.
184
 
185
+ This application leverages the XLM-RoBERTa Large model fine-tuned on Azerbaijani language data for high-accuracy named entity recognition.
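For trying the model outside the web interface, the following is a minimal sketch that assumes the `transformers` library is installed and that the checkpoint linked above (`IsmatS/xlm-roberta-large-az-ner`) loads with the standard token-classification pipeline. It is not the repository's own application code. The helper names `load_ner_pipeline`, `format_entities`, and `main`, the `min_score` threshold, and the sample sentence are illustrative assumptions.

```python
def load_ner_pipeline(model_id: str = "IsmatS/xlm-roberta-large-az-ner"):
    """Build a token-classification pipeline (downloads the model on first use)."""
    # Imported lazily so the formatting helper below works without transformers.
    from transformers import pipeline

    # aggregation_strategy="simple" merges sub-word tokens into whole entity spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def format_entities(predictions, min_score: float = 0.5):
    """Turn aggregated pipeline output into (text, label, score) tuples.

    Predictions below min_score are dropped; the 0.5 default is an arbitrary
    illustrative threshold, not a value taken from the project.
    """
    return [
        (p["word"], p["entity_group"], round(float(p["score"]), 3))
        for p in predictions
        if p["score"] >= min_score
    ]


def main() -> None:
    ner = load_ner_pipeline()
    text = "Bakı Azərbaycanın paytaxtıdır."  # sample sentence, chosen for illustration
    for word, label, score in format_entities(ner(text)):
        print(f"{word}\t{label}\t{score}")
```

Calling `main()` runs the sample sentence through the model. The labels that come back depend on the checkpoint's own label set, so they should be checked against the model card.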