---
library_name: transformers
base_model: aubmindlab/bert-base-arabertv02
tags:
- generated_from_trainer
model-index:
- name: Arabic_CrossPrompt_FineTuningAraBERT_noAug_TestTask8_organization
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# Arabic_CrossPrompt_FineTuningAraBERT_noAug_TestTask8_organization

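This model is a fine-tuned version of [aubmindlab/bert-base-arabertv02](https://huggingface.co/aubmindlab/bert-base-arabertv02). The training dataset was not recorded in the Trainer metadata.
It achieves the following results on the evaluation set at the final logged step (step 548, epoch 5.3725):
- Loss: 0.6637
- Qwk (quadratic weighted kappa): 0.6356
- Mse (mean squared error): 0.6637
- Rmse (root mean squared error): 0.8147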
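
The card does not include the evaluation code, so the following is only a minimal sketch of how the three reported metrics are commonly computed. The function names, the rating range arguments, and the rounding step in `to_label` are assumptions made for illustration; the card does not say how continuous predictions were mapped to integer scores.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error, i.e. the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))

def to_label(score, min_rating, max_rating):
    """Assumed discretization: round a continuous score and clip it to the scale."""
    return min(max_rating, max(min_rating, int(round(score))))

def quadratic_weighted_kappa(y_true, y_pred, min_rating, max_rating):
    """QWK for integer ratings on the scale [min_rating, max_rating].

    Undefined (division by zero) when the expected disagreement is zero,
    for example when both raters use a single category only.
    """
    n_cat = max_rating - min_rating + 1
    observed = [[0] * n_cat for _ in range(n_cat)]
    for t, p in zip(y_true, y_pred):
        observed[t - min_rating][p - min_rating] += 1
    n = len(y_true)
    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(n_cat)) for j in range(n_cat)]
    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(n_cat):
        for j in range(n_cat):
            weight = (i - j) ** 2 / (n_cat - 1) ** 2
            num += weight * observed[i][j]
            den += weight * hist_true[i] * hist_pred[j] / n
    return 1.0 - num / den
```

Two details in the reported numbers fit this reading: the loss equals the MSE, which is consistent with a mean-squared-error training objective, and the RMSE of 0.8147 is the square root of 0.6637 up to rounding.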

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
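
To make the `linear` scheduler setting concrete, here is a small sketch of the learning rate it would produce over the configured run. It assumes zero warmup steps (the card lists no warmup setting) and takes 102 optimizer steps per epoch from the results table, where step 102 corresponds to epoch 1.0. The function name `linear_lr` is illustrative, not part of any library.

```python
BASE_LR = 2e-05        # learning_rate from the card
NUM_EPOCHS = 100       # num_epochs from the card
STEPS_PER_EPOCH = 102  # read off the results table: step 102 is epoch 1.0
WARMUP_STEPS = 0       # assumption: no warmup is listed in the card

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH

def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

if __name__ == "__main__":
    # Learning rate at the start, at the last logged step, and at the end of the schedule.
    for step in (0, 548, TOTAL_STEPS):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate has decayed by only about 5% at step 548, the last step in the results table, because the schedule is stretched over all 100 configured epochs.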

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Qwk    | Mse    | Rmse   |
|:-------------:|:------:|:----:|:---------------:|:------:|:------:|:------:|
| No log        | 0.0196 | 2    | 4.2952          | 0.0102 | 4.2952 | 2.0725 |
| No log        | 0.0392 | 4    | 3.2266          | 0.0799 | 3.2266 | 1.7963 |
| No log        | 0.0588 | 6    | 1.8412          | 0.1386 | 1.8412 | 1.3569 |
| No log        | 0.0784 | 8    | 1.0988          | 0.2278 | 1.0988 | 1.0482 |
| No log        | 0.0980 | 10   | 1.0968          | 0.2094 | 1.0968 | 1.0473 |
| No log        | 0.1176 | 12   | 1.1545          | 0.0904 | 1.1545 | 1.0745 |
| No log        | 0.1373 | 14   | 1.1518          | 0.2008 | 1.1518 | 1.0732 |
| No log        | 0.1569 | 16   | 1.2856          | 0.1234 | 1.2856 | 1.1338 |
| No log        | 0.1765 | 18   | 1.0710          | 0.2542 | 1.0710 | 1.0349 |
| No log        | 0.1961 | 20   | 0.9405          | 0.4251 | 0.9405 | 0.9698 |
| No log        | 0.2157 | 22   | 0.9575          | 0.3779 | 0.9575 | 0.9785 |
| No log        | 0.2353 | 24   | 1.0108          | 0.3764 | 1.0108 | 1.0054 |
| No log        | 0.2549 | 26   | 1.0748          | 0.3427 | 1.0748 | 1.0367 |
| No log        | 0.2745 | 28   | 1.0006          | 0.4063 | 1.0006 | 1.0003 |
| No log        | 0.2941 | 30   | 0.8637          | 0.5009 | 0.8637 | 0.9293 |
| No log        | 0.3137 | 32   | 0.8053          | 0.4698 | 0.8053 | 0.8974 |
| No log        | 0.3333 | 34   | 0.8135          | 0.4257 | 0.8135 | 0.9019 |
| No log        | 0.3529 | 36   | 0.8487          | 0.4325 | 0.8487 | 0.9213 |
| No log        | 0.3725 | 38   | 0.8912          | 0.4374 | 0.8912 | 0.9440 |
| No log        | 0.3922 | 40   | 0.8144          | 0.4875 | 0.8144 | 0.9024 |
| No log        | 0.4118 | 42   | 0.7211          | 0.5815 | 0.7211 | 0.8492 |
| No log        | 0.4314 | 44   | 0.7169          | 0.5865 | 0.7169 | 0.8467 |
| No log        | 0.4510 | 46   | 0.7229          | 0.5194 | 0.7229 | 0.8502 |
| No log        | 0.4706 | 48   | 0.7373          | 0.5317 | 0.7373 | 0.8586 |
| No log        | 0.4902 | 50   | 0.7474          | 0.5413 | 0.7474 | 0.8645 |
| No log        | 0.5098 | 52   | 0.7460          | 0.5308 | 0.7460 | 0.8637 |
| No log        | 0.5294 | 54   | 0.7572          | 0.5087 | 0.7572 | 0.8702 |
| No log        | 0.5490 | 56   | 0.7854          | 0.4931 | 0.7854 | 0.8863 |
| No log        | 0.5686 | 58   | 0.8381          | 0.4277 | 0.8381 | 0.9155 |
| No log        | 0.5882 | 60   | 0.9500          | 0.3694 | 0.9500 | 0.9747 |
| No log        | 0.6078 | 62   | 0.9132          | 0.4035 | 0.9132 | 0.9556 |
| No log        | 0.6275 | 64   | 0.8686          | 0.4811 | 0.8686 | 0.9320 |
| No log        | 0.6471 | 66   | 0.7388          | 0.5726 | 0.7388 | 0.8595 |
| No log        | 0.6667 | 68   | 0.7300          | 0.5573 | 0.7300 | 0.8544 |
| No log        | 0.6863 | 70   | 0.7859          | 0.5380 | 0.7859 | 0.8865 |
| No log        | 0.7059 | 72   | 0.8167          | 0.5099 | 0.8167 | 0.9037 |
| No log        | 0.7255 | 74   | 0.9063          | 0.5092 | 0.9063 | 0.9520 |
| No log        | 0.7451 | 76   | 0.8688          | 0.4810 | 0.8688 | 0.9321 |
| No log        | 0.7647 | 78   | 0.8280          | 0.5209 | 0.8280 | 0.9100 |
| No log        | 0.7843 | 80   | 0.7573          | 0.5683 | 0.7573 | 0.8702 |
| No log        | 0.8039 | 82   | 0.7654          | 0.5697 | 0.7654 | 0.8749 |
| No log        | 0.8235 | 84   | 0.7306          | 0.5834 | 0.7306 | 0.8547 |
| No log        | 0.8431 | 86   | 0.7655          | 0.5556 | 0.7655 | 0.8749 |
| No log        | 0.8627 | 88   | 1.0433          | 0.4724 | 1.0433 | 1.0214 |
| No log        | 0.8824 | 90   | 1.1994          | 0.4390 | 1.1994 | 1.0952 |
| No log        | 0.9020 | 92   | 0.9817          | 0.5090 | 0.9817 | 0.9908 |
| No log        | 0.9216 | 94   | 0.6939          | 0.5289 | 0.6939 | 0.8330 |
| No log        | 0.9412 | 96   | 0.6782          | 0.5800 | 0.6782 | 0.8235 |
| No log        | 0.9608 | 98   | 0.6669          | 0.5468 | 0.6669 | 0.8166 |
| No log        | 0.9804 | 100  | 0.6804          | 0.5536 | 0.6804 | 0.8249 |
| No log        | 1.0    | 102  | 0.6890          | 0.5708 | 0.6890 | 0.8301 |
| No log        | 1.0196 | 104  | 0.7477          | 0.5812 | 0.7477 | 0.8647 |
| No log        | 1.0392 | 106  | 0.7234          | 0.6064 | 0.7234 | 0.8505 |
| No log        | 1.0588 | 108  | 0.6907          | 0.5935 | 0.6907 | 0.8311 |
| No log        | 1.0784 | 110  | 0.7004          | 0.6393 | 0.7004 | 0.8369 |
| No log        | 1.0980 | 112  | 0.6759          | 0.6637 | 0.6759 | 0.8221 |
| No log        | 1.1176 | 114  | 0.6917          | 0.6437 | 0.6917 | 0.8317 |
| No log        | 1.1373 | 116  | 0.6841          | 0.6441 | 0.6841 | 0.8271 |
| No log        | 1.1569 | 118  | 0.7122          | 0.5814 | 0.7122 | 0.8439 |
| No log        | 1.1765 | 120  | 0.8183          | 0.4949 | 0.8183 | 0.9046 |
| No log        | 1.1961 | 122  | 0.8523          | 0.4799 | 0.8523 | 0.9232 |
| No log        | 1.2157 | 124  | 0.8479          | 0.5240 | 0.8479 | 0.9208 |
| No log        | 1.2353 | 126  | 0.8277          | 0.5782 | 0.8277 | 0.9098 |
| No log        | 1.2549 | 128  | 0.8707          | 0.6108 | 0.8707 | 0.9331 |
| No log        | 1.2745 | 130  | 0.7377          | 0.6903 | 0.7377 | 0.8589 |
| No log        | 1.2941 | 132  | 0.6445          | 0.6575 | 0.6445 | 0.8028 |
| No log        | 1.3137 | 134  | 0.6641          | 0.6654 | 0.6641 | 0.8149 |
| No log        | 1.3333 | 136  | 0.7284          | 0.6182 | 0.7284 | 0.8534 |
| No log        | 1.3529 | 138  | 0.7679          | 0.6292 | 0.7679 | 0.8763 |
| No log        | 1.3725 | 140  | 0.6204          | 0.6527 | 0.6204 | 0.7877 |
| No log        | 1.3922 | 142  | 0.6521          | 0.6035 | 0.6521 | 0.8075 |
| No log        | 1.4118 | 144  | 0.6780          | 0.5941 | 0.6780 | 0.8234 |
| No log        | 1.4314 | 146  | 0.6080          | 0.6319 | 0.6080 | 0.7797 |
| No log        | 1.4510 | 148  | 0.6093          | 0.6444 | 0.6093 | 0.7806 |
| No log        | 1.4706 | 150  | 0.7406          | 0.6425 | 0.7406 | 0.8606 |
| No log        | 1.4902 | 152  | 0.7195          | 0.6664 | 0.7195 | 0.8482 |
| No log        | 1.5098 | 154  | 0.7023          | 0.6695 | 0.7023 | 0.8381 |
| No log        | 1.5294 | 156  | 0.6536          | 0.6639 | 0.6536 | 0.8085 |
| No log        | 1.5490 | 158  | 0.7030          | 0.6854 | 0.7030 | 0.8384 |
| No log        | 1.5686 | 160  | 0.7469          | 0.6724 | 0.7469 | 0.8642 |
| No log        | 1.5882 | 162  | 0.6844          | 0.6661 | 0.6844 | 0.8273 |
| No log        | 1.6078 | 164  | 0.6306          | 0.6142 | 0.6306 | 0.7941 |
| No log        | 1.6275 | 166  | 0.6222          | 0.6111 | 0.6222 | 0.7888 |
| No log        | 1.6471 | 168  | 0.7040          | 0.6631 | 0.7040 | 0.8391 |
| No log        | 1.6667 | 170  | 0.8624          | 0.6188 | 0.8624 | 0.9286 |
| No log        | 1.6863 | 172  | 0.9003          | 0.5841 | 0.9003 | 0.9489 |
| No log        | 1.7059 | 174  | 0.7114          | 0.6047 | 0.7114 | 0.8434 |
| No log        | 1.7255 | 176  | 0.6270          | 0.6387 | 0.6270 | 0.7918 |
| No log        | 1.7451 | 178  | 0.7834          | 0.6244 | 0.7834 | 0.8851 |
| No log        | 1.7647 | 180  | 0.7746          | 0.6135 | 0.7746 | 0.8801 |
| No log        | 1.7843 | 182  | 0.6805          | 0.5776 | 0.6805 | 0.8249 |
| No log        | 1.8039 | 184  | 0.6996          | 0.4999 | 0.6996 | 0.8364 |
| No log        | 1.8235 | 186  | 0.8609          | 0.5141 | 0.8609 | 0.9279 |
| No log        | 1.8431 | 188  | 0.9514          | 0.4900 | 0.9514 | 0.9754 |
| No log        | 1.8627 | 190  | 0.9757          | 0.5319 | 0.9757 | 0.9878 |
| No log        | 1.8824 | 192  | 0.8713          | 0.5884 | 0.8713 | 0.9335 |
| No log        | 1.9020 | 194  | 0.6578          | 0.6685 | 0.6578 | 0.8110 |
| No log        | 1.9216 | 196  | 0.6701          | 0.6870 | 0.6701 | 0.8186 |
| No log        | 1.9412 | 198  | 0.8171          | 0.6909 | 0.8171 | 0.9040 |
| No log        | 1.9608 | 200  | 0.9065          | 0.6280 | 0.9065 | 0.9521 |
| No log        | 1.9804 | 202  | 0.7902          | 0.6462 | 0.7902 | 0.8889 |
| No log        | 2.0    | 204  | 0.6671          | 0.6583 | 0.6671 | 0.8167 |
| No log        | 2.0196 | 206  | 0.6646          | 0.6442 | 0.6646 | 0.8152 |
| No log        | 2.0392 | 208  | 0.7101          | 0.6132 | 0.7101 | 0.8426 |
| No log        | 2.0588 | 210  | 0.7350          | 0.6091 | 0.7350 | 0.8573 |
| No log        | 2.0784 | 212  | 0.7174          | 0.6268 | 0.7174 | 0.8470 |
| No log        | 2.0980 | 214  | 0.7046          | 0.6320 | 0.7046 | 0.8394 |
| No log        | 2.1176 | 216  | 0.6900          | 0.6289 | 0.6900 | 0.8307 |
| No log        | 2.1373 | 218  | 0.6344          | 0.6392 | 0.6344 | 0.7965 |
| No log        | 2.1569 | 220  | 0.6480          | 0.6577 | 0.6480 | 0.8050 |
| No log        | 2.1765 | 222  | 0.7225          | 0.6413 | 0.7225 | 0.8500 |
| No log        | 2.1961 | 224  | 0.6654          | 0.6659 | 0.6654 | 0.8157 |
| No log        | 2.2157 | 226  | 0.5786          | 0.6383 | 0.5786 | 0.7606 |
| No log        | 2.2353 | 228  | 0.5798          | 0.6740 | 0.5798 | 0.7615 |
| No log        | 2.2549 | 230  | 0.5872          | 0.6672 | 0.5872 | 0.7663 |
| No log        | 2.2745 | 232  | 0.6322          | 0.6909 | 0.6322 | 0.7951 |
| No log        | 2.2941 | 234  | 0.7586          | 0.6493 | 0.7586 | 0.8710 |
| No log        | 2.3137 | 236  | 0.7852          | 0.6383 | 0.7852 | 0.8861 |
| No log        | 2.3333 | 238  | 0.7052          | 0.6798 | 0.7052 | 0.8398 |
| No log        | 2.3529 | 240  | 0.6188          | 0.6954 | 0.6188 | 0.7866 |
| No log        | 2.3725 | 242  | 0.6620          | 0.6889 | 0.6620 | 0.8136 |
| No log        | 2.3922 | 244  | 0.6580          | 0.6528 | 0.6580 | 0.8112 |
| No log        | 2.4118 | 246  | 0.6573          | 0.6404 | 0.6573 | 0.8107 |
| No log        | 2.4314 | 248  | 0.8856          | 0.5867 | 0.8856 | 0.9411 |
| No log        | 2.4510 | 250  | 1.1315          | 0.5154 | 1.1315 | 1.0637 |
| No log        | 2.4706 | 252  | 1.0135          | 0.5270 | 1.0135 | 1.0067 |
| No log        | 2.4902 | 254  | 0.7767          | 0.5464 | 0.7767 | 0.8813 |
| No log        | 2.5098 | 256  | 0.6975          | 0.5738 | 0.6975 | 0.8352 |
| No log        | 2.5294 | 258  | 0.7170          | 0.5700 | 0.7170 | 0.8468 |
| No log        | 2.5490 | 260  | 0.8040          | 0.5864 | 0.8040 | 0.8966 |
| No log        | 2.5686 | 262  | 0.7430          | 0.6235 | 0.7430 | 0.8620 |
| No log        | 2.5882 | 264  | 0.6681          | 0.6426 | 0.6681 | 0.8174 |
| No log        | 2.6078 | 266  | 0.6818          | 0.6616 | 0.6818 | 0.8257 |
| No log        | 2.6275 | 268  | 0.7374          | 0.6819 | 0.7374 | 0.8587 |
| No log        | 2.6471 | 270  | 0.6832          | 0.6952 | 0.6832 | 0.8266 |
| No log        | 2.6667 | 272  | 0.6918          | 0.6998 | 0.6918 | 0.8318 |
| No log        | 2.6863 | 274  | 0.9110          | 0.6552 | 0.9110 | 0.9544 |
| No log        | 2.7059 | 276  | 1.1775          | 0.6085 | 1.1775 | 1.0851 |
| No log        | 2.7255 | 278  | 1.0729          | 0.6115 | 1.0729 | 1.0358 |
| No log        | 2.7451 | 280  | 0.8355          | 0.6284 | 0.8355 | 0.9140 |
| No log        | 2.7647 | 282  | 0.6621          | 0.6563 | 0.6621 | 0.8137 |
| No log        | 2.7843 | 284  | 0.6377          | 0.6714 | 0.6377 | 0.7985 |
| No log        | 2.8039 | 286  | 0.6439          | 0.6876 | 0.6439 | 0.8024 |
| No log        | 2.8235 | 288  | 0.6961          | 0.6412 | 0.6961 | 0.8343 |
| No log        | 2.8431 | 290  | 0.9360          | 0.5822 | 0.9360 | 0.9675 |
| No log        | 2.8627 | 292  | 1.0942          | 0.5635 | 1.0942 | 1.0461 |
| No log        | 2.8824 | 294  | 0.9331          | 0.5735 | 0.9331 | 0.9660 |
| No log        | 2.9020 | 296  | 0.7332          | 0.6367 | 0.7332 | 0.8562 |
| No log        | 2.9216 | 298  | 0.6417          | 0.6417 | 0.6417 | 0.8010 |
| No log        | 2.9412 | 300  | 0.6453          | 0.6413 | 0.6453 | 0.8033 |
| No log        | 2.9608 | 302  | 0.7120          | 0.6062 | 0.7120 | 0.8438 |
| No log        | 2.9804 | 304  | 0.6610          | 0.6344 | 0.6610 | 0.8130 |
| No log        | 3.0    | 306  | 0.6399          | 0.6456 | 0.6399 | 0.7999 |
| No log        | 3.0196 | 308  | 0.8032          | 0.6526 | 0.8032 | 0.8962 |
| No log        | 3.0392 | 310  | 0.8972          | 0.6433 | 0.8972 | 0.9472 |
| No log        | 3.0588 | 312  | 0.9137          | 0.6394 | 0.9137 | 0.9559 |
| No log        | 3.0784 | 314  | 0.7396          | 0.6746 | 0.7396 | 0.8600 |
| No log        | 3.0980 | 316  | 0.6958          | 0.6370 | 0.6958 | 0.8341 |
| No log        | 3.1176 | 318  | 0.7007          | 0.6299 | 0.7007 | 0.8371 |
| No log        | 3.1373 | 320  | 0.6408          | 0.5930 | 0.6408 | 0.8005 |
| No log        | 3.1569 | 322  | 0.6358          | 0.5737 | 0.6358 | 0.7974 |
| No log        | 3.1765 | 324  | 0.6347          | 0.5766 | 0.6347 | 0.7967 |
| No log        | 3.1961 | 326  | 0.6488          | 0.6196 | 0.6488 | 0.8055 |
| No log        | 3.2157 | 328  | 0.7554          | 0.6606 | 0.7554 | 0.8691 |
| No log        | 3.2353 | 330  | 0.7893          | 0.6370 | 0.7893 | 0.8884 |
| No log        | 3.2549 | 332  | 0.8156          | 0.6470 | 0.8156 | 0.9031 |
| No log        | 3.2745 | 334  | 0.6970          | 0.6564 | 0.6970 | 0.8349 |
| No log        | 3.2941 | 336  | 0.6706          | 0.6599 | 0.6706 | 0.8189 |
| No log        | 3.3137 | 338  | 0.6852          | 0.6362 | 0.6852 | 0.8278 |
| No log        | 3.3333 | 340  | 0.6440          | 0.6605 | 0.6440 | 0.8025 |
| No log        | 3.3529 | 342  | 0.7842          | 0.6466 | 0.7842 | 0.8855 |
| No log        | 3.3725 | 344  | 1.2209          | 0.5235 | 1.2209 | 1.1050 |
| No log        | 3.3922 | 346  | 1.3001          | 0.4906 | 1.3001 | 1.1402 |
| No log        | 3.4118 | 348  | 0.9946          | 0.5760 | 0.9946 | 0.9973 |
| No log        | 3.4314 | 350  | 0.6702          | 0.6678 | 0.6702 | 0.8186 |
| No log        | 3.4510 | 352  | 0.6631          | 0.6439 | 0.6631 | 0.8143 |
| No log        | 3.4706 | 354  | 0.6758          | 0.6311 | 0.6758 | 0.8220 |
| No log        | 3.4902 | 356  | 0.6604          | 0.6891 | 0.6604 | 0.8126 |
| No log        | 3.5098 | 358  | 0.9361          | 0.6100 | 0.9361 | 0.9675 |
| No log        | 3.5294 | 360  | 1.2398          | 0.5575 | 1.2398 | 1.1135 |
| No log        | 3.5490 | 362  | 1.1421          | 0.5672 | 1.1421 | 1.0687 |
| No log        | 3.5686 | 364  | 0.8577          | 0.6494 | 0.8577 | 0.9261 |
| No log        | 3.5882 | 366  | 0.7330          | 0.6684 | 0.7330 | 0.8561 |
| No log        | 3.6078 | 368  | 0.7924          | 0.6438 | 0.7924 | 0.8902 |
| No log        | 3.6275 | 370  | 0.7405          | 0.6515 | 0.7405 | 0.8605 |
| No log        | 3.6471 | 372  | 0.6868          | 0.6605 | 0.6868 | 0.8288 |
| No log        | 3.6667 | 374  | 0.7430          | 0.6562 | 0.7430 | 0.8619 |
| No log        | 3.6863 | 376  | 0.9368          | 0.5893 | 0.9368 | 0.9679 |
| No log        | 3.7059 | 378  | 1.0358          | 0.5665 | 1.0358 | 1.0177 |
| No log        | 3.7255 | 380  | 1.0289          | 0.5772 | 1.0289 | 1.0144 |
| No log        | 3.7451 | 382  | 0.8941          | 0.6492 | 0.8941 | 0.9456 |
| No log        | 3.7647 | 384  | 0.7306          | 0.6657 | 0.7306 | 0.8547 |
| No log        | 3.7843 | 386  | 0.7037          | 0.6764 | 0.7037 | 0.8389 |
| No log        | 3.8039 | 388  | 0.7334          | 0.6577 | 0.7334 | 0.8564 |
| No log        | 3.8235 | 390  | 0.7398          | 0.6592 | 0.7398 | 0.8601 |
| No log        | 3.8431 | 392  | 0.7918          | 0.6615 | 0.7918 | 0.8898 |
| No log        | 3.8627 | 394  | 0.8550          | 0.6641 | 0.8550 | 0.9247 |
| No log        | 3.8824 | 396  | 0.9668          | 0.6570 | 0.9668 | 0.9832 |
| No log        | 3.9020 | 398  | 0.7953          | 0.6655 | 0.7953 | 0.8918 |
| No log        | 3.9216 | 400  | 0.6472          | 0.6941 | 0.6472 | 0.8045 |
| No log        | 3.9412 | 402  | 0.6327          | 0.6735 | 0.6327 | 0.7954 |
| No log        | 3.9608 | 404  | 0.6214          | 0.6756 | 0.6214 | 0.7883 |
| No log        | 3.9804 | 406  | 0.7604          | 0.6664 | 0.7604 | 0.8720 |
| No log        | 4.0    | 408  | 0.9570          | 0.5896 | 0.9570 | 0.9783 |
| No log        | 4.0196 | 410  | 0.8949          | 0.6154 | 0.8949 | 0.9460 |
| No log        | 4.0392 | 412  | 0.8523          | 0.6359 | 0.8523 | 0.9232 |
| No log        | 4.0588 | 414  | 0.6731          | 0.6932 | 0.6731 | 0.8204 |
| No log        | 4.0784 | 416  | 0.6122          | 0.7033 | 0.6122 | 0.7824 |
| No log        | 4.0980 | 418  | 0.6572          | 0.6972 | 0.6572 | 0.8107 |
| No log        | 4.1176 | 420  | 0.7898          | 0.6984 | 0.7898 | 0.8887 |
| No log        | 4.1373 | 422  | 1.2016          | 0.6200 | 1.2016 | 1.0962 |
| No log        | 4.1569 | 424  | 1.2801          | 0.6165 | 1.2801 | 1.1314 |
| No log        | 4.1765 | 426  | 1.0264          | 0.6515 | 1.0264 | 1.0131 |
| No log        | 4.1961 | 428  | 0.7693          | 0.6649 | 0.7693 | 0.8771 |
| No log        | 4.2157 | 430  | 0.6625          | 0.6731 | 0.6625 | 0.8139 |
| No log        | 4.2353 | 432  | 0.6647          | 0.6333 | 0.6647 | 0.8153 |
| No log        | 4.2549 | 434  | 0.7027          | 0.6205 | 0.7027 | 0.8383 |
| No log        | 4.2745 | 436  | 0.8217          | 0.5862 | 0.8217 | 0.9065 |
| No log        | 4.2941 | 438  | 0.8034          | 0.6025 | 0.8034 | 0.8963 |
| No log        | 4.3137 | 440  | 0.7041          | 0.6427 | 0.7041 | 0.8391 |
| No log        | 4.3333 | 442  | 0.6447          | 0.6633 | 0.6447 | 0.8029 |
| No log        | 4.3529 | 444  | 0.6396          | 0.6758 | 0.6396 | 0.7997 |
| No log        | 4.3725 | 446  | 0.6819          | 0.6481 | 0.6819 | 0.8258 |
| No log        | 4.3922 | 448  | 0.7204          | 0.6694 | 0.7204 | 0.8488 |
| No log        | 4.4118 | 450  | 0.7344          | 0.6998 | 0.7344 | 0.8570 |
| No log        | 4.4314 | 452  | 0.7389          | 0.7045 | 0.7389 | 0.8596 |
| No log        | 4.4510 | 454  | 0.7830          | 0.6996 | 0.7830 | 0.8849 |
| No log        | 4.4706 | 456  | 0.7596          | 0.7124 | 0.7596 | 0.8715 |
| No log        | 4.4902 | 458  | 0.7509          | 0.7168 | 0.7509 | 0.8665 |
| No log        | 4.5098 | 460  | 0.7286          | 0.6891 | 0.7286 | 0.8536 |
| No log        | 4.5294 | 462  | 0.7226          | 0.6656 | 0.7226 | 0.8501 |
| No log        | 4.5490 | 464  | 0.8479          | 0.6008 | 0.8479 | 0.9208 |
| No log        | 4.5686 | 466  | 0.9028          | 0.5806 | 0.9028 | 0.9501 |
| No log        | 4.5882 | 468  | 0.7854          | 0.6231 | 0.7854 | 0.8862 |
| No log        | 4.6078 | 470  | 0.6817          | 0.6583 | 0.6817 | 0.8256 |
| No log        | 4.6275 | 472  | 0.6759          | 0.6559 | 0.6759 | 0.8222 |
| No log        | 4.6471 | 474  | 0.6931          | 0.6797 | 0.6931 | 0.8325 |
| No log        | 4.6667 | 476  | 0.8320          | 0.6271 | 0.8320 | 0.9121 |
| No log        | 4.6863 | 478  | 0.9053          | 0.6113 | 0.9053 | 0.9515 |
| No log        | 4.7059 | 480  | 0.9333          | 0.6096 | 0.9333 | 0.9661 |
| No log        | 4.7255 | 482  | 0.8155          | 0.6333 | 0.8155 | 0.9031 |
| No log        | 4.7451 | 484  | 0.7290          | 0.6645 | 0.7290 | 0.8538 |
| No log        | 4.7647 | 486  | 0.6305          | 0.6741 | 0.6305 | 0.7941 |
| No log        | 4.7843 | 488  | 0.5985          | 0.6800 | 0.5985 | 0.7736 |
| No log        | 4.8039 | 490  | 0.6149          | 0.6623 | 0.6149 | 0.7841 |
| No log        | 4.8235 | 492  | 0.6850          | 0.6384 | 0.6850 | 0.8277 |
| No log        | 4.8431 | 494  | 0.7936          | 0.6211 | 0.7936 | 0.8908 |
| No log        | 4.8627 | 496  | 0.8023          | 0.6104 | 0.8023 | 0.8957 |
| No log        | 4.8824 | 498  | 0.7001          | 0.6004 | 0.7001 | 0.8367 |
| 0.5778        | 4.9020 | 500  | 0.6068          | 0.6467 | 0.6068 | 0.7790 |
| 0.5778        | 4.9216 | 502  | 0.6518          | 0.6368 | 0.6518 | 0.8073 |
| 0.5778        | 4.9412 | 504  | 0.6472          | 0.6654 | 0.6472 | 0.8045 |
| 0.5778        | 4.9608 | 506  | 0.6070          | 0.6702 | 0.6070 | 0.7791 |
| 0.5778        | 4.9804 | 508  | 0.7523          | 0.6564 | 0.7523 | 0.8674 |
| 0.5778        | 5.0    | 510  | 0.8719          | 0.6426 | 0.8719 | 0.9337 |
| 0.5778        | 5.0196 | 512  | 0.8401          | 0.6582 | 0.8401 | 0.9166 |
| 0.5778        | 5.0392 | 514  | 0.6989          | 0.7079 | 0.6989 | 0.8360 |
| 0.5778        | 5.0588 | 516  | 0.6384          | 0.6734 | 0.6384 | 0.7990 |
| 0.5778        | 5.0784 | 518  | 0.6617          | 0.6538 | 0.6617 | 0.8134 |
| 0.5778        | 5.0980 | 520  | 0.6237          | 0.6772 | 0.6237 | 0.7897 |
| 0.5778        | 5.1176 | 522  | 0.6969          | 0.6928 | 0.6969 | 0.8348 |
| 0.5778        | 5.1373 | 524  | 0.8394          | 0.6478 | 0.8394 | 0.9162 |
| 0.5778        | 5.1569 | 526  | 0.8070          | 0.6241 | 0.8070 | 0.8983 |
| 0.5778        | 5.1765 | 528  | 0.7122          | 0.6371 | 0.7122 | 0.8439 |
| 0.5778        | 5.1961 | 530  | 0.6388          | 0.6672 | 0.6388 | 0.7992 |
| 0.5778        | 5.2157 | 532  | 0.6036          | 0.6680 | 0.6036 | 0.7769 |
| 0.5778        | 5.2353 | 534  | 0.6094          | 0.6504 | 0.6094 | 0.7807 |
| 0.5778        | 5.2549 | 536  | 0.6240          | 0.6934 | 0.6240 | 0.7900 |
| 0.5778        | 5.2745 | 538  | 0.7745          | 0.6495 | 0.7745 | 0.8800 |
| 0.5778        | 5.2941 | 540  | 0.9460          | 0.5997 | 0.9460 | 0.9726 |
| 0.5778        | 5.3137 | 542  | 1.0257          | 0.5844 | 1.0257 | 1.0128 |
| 0.5778        | 5.3333 | 544  | 0.8669          | 0.6026 | 0.8669 | 0.9311 |
| 0.5778        | 5.3529 | 546  | 0.6861          | 0.6424 | 0.6861 | 0.8283 |
| 0.5778        | 5.3725 | 548  | 0.6637          | 0.6356 | 0.6637 | 0.8147 |
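
The table can be read programmatically, for instance to compare checkpoints. The sketch below parses four rows copied from the table above and picks the rows with the highest Qwk and the lowest validation loss among that excerpt; the helper name `parse_rows` and the dictionary keys are illustrative choices. The `No log` entries before step 500 are consistent with the Trainer's default logging interval of 500 steps, though the card does not state the logging configuration.

```python
ROWS = """
| No log        | 2.2157 | 226  | 0.5786          | 0.6383 | 0.5786 | 0.7606 |
| No log        | 4.4902 | 458  | 0.7509          | 0.7168 | 0.7509 | 0.8665 |
| 0.5778        | 4.9020 | 500  | 0.6068          | 0.6467 | 0.6068 | 0.7790 |
| 0.5778        | 5.3725 | 548  | 0.6637          | 0.6356 | 0.6637 | 0.8147 |
"""

def parse_rows(text):
    """Turn Markdown table rows into dictionaries with numeric fields."""
    records = []
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        train_loss, epoch, step, val_loss, qwk, mse, rmse = cells
        records.append({
            # "No log" means no training loss had been logged yet.
            "train_loss": None if train_loss == "No log" else float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
            "qwk": float(qwk),
            "mse": float(mse),
            "rmse": float(rmse),
        })
    return records

records = parse_rows(ROWS)
best_qwk = max(records, key=lambda r: r["qwk"])
best_loss = min(records, key=lambda r: r["val_loss"])
# Steps per epoch implied by each row (step divided by epoch).
steps_per_epoch = {round(r["step"] / r["epoch"]) for r in records}

if __name__ == "__main__":
    print("highest Qwk in excerpt at step", best_qwk["step"])
    print("lowest validation loss in excerpt at step", best_loss["step"])
    print("steps per epoch:", steps_per_epoch)
```

In this excerpt the final row (step 548) is neither the best by Qwk nor by validation loss, so the headline results describe the last logged evaluation rather than the strongest one.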


### Framework versions

- Transformers 4.44.2
- Pytorch 2.4.0+cu118
- Datasets 2.21.0
- Tokenizers 0.19.1