---
license: apache-2.0
base_model: t5-small
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: war_tl_model
  results: []
---


# war_tl_model

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on an unknown dataset.
It achieves the following results on the evaluation set at the final epoch (200):
- Loss: 0.0083
- Bleu: 95.2691
- Gen Len: 5.3275

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 200
- mixed_precision_training: Native AMP
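
As a rough illustration of what these settings imply, the sketch below computes the learning rate under a linear schedule and the range of training-set sizes consistent with the step counts in the results table (54 steps per epoch, 10,800 steps in total). It assumes zero warmup steps, a single device, no gradient accumulation, and that the last partial batch is kept; none of these is stated in this card. The function names `linear_lr` and `train_size_bounds` are hypothetical helpers, not part of any library.

```python
def linear_lr(step: int, base_lr: float = 1e-3,
              total_steps: int = 10_800, warmup_steps: int = 0) -> float:
    """Learning rate after `step` optimizer updates.

    Linear warmup (if any) followed by linear decay to zero.
    warmup_steps=0 is an assumption; the card does not list a warmup value.
    """
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


def train_size_bounds(steps_per_epoch: int = 54, batch_size: int = 16) -> tuple[int, int]:
    """Smallest and largest example counts that give `steps_per_epoch` batches.

    Assumes one device, no gradient accumulation, and a kept final partial batch.
    """
    return (steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size


if __name__ == "__main__":
    print(linear_lr(0))          # start of training
    print(linear_lr(5_400))      # halfway, end of epoch 100
    print(linear_lr(10_800))     # end of training
    print(train_size_bounds())   # (849, 864)
```

Under these assumptions the training split would hold between 849 and 864 examples, which helps put the very low final validation loss in context.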

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| No log        | 1.0   | 54    | 2.8093          | 2.5958  | 6.0523  |
| No log        | 2.0   | 108   | 2.4043          | 3.1846  | 6.1382  |
| No log        | 3.0   | 162   | 1.9327          | 6.8308  | 6.4901  |
| No log        | 4.0   | 216   | 1.5969          | 13.8714 | 5.6562  |
| No log        | 5.0   | 270   | 1.2099          | 20.5562 | 5.9721  |
| No log        | 6.0   | 324   | 0.9304          | 31.1495 | 5.7038  |
| No log        | 7.0   | 378   | 0.7074          | 43.6407 | 5.7619  |
| No log        | 8.0   | 432   | 0.5408          | 49.2356 | 5.5772  |
| No log        | 9.0   | 486   | 0.3822          | 63.1038 | 5.5528  |
| 1.9648        | 10.0  | 540   | 0.2888          | 67.2835 | 5.5041  |
| 1.9648        | 11.0  | 594   | 0.1852          | 72.4324 | 5.3449  |
| 1.9648        | 12.0  | 648   | 0.1235          | 84.0315 | 5.36    |
| 1.9648        | 13.0  | 702   | 0.0831          | 88.3721 | 5.374   |
| 1.9648        | 14.0  | 756   | 0.0629          | 87.43   | 5.3531  |
| 1.9648        | 15.0  | 810   | 0.0515          | 88.0698 | 5.3577  |
| 1.9648        | 16.0  | 864   | 0.0526          | 89.6299 | 5.3902  |
| 1.9648        | 17.0  | 918   | 0.0454          | 89.7151 | 5.3879  |
| 1.9648        | 18.0  | 972   | 0.0434          | 88.0326 | 5.3879  |
| 0.4211        | 19.0  | 1026  | 0.0375          | 89.9125 | 5.3229  |
| 0.4211        | 20.0  | 1080  | 0.0295          | 91.976  | 5.3554  |
| 0.4211        | 21.0  | 1134  | 0.0441          | 91.7403 | 5.3693  |
| 0.4211        | 22.0  | 1188  | 0.0290          | 92.0153 | 5.3461  |
| 0.4211        | 23.0  | 1242  | 0.0318          | 90.8522 | 5.3391  |
| 0.4211        | 24.0  | 1296  | 0.0343          | 91.9239 | 5.3856  |
| 0.4211        | 25.0  | 1350  | 0.0260          | 87.7878 | 5.3519  |
| 0.4211        | 26.0  | 1404  | 0.0332          | 90.3633 | 5.3751  |
| 0.4211        | 27.0  | 1458  | 0.0269          | 92.1404 | 5.3717  |
| 0.1559        | 28.0  | 1512  | 0.0323          | 93.0887 | 5.36    |
| 0.1559        | 29.0  | 1566  | 0.0326          | 94.5354 | 5.3566  |
| 0.1559        | 30.0  | 1620  | 0.0314          | 93.4507 | 5.374   |
| 0.1559        | 31.0  | 1674  | 0.0297          | 94.7939 | 5.3357  |
| 0.1559        | 32.0  | 1728  | 0.0282          | 92.2858 | 5.3531  |
| 0.1559        | 33.0  | 1782  | 0.0258          | 92.4661 | 5.3508  |
| 0.1559        | 34.0  | 1836  | 0.0252          | 91.6147 | 5.3577  |
| 0.1559        | 35.0  | 1890  | 0.0240          | 93.2291 | 5.3728  |
| 0.1559        | 36.0  | 1944  | 0.0157          | 93.4177 | 5.3844  |
| 0.1559        | 37.0  | 1998  | 0.0212          | 94.0209 | 5.3589  |
| 0.093         | 38.0  | 2052  | 0.0199          | 93.1765 | 5.3728  |
| 0.093         | 39.0  | 2106  | 0.0257          | 93.9608 | 5.3624  |
| 0.093         | 40.0  | 2160  | 0.0232          | 93.9594 | 5.3717  |
| 0.093         | 41.0  | 2214  | 0.0198          | 93.5332 | 5.3519  |
| 0.093         | 42.0  | 2268  | 0.0150          | 93.9354 | 5.3682  |
| 0.093         | 43.0  | 2322  | 0.0156          | 94.5189 | 5.3566  |
| 0.093         | 44.0  | 2376  | 0.0170          | 92.767  | 5.36    |
| 0.093         | 45.0  | 2430  | 0.0178          | 95.2076 | 5.3519  |
| 0.093         | 46.0  | 2484  | 0.0217          | 93.4226 | 5.3995  |
| 0.0655        | 47.0  | 2538  | 0.0181          | 93.0419 | 5.3612  |
| 0.0655        | 48.0  | 2592  | 0.0185          | 94.4578 | 5.3589  |
| 0.0655        | 49.0  | 2646  | 0.0210          | 93.3838 | 5.3577  |
| 0.0655        | 50.0  | 2700  | 0.0152          | 93.883  | 5.331   |
| 0.0655        | 51.0  | 2754  | 0.0182          | 93.8614 | 5.3914  |
| 0.0655        | 52.0  | 2808  | 0.0160          | 94.1816 | 5.3426  |
| 0.0655        | 53.0  | 2862  | 0.0158          | 94.2294 | 5.3484  |
| 0.0655        | 54.0  | 2916  | 0.0135          | 94.4382 | 5.3508  |
| 0.0655        | 55.0  | 2970  | 0.0151          | 93.8986 | 5.3612  |
| 0.0517        | 56.0  | 3024  | 0.0113          | 95.2691 | 5.3484  |
| 0.0517        | 57.0  | 3078  | 0.0130          | 95.0307 | 5.3519  |
| 0.0517        | 58.0  | 3132  | 0.0137          | 95.2281 | 5.3705  |
| 0.0517        | 59.0  | 3186  | 0.0115          | 95.2281 | 5.3786  |
| 0.0517        | 60.0  | 3240  | 0.0130          | 95.2486 | 5.3589  |
| 0.0517        | 61.0  | 3294  | 0.0119          | 95.2486 | 5.3635  |
| 0.0517        | 62.0  | 3348  | 0.0134          | 95.2486 | 5.3473  |
| 0.0517        | 63.0  | 3402  | 0.0151          | 95.1871 | 5.3798  |
| 0.0517        | 64.0  | 3456  | 0.0141          | 95.2076 | 5.3566  |
| 0.0357        | 65.0  | 3510  | 0.0139          | 94.6668 | 5.3566  |
| 0.0357        | 66.0  | 3564  | 0.0122          | 95.2281 | 5.3403  |
| 0.0357        | 67.0  | 3618  | 0.0172          | 95.2076 | 5.3484  |
| 0.0357        | 68.0  | 3672  | 0.0162          | 94.7725 | 5.3403  |
| 0.0357        | 69.0  | 3726  | 0.0121          | 95.2281 | 5.3473  |
| 0.0357        | 70.0  | 3780  | 0.0163          | 94.6668 | 5.3624  |
| 0.0357        | 71.0  | 3834  | 0.0117          | 95.2486 | 5.3473  |
| 0.0357        | 72.0  | 3888  | 0.0151          | 95.2486 | 5.3566  |
| 0.0357        | 73.0  | 3942  | 0.0104          | 95.2691 | 5.3554  |
| 0.0357        | 74.0  | 3996  | 0.0098          | 95.2691 | 5.3415  |
| 0.0342        | 75.0  | 4050  | 0.0117          | 95.2486 | 5.3438  |
| 0.0342        | 76.0  | 4104  | 0.0125          | 94.6872 | 5.367   |
| 0.0342        | 77.0  | 4158  | 0.0103          | 95.2486 | 5.3461  |
| 0.0342        | 78.0  | 4212  | 0.0113          | 95.2281 | 5.3635  |
| 0.0342        | 79.0  | 4266  | 0.0119          | 95.2691 | 5.374   |
| 0.0342        | 80.0  | 4320  | 0.0132          | 93.4378 | 5.3577  |
| 0.0342        | 81.0  | 4374  | 0.0102          | 94.728  | 5.3496  |
| 0.0342        | 82.0  | 4428  | 0.0156          | 94.6872 | 5.3821  |
| 0.0342        | 83.0  | 4482  | 0.0097          | 94.728  | 5.3357  |
| 0.0292        | 84.0  | 4536  | 0.0096          | 95.2486 | 5.3693  |
| 0.0292        | 85.0  | 4590  | 0.0104          | 95.2691 | 5.3647  |
| 0.0292        | 86.0  | 4644  | 0.0110          | 94.7064 | 5.3612  |
| 0.0292        | 87.0  | 4698  | 0.0094          | 94.7268 | 5.3496  |
| 0.0292        | 88.0  | 4752  | 0.0115          | 95.2486 | 5.36    |
| 0.0292        | 89.0  | 4806  | 0.0098          | 95.2691 | 5.36    |
| 0.0292        | 90.0  | 4860  | 0.0104          | 94.5404 | 5.3461  |
| 0.0292        | 91.0  | 4914  | 0.0103          | 94.6538 | 5.36    |
| 0.0292        | 92.0  | 4968  | 0.0096          | 95.2691 | 5.3624  |
| 0.0243        | 93.0  | 5022  | 0.0092          | 95.2486 | 5.3647  |
| 0.0243        | 94.0  | 5076  | 0.0095          | 95.2691 | 5.3461  |
| 0.0243        | 95.0  | 5130  | 0.0105          | 95.0189 | 5.3508  |
| 0.0243        | 96.0  | 5184  | 0.0111          | 95.1994 | 5.3763  |
| 0.0243        | 97.0  | 5238  | 0.0099          | 95.2691 | 5.3717  |
| 0.0243        | 98.0  | 5292  | 0.0102          | 95.2691 | 5.3484  |
| 0.0243        | 99.0  | 5346  | 0.0101          | 95.2691 | 5.374   |
| 0.0243        | 100.0 | 5400  | 0.0097          | 95.2486 | 5.3426  |
| 0.0243        | 101.0 | 5454  | 0.0095          | 95.2691 | 5.3508  |
| 0.0233        | 102.0 | 5508  | 0.0098          | 95.2691 | 5.3531  |
| 0.0233        | 103.0 | 5562  | 0.0095          | 95.2691 | 5.3624  |
| 0.0233        | 104.0 | 5616  | 0.0091          | 95.2691 | 5.3461  |
| 0.0233        | 105.0 | 5670  | 0.0105          | 95.2691 | 5.36    |
| 0.0233        | 106.0 | 5724  | 0.0137          | 95.2486 | 5.3554  |
| 0.0233        | 107.0 | 5778  | 0.0108          | 95.2691 | 5.3577  |
| 0.0233        | 108.0 | 5832  | 0.0094          | 95.2691 | 5.3717  |
| 0.0233        | 109.0 | 5886  | 0.0095          | 95.2691 | 5.3531  |
| 0.0233        | 110.0 | 5940  | 0.0096          | 95.2691 | 5.3415  |
| 0.0233        | 111.0 | 5994  | 0.0094          | 95.2486 | 5.3589  |
| 0.02          | 112.0 | 6048  | 0.0092          | 95.2486 | 5.3519  |
| 0.02          | 113.0 | 6102  | 0.0091          | 94.905  | 5.3635  |
| 0.02          | 114.0 | 6156  | 0.0091          | 95.2691 | 5.3624  |
| 0.02          | 115.0 | 6210  | 0.0090          | 95.2691 | 5.3368  |
| 0.02          | 116.0 | 6264  | 0.0094          | 95.2486 | 5.3542  |
| 0.02          | 117.0 | 6318  | 0.0133          | 95.2486 | 5.3519  |
| 0.02          | 118.0 | 6372  | 0.0112          | 95.2691 | 5.3531  |
| 0.02          | 119.0 | 6426  | 0.0115          | 95.2486 | 5.3496  |
| 0.02          | 120.0 | 6480  | 0.0091          | 95.2691 | 5.3391  |
| 0.0181        | 121.0 | 6534  | 0.0089          | 95.2691 | 5.3368  |
| 0.0181        | 122.0 | 6588  | 0.0090          | 95.2691 | 5.3647  |
| 0.0181        | 123.0 | 6642  | 0.0096          | 95.2691 | 5.3786  |
| 0.0181        | 124.0 | 6696  | 0.0091          | 95.2691 | 5.381   |
| 0.0181        | 125.0 | 6750  | 0.0093          | 95.2691 | 5.3531  |
| 0.0181        | 126.0 | 6804  | 0.0098          | 95.2691 | 5.3554  |
| 0.0181        | 127.0 | 6858  | 0.0093          | 95.2691 | 5.3624  |
| 0.0181        | 128.0 | 6912  | 0.0089          | 95.2691 | 5.3693  |
| 0.0181        | 129.0 | 6966  | 0.0088          | 95.2691 | 5.374   |
| 0.0155        | 130.0 | 7020  | 0.0094          | 95.2691 | 5.36    |
| 0.0155        | 131.0 | 7074  | 0.0091          | 95.2691 | 5.3415  |
| 0.0155        | 132.0 | 7128  | 0.0088          | 95.2691 | 5.3484  |
| 0.0155        | 133.0 | 7182  | 0.0090          | 95.2691 | 5.3624  |
| 0.0155        | 134.0 | 7236  | 0.0088          | 95.2691 | 5.3554  |
| 0.0155        | 135.0 | 7290  | 0.0089          | 95.2691 | 5.3693  |
| 0.0155        | 136.0 | 7344  | 0.0090          | 95.2691 | 5.3577  |
| 0.0155        | 137.0 | 7398  | 0.0094          | 95.2486 | 5.3357  |
| 0.0155        | 138.0 | 7452  | 0.0092          | 95.2691 | 5.3368  |
| 0.0147        | 139.0 | 7506  | 0.0090          | 95.2691 | 5.3508  |
| 0.0147        | 140.0 | 7560  | 0.0089          | 95.2691 | 5.3647  |
| 0.0147        | 141.0 | 7614  | 0.0090          | 95.2691 | 5.3577  |
| 0.0147        | 142.0 | 7668  | 0.0089          | 95.2691 | 5.3531  |
| 0.0147        | 143.0 | 7722  | 0.0090          | 95.2691 | 5.3484  |
| 0.0147        | 144.0 | 7776  | 0.0096          | 94.112  | 5.3519  |
| 0.0147        | 145.0 | 7830  | 0.0090          | 95.2691 | 5.3624  |
| 0.0147        | 146.0 | 7884  | 0.0090          | 95.2691 | 5.3647  |
| 0.0147        | 147.0 | 7938  | 0.0090          | 95.2691 | 5.36    |
| 0.0147        | 148.0 | 7992  | 0.0090          | 95.2691 | 5.3647  |
| 0.0146        | 149.0 | 8046  | 0.0093          | 95.2691 | 5.3624  |
| 0.0146        | 150.0 | 8100  | 0.0090          | 95.2691 | 5.367   |
| 0.0146        | 151.0 | 8154  | 0.0087          | 95.2691 | 5.3531  |
| 0.0146        | 152.0 | 8208  | 0.0090          | 95.2691 | 5.3484  |
| 0.0146        | 153.0 | 8262  | 0.0088          | 95.2691 | 5.3554  |
| 0.0146        | 154.0 | 8316  | 0.0088          | 94.728  | 5.3612  |
| 0.0146        | 155.0 | 8370  | 0.0086          | 95.2691 | 5.3554  |
| 0.0146        | 156.0 | 8424  | 0.0085          | 95.2691 | 5.3461  |
| 0.0146        | 157.0 | 8478  | 0.0085          | 95.2691 | 5.3415  |
| 0.0125        | 158.0 | 8532  | 0.0084          | 95.2691 | 5.3484  |
| 0.0125        | 159.0 | 8586  | 0.0086          | 95.2691 | 5.3647  |
| 0.0125        | 160.0 | 8640  | 0.0088          | 95.2691 | 5.3368  |
| 0.0125        | 161.0 | 8694  | 0.0086          | 95.2691 | 5.3415  |
| 0.0125        | 162.0 | 8748  | 0.0086          | 95.2691 | 5.3508  |
| 0.0125        | 163.0 | 8802  | 0.0087          | 95.2691 | 5.3647  |
| 0.0125        | 164.0 | 8856  | 0.0086          | 95.2691 | 5.3531  |
| 0.0125        | 165.0 | 8910  | 0.0086          | 95.2691 | 5.3461  |
| 0.0125        | 166.0 | 8964  | 0.0086          | 95.2691 | 5.3508  |
| 0.012         | 167.0 | 9018  | 0.0087          | 95.2691 | 5.3415  |
| 0.012         | 168.0 | 9072  | 0.0087          | 95.2691 | 5.3577  |
| 0.012         | 169.0 | 9126  | 0.0087          | 95.2691 | 5.3508  |
| 0.012         | 170.0 | 9180  | 0.0086          | 95.2691 | 5.36    |
| 0.012         | 171.0 | 9234  | 0.0086          | 95.2691 | 5.3577  |
| 0.012         | 172.0 | 9288  | 0.0086          | 95.2691 | 5.3717  |
| 0.012         | 173.0 | 9342  | 0.0084          | 95.2691 | 5.3624  |
| 0.012         | 174.0 | 9396  | 0.0085          | 95.2691 | 5.3647  |
| 0.012         | 175.0 | 9450  | 0.0084          | 95.2691 | 5.3577  |
| 0.0116        | 176.0 | 9504  | 0.0084          | 95.2691 | 5.3554  |
| 0.0116        | 177.0 | 9558  | 0.0083          | 95.2691 | 5.3438  |
| 0.0116        | 178.0 | 9612  | 0.0084          | 95.2691 | 5.36    |
| 0.0116        | 179.0 | 9666  | 0.0084          | 95.2691 | 5.36    |
| 0.0116        | 180.0 | 9720  | 0.0085          | 95.2691 | 5.3415  |
| 0.0116        | 181.0 | 9774  | 0.0084          | 95.2691 | 5.3484  |
| 0.0116        | 182.0 | 9828  | 0.0084          | 95.2691 | 5.3484  |
| 0.0116        | 183.0 | 9882  | 0.0084          | 95.2691 | 5.3461  |
| 0.0116        | 184.0 | 9936  | 0.0084          | 95.2691 | 5.3508  |
| 0.0116        | 185.0 | 9990  | 0.0083          | 95.2691 | 5.3438  |
| 0.0103        | 186.0 | 10044 | 0.0082          | 95.2691 | 5.3438  |
| 0.0103        | 187.0 | 10098 | 0.0083          | 95.2691 | 5.3484  |
| 0.0103        | 188.0 | 10152 | 0.0083          | 95.2691 | 5.3368  |
| 0.0103        | 189.0 | 10206 | 0.0083          | 95.2691 | 5.3415  |
| 0.0103        | 190.0 | 10260 | 0.0083          | 95.2691 | 5.3298  |
| 0.0103        | 191.0 | 10314 | 0.0083          | 95.2691 | 5.3275  |
| 0.0103        | 192.0 | 10368 | 0.0083          | 95.2691 | 5.3275  |
| 0.0103        | 193.0 | 10422 | 0.0083          | 95.2691 | 5.3252  |
| 0.0103        | 194.0 | 10476 | 0.0083          | 95.2691 | 5.3275  |
| 0.0105        | 195.0 | 10530 | 0.0083          | 95.2691 | 5.3275  |
| 0.0105        | 196.0 | 10584 | 0.0083          | 95.2691 | 5.3275  |
| 0.0105        | 197.0 | 10638 | 0.0083          | 95.2691 | 5.3275  |
| 0.0105        | 198.0 | 10692 | 0.0083          | 95.2691 | 5.3275  |
| 0.0105        | 199.0 | 10746 | 0.0083          | 95.2691 | 5.3275  |
| 0.0105        | 200.0 | 10800 | 0.0083          | 95.2691 | 5.3275  |
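
A table of this length is easier to inspect programmatically. The following is a minimal sketch, not part of the original training code, that parses rows in the format above and picks out the epoch with the lowest validation loss and the first epoch that reaches the peak BLEU. The `SAMPLE` string holds five rows copied from the table; `parse_rows`, `best_by_loss`, and `first_peak_bleu` are hypothetical helper names.

```python
SAMPLE = """
| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| No log        | 1.0   | 54    | 2.8093          | 2.5958  | 6.0523  |
| 1.9648        | 10.0  | 540   | 0.2888          | 67.2835 | 5.5041  |
| 0.0517        | 56.0  | 3024  | 0.0113          | 95.2691 | 5.3484  |
| 0.0103        | 186.0 | 10044 | 0.0082          | 95.2691 | 5.3438  |
| 0.0105        | 200.0 | 10800 | 0.0083          | 95.2691 | 5.3275  |
"""


def parse_rows(markdown_table: str) -> list[dict]:
    """Parse data rows of the results table; header and separator rows are skipped."""
    rows = []
    for line in markdown_table.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 6:
            continue
        try:
            row = {
                # "No log" means the Trainer had not logged a training loss yet.
                "train_loss": None if cells[0] == "No log" else float(cells[0]),
                "epoch": float(cells[1]),
                "step": int(cells[2]),
                "val_loss": float(cells[3]),
                "bleu": float(cells[4]),
                "gen_len": float(cells[5]),
            }
        except ValueError:
            continue  # non-numeric cells: header or alignment row
        rows.append(row)
    return rows


def best_by_loss(rows: list[dict]) -> dict:
    """Row with the lowest validation loss (earliest row wins ties)."""
    return min(rows, key=lambda r: r["val_loss"])


def first_peak_bleu(rows: list[dict]) -> dict:
    """Earliest row whose BLEU equals the maximum BLEU in `rows`."""
    peak = max(r["bleu"] for r in rows)
    return next(r for r in rows if r["bleu"] == peak)


if __name__ == "__main__":
    rows = parse_rows(SAMPLE)
    print(best_by_loss(rows)["epoch"])     # 186.0 for this sample
    print(first_peak_bleu(rows)["epoch"])  # 56.0 for this sample
```

Run against the sample rows, this shows that BLEU plateaus long before validation loss reaches its minimum, so a shorter run or early stopping might give a comparable BLEU score.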


### Framework versions

- Transformers 4.35.2
- PyTorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.0