---
library_name: transformers
license: apache-2.0
base_model: google/flan-t5-small
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: flan-t5-rouge-squad-qg-testc
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# flan-t5-rouge-squad-qg-testc

This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.3164
- Rouge1: 0.3601
- Rouge2: 0.1205
- Rougel: 0.3353
- Rougelsum: 0.3469

## Model description

More information needed

## Intended uses & limitations

More information needed
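
Since usage details are not documented in this card, below is a minimal, hedged sketch of how a question-generation fine-tune of `google/flan-t5-small` is typically loaded and run with `transformers`. The repository id placeholder and the `generate question:` prompt prefix are assumptions for illustration only; the actual prompt format used during fine-tuning is not stated here.

```python
# Minimal usage sketch. The repo id and prompt prefix below are ASSUMPTIONS,
# not documented in this model card.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "<user>/flan-t5-rouge-squad-qg-testc"  # replace with the actual repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical question-generation prompt over a context passage.
context = "The Eiffel Tower was completed in 1889 and stands in Paris, France."
inputs = tokenizer("generate question: " + context, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```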

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 80
- eval_batch_size: 80
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 320
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 160
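
The following is a hedged sketch of `Seq2SeqTrainingArguments` that would correspond to the hyperparameters listed above; argument names follow Transformers 4.47. The output directory, per-epoch evaluation, and `predict_with_generate` setting are assumptions inferred from the card, and the dataset, collator, and metric function are not documented here.

```python
# Sketch only: reproduces the listed hyperparameters, other settings are assumed.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-rouge-squad-qg-testc",  # assumed
    learning_rate=1e-4,
    per_device_train_batch_size=80,
    per_device_eval_batch_size=80,
    gradient_accumulation_steps=4,   # effective train batch size: 80 * 4 = 320
    num_train_epochs=160,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    eval_strategy="epoch",           # assumed: the table below reports metrics once per epoch
    predict_with_generate=True,      # assumed: needed to compute ROUGE on generated text
)
```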

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 62.2537       | 1.0   | 3    | 29.8253         | 0.0752 | 0.0204 | 0.0667 | 0.0672    |
| 51.2172       | 2.0   | 6    | 23.9235         | 0.0670 | 0.0190 | 0.0607 | 0.0609    |
| 42.2884       | 3.0   | 9    | 18.5434         | 0.0527 | 0.0175 | 0.0494 | 0.0496    |
| 33.4669       | 4.0   | 12   | 12.7318         | 0.0825 | 0.0407 | 0.0824 | 0.0827    |
| 25.4026       | 5.0   | 15   | 8.2492          | 0.0706 | 0.0377 | 0.0701 | 0.0704    |
| 20.0834       | 6.0   | 18   | 7.4425          | 0.0662 | 0.0349 | 0.0664 | 0.0666    |
| 16.9821       | 7.0   | 21   | 6.9817          | 0.0789 | 0.0347 | 0.0759 | 0.0776    |
| 14.7652       | 8.0   | 24   | 5.9311          | 0.0961 | 0.0435 | 0.0905 | 0.0937    |
| 12.7754       | 9.0   | 27   | 4.9313          | 0.1005 | 0.0444 | 0.0841 | 0.0916    |
| 11.6278       | 10.0  | 30   | 4.7801          | 0.1311 | 0.0577 | 0.1158 | 0.1214    |
| 10.6391       | 11.0  | 33   | 4.5698          | 0.0978 | 0.0372 | 0.0872 | 0.0905    |
| 9.9925        | 12.0  | 36   | 4.3618          | 0.1011 | 0.0440 | 0.0892 | 0.0934    |
| 9.4969        | 13.0  | 39   | 4.2002          | 0.1331 | 0.0582 | 0.1167 | 0.1221    |
| 9.0639        | 14.0  | 42   | 4.0571          | 0.1327 | 0.0564 | 0.1165 | 0.1217    |
| 8.7064        | 15.0  | 45   | 3.9041          | 0.1484 | 0.0628 | 0.1267 | 0.1337    |
| 8.3122        | 16.0  | 48   | 3.7252          | 0.1528 | 0.0543 | 0.1264 | 0.1345    |
| 8.0191        | 17.0  | 51   | 3.5352          | 0.1356 | 0.0502 | 0.1129 | 0.1204    |
| 7.7028        | 18.0  | 54   | 3.3741          | 0.1175 | 0.0426 | 0.0978 | 0.1045    |
| 7.3704        | 19.0  | 57   | 3.2478          | 0.1111 | 0.0460 | 0.0947 | 0.1004    |
| 7.059         | 20.0  | 60   | 3.1444          | 0.1052 | 0.0359 | 0.0842 | 0.0900    |
| 6.798         | 21.0  | 63   | 3.0483          | 0.1240 | 0.0444 | 0.0990 | 0.1057    |
| 6.6172        | 22.0  | 66   | 2.9450          | 0.1281 | 0.0416 | 0.1018 | 0.1100    |
| 6.397         | 23.0  | 69   | 2.8270          | 0.1294 | 0.0426 | 0.1058 | 0.1154    |
| 6.1434        | 24.0  | 72   | 2.6957          | 0.1192 | 0.0405 | 0.1011 | 0.1072    |
| 5.9183        | 25.0  | 75   | 2.5599          | 0.1207 | 0.0367 | 0.0974 | 0.1027    |
| 5.7236        | 26.0  | 78   | 2.4320          | 0.1124 | 0.0342 | 0.0950 | 0.1001    |
| 5.5052        | 27.0  | 81   | 2.3171          | 0.1121 | 0.0367 | 0.0961 | 0.1004    |
| 5.3234        | 28.0  | 84   | 2.2137          | 0.1322 | 0.0453 | 0.1132 | 0.1197    |
| 5.1292        | 29.0  | 87   | 2.1201          | 0.1464 | 0.0526 | 0.1248 | 0.1322    |
| 4.9497        | 30.0  | 90   | 2.0297          | 0.2406 | 0.0923 | 0.2102 | 0.2278    |
| 4.7775        | 31.0  | 93   | 1.9376          | 0.2581 | 0.0958 | 0.2229 | 0.2445    |
| 4.5872        | 32.0  | 96   | 1.8415          | 0.2783 | 0.0982 | 0.2442 | 0.2654    |
| 4.4228        | 33.0  | 99   | 1.7448          | 0.2993 | 0.1034 | 0.2699 | 0.2878    |
| 4.2818        | 34.0  | 102  | 1.6519          | 0.3106 | 0.1082 | 0.2810 | 0.2983    |
| 4.0818        | 35.0  | 105  | 1.5682          | 0.3280 | 0.1101 | 0.2963 | 0.3144    |
| 3.9575        | 36.0  | 108  | 1.4935          | 0.3307 | 0.1099 | 0.2992 | 0.3180    |
| 3.8176        | 37.0  | 111  | 1.4235          | 0.3464 | 0.1150 | 0.3140 | 0.3318    |
| 3.666         | 38.0  | 114  | 1.3551          | 0.3496 | 0.1166 | 0.3173 | 0.3349    |
| 3.5058        | 39.0  | 117  | 1.2865          | 0.3496 | 0.1166 | 0.3173 | 0.3349    |
| 3.3658        | 40.0  | 120  | 1.2200          | 0.3475 | 0.1166 | 0.3155 | 0.3334    |
| 3.2795        | 41.0  | 123  | 1.1562          | 0.3522 | 0.1179 | 0.3226 | 0.3386    |
| 3.1434        | 42.0  | 126  | 1.0954          | 0.3522 | 0.1179 | 0.3226 | 0.3386    |
| 3.0247        | 43.0  | 129  | 1.0422          | 0.3522 | 0.1179 | 0.3226 | 0.3386    |
| 2.9343        | 44.0  | 132  | 0.9925          | 0.3529 | 0.1185 | 0.3238 | 0.3393    |
| 2.8065        | 45.0  | 135  | 0.9465          | 0.3529 | 0.1185 | 0.3238 | 0.3393    |
| 2.7406        | 46.0  | 138  | 0.9023          | 0.3529 | 0.1185 | 0.3238 | 0.3393    |
| 2.6367        | 47.0  | 141  | 0.8608          | 0.3551 | 0.1166 | 0.3249 | 0.3427    |
| 2.4855        | 48.0  | 144  | 0.8197          | 0.3592 | 0.1172 | 0.3284 | 0.3457    |
| 2.4782        | 49.0  | 147  | 0.7803          | 0.3592 | 0.1172 | 0.3284 | 0.3457    |
| 2.3351        | 50.0  | 150  | 0.7463          | 0.3586 | 0.1149 | 0.3253 | 0.3446    |
| 2.2519        | 51.0  | 153  | 0.7154          | 0.3655 | 0.1169 | 0.3281 | 0.3517    |
| 2.1864        | 52.0  | 156  | 0.6861          | 0.3676 | 0.1171 | 0.3293 | 0.3530    |
| 2.128         | 53.0  | 159  | 0.6597          | 0.3676 | 0.1171 | 0.3293 | 0.3530    |
| 2.0668        | 54.0  | 162  | 0.6343          | 0.3676 | 0.1171 | 0.3293 | 0.3530    |
| 2.013         | 55.0  | 165  | 0.6098          | 0.3658 | 0.1167 | 0.3277 | 0.3506    |
| 1.9364        | 56.0  | 168  | 0.5872          | 0.3658 | 0.1167 | 0.3277 | 0.3506    |
| 1.8327        | 57.0  | 171  | 0.5655          | 0.3658 | 0.1167 | 0.3277 | 0.3506    |
| 1.7749        | 58.0  | 174  | 0.5456          | 0.3659 | 0.1169 | 0.3282 | 0.3506    |
| 1.7399        | 59.0  | 177  | 0.5276          | 0.3659 | 0.1169 | 0.3282 | 0.3506    |
| 1.7449        | 60.0  | 180  | 0.5110          | 0.3659 | 0.1169 | 0.3282 | 0.3506    |
| 1.6973        | 61.0  | 183  | 0.4959          | 0.3659 | 0.1169 | 0.3282 | 0.3506    |
| 1.5943        | 62.0  | 186  | 0.4822          | 0.3658 | 0.1185 | 0.3280 | 0.3506    |
| 1.5571        | 63.0  | 189  | 0.4703          | 0.3666 | 0.1194 | 0.3308 | 0.3519    |
| 1.5806        | 64.0  | 192  | 0.4589          | 0.3666 | 0.1194 | 0.3308 | 0.3519    |
| 1.5002        | 65.0  | 195  | 0.4471          | 0.3666 | 0.1194 | 0.3308 | 0.3519    |
| 1.4634        | 66.0  | 198  | 0.4356          | 0.3666 | 0.1194 | 0.3308 | 0.3519    |
| 1.4553        | 67.0  | 201  | 0.4250          | 0.3697 | 0.1216 | 0.3344 | 0.3559    |
| 1.4035        | 68.0  | 204  | 0.4164          | 0.3701 | 0.1222 | 0.3350 | 0.3561    |
| 1.4084        | 69.0  | 207  | 0.4090          | 0.3739 | 0.1255 | 0.3378 | 0.3594    |
| 1.3806        | 70.0  | 210  | 0.4023          | 0.3739 | 0.1255 | 0.3378 | 0.3594    |
| 1.3048        | 71.0  | 213  | 0.3957          | 0.3728 | 0.1254 | 0.3373 | 0.3576    |
| 1.2709        | 72.0  | 216  | 0.3891          | 0.3728 | 0.1254 | 0.3373 | 0.3576    |
| 1.2735        | 73.0  | 219  | 0.3828          | 0.3728 | 0.1254 | 0.3373 | 0.3576    |
| 1.2733        | 74.0  | 222  | 0.3768          | 0.3728 | 0.1254 | 0.3373 | 0.3576    |
| 1.2215        | 75.0  | 225  | 0.3715          | 0.3728 | 0.1254 | 0.3373 | 0.3576    |
| 1.2225        | 76.0  | 228  | 0.3669          | 0.3732 | 0.1255 | 0.3374 | 0.3580    |
| 1.1829        | 77.0  | 231  | 0.3628          | 0.3732 | 0.1255 | 0.3374 | 0.3580    |
| 1.162         | 78.0  | 234  | 0.3591          | 0.3722 | 0.1244 | 0.3362 | 0.3570    |
| 1.097         | 79.0  | 237  | 0.3556          | 0.3715 | 0.1236 | 0.3355 | 0.3564    |
| 1.1702        | 80.0  | 240  | 0.3519          | 0.3715 | 0.1236 | 0.3355 | 0.3564    |
| 1.1309        | 81.0  | 243  | 0.3483          | 0.3764 | 0.1259 | 0.3400 | 0.3608    |
| 1.0986        | 82.0  | 246  | 0.3451          | 0.3563 | 0.1178 | 0.3261 | 0.3428    |
| 1.1109        | 83.0  | 249  | 0.3422          | 0.3563 | 0.1178 | 0.3261 | 0.3428    |
| 1.0752        | 84.0  | 252  | 0.3397          | 0.3553 | 0.1167 | 0.3257 | 0.3417    |
| 1.0475        | 85.0  | 255  | 0.3374          | 0.3553 | 0.1167 | 0.3257 | 0.3417    |
| 1.0736        | 86.0  | 258  | 0.3353          | 0.3553 | 0.1167 | 0.3257 | 0.3417    |
| 1.0723        | 87.0  | 261  | 0.3333          | 0.3552 | 0.1176 | 0.3249 | 0.3409    |
| 1.0326        | 88.0  | 264  | 0.3314          | 0.3536 | 0.1165 | 0.3236 | 0.3394    |
| 1.0742        | 89.0  | 267  | 0.3297          | 0.3536 | 0.1165 | 0.3236 | 0.3394    |
| 1.0081        | 90.0  | 270  | 0.3281          | 0.3583 | 0.1172 | 0.3282 | 0.3447    |
| 1.0158        | 91.0  | 273  | 0.3266          | 0.3583 | 0.1172 | 0.3282 | 0.3447    |
| 1.032         | 92.0  | 276  | 0.3252          | 0.3632 | 0.1213 | 0.3330 | 0.3497    |
| 0.9778        | 93.0  | 279  | 0.3239          | 0.3632 | 0.1213 | 0.3330 | 0.3497    |
| 0.9834        | 94.0  | 282  | 0.3228          | 0.3624 | 0.1221 | 0.3323 | 0.3492    |
| 0.9913        | 95.0  | 285  | 0.3218          | 0.3624 | 0.1221 | 0.3323 | 0.3492    |
| 0.9963        | 96.0  | 288  | 0.3210          | 0.3624 | 0.1221 | 0.3323 | 0.3492    |
| 0.9759        | 97.0  | 291  | 0.3202          | 0.3605 | 0.1208 | 0.3344 | 0.3465    |
| 1.0011        | 98.0  | 294  | 0.3195          | 0.3605 | 0.1208 | 0.3344 | 0.3465    |
| 0.9895        | 99.0  | 297  | 0.3189          | 0.3605 | 0.1208 | 0.3344 | 0.3465    |
| 0.926         | 100.0 | 300  | 0.3183          | 0.3605 | 0.1208 | 0.3344 | 0.3465    |
| 0.9347        | 101.0 | 303  | 0.3178          | 0.3605 | 0.1208 | 0.3344 | 0.3465    |
| 1.0039        | 102.0 | 306  | 0.3173          | 0.3605 | 0.1208 | 0.3344 | 0.3465    |
| 0.9693        | 103.0 | 309  | 0.3170          | 0.3603 | 0.1205 | 0.3353 | 0.3469    |
| 0.9754        | 104.0 | 312  | 0.3167          | 0.3601 | 0.1205 | 0.3353 | 0.3469    |
| 0.9872        | 105.0 | 315  | 0.3165          | 0.3601 | 0.1205 | 0.3353 | 0.3469    |
| 1.0003        | 106.0 | 318  | 0.3164          | 0.3601 | 0.1205 | 0.3353 | 0.3469    |
| 1.9782        | 106.8 | 320  | 0.3164          | 0.3601 | 0.1205 | 0.3353 | 0.3469    |
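
The ROUGE columns above are the standard metrics reported by the `evaluate` library; the exact metric code used during training is not shown in this card, but a minimal sketch of how such scores are typically computed is:

```python
# Hedged sketch: computes rouge1/rouge2/rougeL/rougeLsum on toy strings.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["what year was the eiffel tower completed?"]
references = ["in what year was the eiffel tower completed?"]
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```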


### Framework versions

- Transformers 4.47.1
- Pytorch 2.5.1+cu121
- Datasets 3.2.0
- Tokenizers 0.21.0