# flan-t5-rouge-durga-q5-clean-4b
This model is a fine-tuned version of google/flan-t5-base on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.0021
- Rouge1: 0.7378
- Rouge2: 0.7126
- RougeL: 0.7379
- RougeLsum: 0.7390
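For reference, ROUGE-1 is the unigram-overlap F1 score between a generated text and a reference text (ROUGE-2 uses bigrams, ROUGE-L the longest common subsequence). A minimal sketch of the ROUGE-1 idea in plain Python — an illustration only, not the `rouge_score` implementation the Trainer actually uses:

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between two whitespace-tokenized strings."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: each unigram counts at most as often as it
    # appears in either text (Counter intersection takes the minimum).
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

# 5 of 6 unigrams overlap in each direction, so F1 = 5/6 ≈ 0.8333.
print(round(rouge1_f1("the cat sat on the mat", "the cat lay on the mat"), 4))
```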
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 60
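With a linear scheduler and no warmup listed, the learning rate decays from 3e-4 at step 0 to 0 at the final step — 540 total steps here (9 steps/epoch × 60 epochs, which at batch size 24 suggests a training set of roughly 200–216 examples). A sketch of that decay, assuming the Transformers linear schedule with zero warmup steps:

```python
def linear_lr(step: int, base_lr: float = 3e-4, total_steps: int = 540) -> float:
    """Linearly decay the learning rate from base_lr at step 0 to 0 at total_steps."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps

# Learning rate at the start, midpoint, and end of training.
print(linear_lr(0), linear_lr(270), linear_lr(540))
```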
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum |
|---|---|---|---|---|---|---|---|
| 2.0584 | 1.0 | 9 | 1.6093 | 0.2822 | 0.0866 | 0.2756 | 0.2752 |
| 1.9958 | 2.0 | 18 | 1.1569 | 0.3261 | 0.1042 | 0.3177 | 0.3186 |
| 1.174 | 3.0 | 27 | 0.8836 | 0.3770 | 0.1669 | 0.3656 | 0.3660 |
| 1.1673 | 4.0 | 36 | 0.6420 | 0.3646 | 0.1590 | 0.3569 | 0.3580 |
| 1.0302 | 5.0 | 45 | 0.4727 | 0.3987 | 0.2234 | 0.3940 | 0.3943 |
| 0.6135 | 6.0 | 54 | 0.3187 | 0.4167 | 0.2439 | 0.4102 | 0.4102 |
| 0.5838 | 7.0 | 63 | 0.2294 | 0.4542 | 0.3007 | 0.4478 | 0.4462 |
| 0.4479 | 8.0 | 72 | 0.1891 | 0.4618 | 0.3175 | 0.4579 | 0.4569 |
| 0.3936 | 9.0 | 81 | 0.1373 | 0.4664 | 0.3152 | 0.4624 | 0.4606 |
| 0.3307 | 10.0 | 90 | 0.1073 | 0.5085 | 0.3889 | 0.5069 | 0.5064 |
| 0.3624 | 11.0 | 99 | 0.0845 | 0.5074 | 0.3887 | 0.5061 | 0.5055 |
| 0.1817 | 12.0 | 108 | 0.0702 | 0.5456 | 0.4416 | 0.5444 | 0.5436 |
| 0.2335 | 13.0 | 117 | 0.0705 | 0.5132 | 0.4077 | 0.5136 | 0.5125 |
| 0.1604 | 14.0 | 126 | 0.0650 | 0.5486 | 0.4418 | 0.5464 | 0.5455 |
| 0.1306 | 15.0 | 135 | 0.0540 | 0.5469 | 0.4508 | 0.5468 | 0.5467 |
| 0.1194 | 16.0 | 144 | 0.0489 | 0.5935 | 0.5103 | 0.5925 | 0.5931 |
| 0.2133 | 17.0 | 153 | 0.0441 | 0.5746 | 0.4862 | 0.5732 | 0.5736 |
| 0.1035 | 18.0 | 162 | 0.0425 | 0.5799 | 0.4981 | 0.5786 | 0.5798 |
| 0.1049 | 19.0 | 171 | 0.0333 | 0.6341 | 0.5608 | 0.6325 | 0.6325 |
| 0.1165 | 20.0 | 180 | 0.0287 | 0.6398 | 0.5755 | 0.6390 | 0.6379 |
| 0.1197 | 21.0 | 189 | 0.0300 | 0.5988 | 0.5223 | 0.5995 | 0.5996 |
| 0.0607 | 22.0 | 198 | 0.0245 | 0.6465 | 0.5810 | 0.6458 | 0.6453 |
| 0.1443 | 23.0 | 207 | 0.0238 | 0.6454 | 0.5820 | 0.6475 | 0.6470 |
| 0.0727 | 24.0 | 216 | 0.0188 | 0.6769 | 0.6239 | 0.6764 | 0.6770 |
| 0.0462 | 25.0 | 225 | 0.0177 | 0.6926 | 0.6368 | 0.6918 | 0.6923 |
| 0.0804 | 26.0 | 234 | 0.0132 | 0.6979 | 0.6512 | 0.6975 | 0.6988 |
| 0.0337 | 27.0 | 243 | 0.0135 | 0.6971 | 0.6450 | 0.6970 | 0.6977 |
| 0.0459 | 28.0 | 252 | 0.0131 | 0.7019 | 0.6564 | 0.7019 | 0.7029 |
| 0.0233 | 29.0 | 261 | 0.0102 | 0.7089 | 0.6671 | 0.7096 | 0.7096 |
| 0.0228 | 30.0 | 270 | 0.0112 | 0.7057 | 0.6645 | 0.7055 | 0.7063 |
| 0.0435 | 31.0 | 279 | 0.0080 | 0.7125 | 0.6717 | 0.7117 | 0.7130 |
| 0.0364 | 32.0 | 288 | 0.0114 | 0.7108 | 0.6653 | 0.7102 | 0.7098 |
| 0.0112 | 33.0 | 297 | 0.0086 | 0.7184 | 0.6786 | 0.7182 | 0.7192 |
| 0.0325 | 34.0 | 306 | 0.0068 | 0.7268 | 0.6917 | 0.7267 | 0.7274 |
| 0.0173 | 35.0 | 315 | 0.0052 | 0.7327 | 0.7016 | 0.7317 | 0.7330 |
| 0.0599 | 36.0 | 324 | 0.0058 | 0.7291 | 0.6969 | 0.7297 | 0.7293 |
| 0.0125 | 37.0 | 333 | 0.0044 | 0.7336 | 0.7057 | 0.7338 | 0.7347 |
| 0.0155 | 38.0 | 342 | 0.0054 | 0.7238 | 0.6865 | 0.7241 | 0.7246 |
| 0.0199 | 39.0 | 351 | 0.0050 | 0.7293 | 0.6970 | 0.7294 | 0.7295 |
| 0.0109 | 40.0 | 360 | 0.0035 | 0.7348 | 0.7077 | 0.7352 | 0.7355 |
| 0.0229 | 41.0 | 369 | 0.0034 | 0.7348 | 0.7077 | 0.7352 | 0.7355 |
| 0.0353 | 42.0 | 378 | 0.0033 | 0.7348 | 0.7077 | 0.7352 | 0.7355 |
| 0.0124 | 43.0 | 387 | 0.0035 | 0.7357 | 0.7080 | 0.7359 | 0.7364 |
| 0.0147 | 44.0 | 396 | 0.0033 | 0.7330 | 0.7032 | 0.7333 | 0.7331 |
| 0.0055 | 45.0 | 405 | 0.0032 | 0.7322 | 0.7023 | 0.7324 | 0.7325 |
| 0.0183 | 46.0 | 414 | 0.0031 | 0.7378 | 0.7126 | 0.7379 | 0.7390 |
| 0.004 | 47.0 | 423 | 0.0033 | 0.7350 | 0.7069 | 0.7353 | 0.7365 |
| 0.0195 | 48.0 | 432 | 0.0032 | 0.7331 | 0.7019 | 0.7323 | 0.7333 |
| 0.0112 | 49.0 | 441 | 0.0031 | 0.7378 | 0.7126 | 0.7379 | 0.7390 |
| 0.0186 | 50.0 | 450 | 0.0029 | 0.7378 | 0.7126 | 0.7379 | 0.7390 |
| 0.0043 | 51.0 | 459 | 0.0028 | 0.7378 | 0.7126 | 0.7379 | 0.7390 |
| 0.011 | 52.0 | 468 | 0.0023 | 0.7378 | 0.7126 | 0.7379 | 0.7390 |
| 0.0203 | 53.0 | 477 | 0.0021 | 0.7378 | 0.7126 | 0.7379 | 0.7390 |
| 0.0099 | 54.0 | 486 | 0.0021 | 0.7377 | 0.7128 | 0.7376 | 0.7391 |
| 0.0095 | 55.0 | 495 | 0.0021 | 0.7378 | 0.7126 | 0.7379 | 0.7390 |
| 0.021 | 56.0 | 504 | 0.0021 | 0.7378 | 0.7126 | 0.7379 | 0.7390 |
| 0.0191 | 57.0 | 513 | 0.0022 | 0.7378 | 0.7126 | 0.7379 | 0.7390 |
| 0.0033 | 58.0 | 522 | 0.0021 | 0.7378 | 0.7126 | 0.7379 | 0.7390 |
| 0.0264 | 59.0 | 531 | 0.0021 | 0.7378 | 0.7126 | 0.7379 | 0.7390 |
| 0.0034 | 60.0 | 540 | 0.0021 | 0.7378 | 0.7126 | 0.7379 | 0.7390 |
### Framework versions
- Transformers 4.46.0
- Pytorch 2.5.0+cu121
- Datasets 3.0.2
- Tokenizers 0.20.1