---
library_name: transformers
license: apache-2.0
base_model: t5-base
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: t5-base-rouge-squad-qg
  results: []
---

# t5-base-rouge-squad-qg

This model is a fine-tuned version of [t5-base](https://huggingface.co/t5-base) on an unknown dataset (judging by the model name, SQuAD-style question generation).
It achieves the following results on the evaluation set:
- Loss: 0.3358
- Rouge1: 0.3098
- Rouge2: 0.0914
- RougeL: 0.2967
- RougeLsum: 0.3043

## Model description

A sequence-to-sequence model based on [t5-base](https://huggingface.co/t5-base). The model name suggests it was fine-tuned for question generation on SQuAD-style data and evaluated with ROUGE, but the fine-tuning dataset itself was not recorded by the Trainer.

## Intended uses & limitations

The name suggests the model is intended to generate questions from input passages. Note that training ran for only 3 optimizer steps per epoch at batch size 24 (see the results table below), which implies a training split of at most ~72 examples; after 30 epochs over so little data, the model has likely memorized its training set and should not be expected to generalize without further evaluation.

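A minimal sketch of how the model might be queried for question generation. The `"generate question:"` task prefix and the `answer:` marker are assumptions — the actual training prompt format is not recorded in this card, so adjust them to match how the model was fine-tuned.

```python
def build_qg_input(context: str, answer: str) -> str:
    """Build a T5-style input string for question generation.

    The task prefix and answer marker here are hypothetical; they must
    match whatever format was used during fine-tuning.
    """
    return f"generate question: {context} answer: {answer}"

def generate_question(context: str, answer: str,
                      model_name: str = "t5-base-rouge-squad-qg") -> str:
    # Imported lazily so build_qg_input stays dependency-free.
    from transformers import pipeline
    qg = pipeline("text2text-generation", model=model_name)
    return qg(build_qg_input(context, answer))[0]["generated_text"]
```

For example, `generate_question("The Eiffel Tower is in Paris.", "Paris")` would return a model-generated question; the exact output depends on the checkpoint and prompt format.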
## Training and evaluation data

The dataset was not recorded by the Trainer. From the training log (3 steps per epoch at batch size 24), the training split contains at most ~72 examples.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 30

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 4.2041        | 1.0   | 3    | 2.1811          | 0.1282 | 0.0401 | 0.1198 | 0.1266    |
| 1.7635        | 2.0   | 6    | 0.6616          | 0.0    | 0.0    | 0.0    | 0.0       |
| 0.8464        | 3.0   | 9    | 0.5626          | 0.0    | 0.0    | 0.0    | 0.0       |
| 0.6561        | 4.0   | 12   | 0.4082          | 0.1282 | 0.0401 | 0.1198 | 0.1266    |
| 0.723         | 5.0   | 15   | 0.3290          | 0.1282 | 0.0401 | 0.1198 | 0.1266    |
| 0.2988        | 6.0   | 18   | 0.2900          | 0.3815 | 0.1296 | 0.3636 | 0.3763    |
| 0.1786        | 7.0   | 21   | 0.2800          | 0.3815 | 0.1296 | 0.3636 | 0.3763    |
| 0.2887        | 8.0   | 24   | 0.2849          | 0.4952 | 0.1959 | 0.4743 | 0.4872    |
| 0.3224        | 9.0   | 27   | 0.2869          | 0.3017 | 0.0980 | 0.2813 | 0.2986    |
| 0.5636        | 10.0  | 30   | 0.2889          | 0.3017 | 0.0980 | 0.2813 | 0.2986    |
| 0.271         | 11.0  | 33   | 0.2968          | 0.2384 | 0.0879 | 0.2257 | 0.2335    |
| 0.1144        | 12.0  | 36   | 0.3020          | 0.2473 | 0.0854 | 0.2343 | 0.2357    |
| 0.1005        | 13.0  | 39   | 0.3084          | 0.2517 | 0.0914 | 0.2388 | 0.2492    |
| 0.3569        | 14.0  | 42   | 0.3118          | 0.3098 | 0.0914 | 0.2967 | 0.3043    |
| 0.1051        | 15.0  | 45   | 0.3117          | 0.3098 | 0.0914 | 0.2967 | 0.3043    |
| 0.2862        | 16.0  | 48   | 0.3139          | 0.2950 | 0.1310 | 0.2818 | 0.2903    |
| 0.2077        | 17.0  | 51   | 0.3178          | 0.2950 | 0.1310 | 0.2818 | 0.2903    |
| 0.1055        | 18.0  | 54   | 0.3239          | 0.4408 | 0.1350 | 0.4223 | 0.4368    |
| 0.1761        | 19.0  | 57   | 0.3325          | 0.4408 | 0.1350 | 0.4223 | 0.4368    |
| 0.0704        | 20.0  | 60   | 0.3416          | 0.3098 | 0.0914 | 0.2967 | 0.3043    |
| 0.3277        | 21.0  | 63   | 0.3445          | 0.3098 | 0.0914 | 0.2967 | 0.3043    |
| 0.0859        | 22.0  | 66   | 0.3435          | 0.3098 | 0.0914 | 0.2967 | 0.3043    |
| 0.268         | 23.0  | 69   | 0.3412          | 0.3098 | 0.0914 | 0.2967 | 0.3043    |
| 0.1323        | 24.0  | 72   | 0.3378          | 0.3098 | 0.0914 | 0.2967 | 0.3043    |
| 0.0744        | 25.0  | 75   | 0.3351          | 0.3098 | 0.0914 | 0.2967 | 0.3043    |
| 0.1864        | 26.0  | 78   | 0.3343          | 0.3098 | 0.0914 | 0.2967 | 0.3043    |
| 0.1473        | 27.0  | 81   | 0.3341          | 0.3098 | 0.0914 | 0.2967 | 0.3043    |
| 0.0461        | 28.0  | 84   | 0.3346          | 0.3098 | 0.0914 | 0.2967 | 0.3043    |
| 0.0614        | 29.0  | 87   | 0.3354          | 0.3098 | 0.0914 | 0.2967 | 0.3043    |
| 0.0766        | 30.0  | 90   | 0.3358          | 0.3098 | 0.0914 | 0.2967 | 0.3043    |

### Framework versions

- Transformers 4.47.0
- Pytorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.21.0