# extract_long_text_unbalanced_smaller_6

This model is a fine-tuned version of weny22/sum_model_t5_saved on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 2.2687
- Rouge1: 0.2023
- Rouge2: 0.0727
- Rougel: 0.1638
- Rougelsum: 0.1637
- Gen Len: 18.9793
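The Rouge1/Rouge2/RougeL scores above measure n-gram overlap between generated and reference summaries. As a minimal illustration of what these numbers mean (not the official `rouge_score` implementation used by the Trainer), here is a sketch of an n-gram ROUGE F1:

```python
from collections import Counter

def rouge_n_f1(candidate: str, reference: str, n: int = 1) -> float:
    """F1 over n-gram overlap between a candidate and a reference summary."""
    def ngrams(text: str, n: int) -> Counter:
        toks = text.lower().split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))

    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    if not cand or not ref:
        return 0.0
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```

With `n=1` this corresponds to Rouge1; the reported scores were computed on the full evaluation set with the standard stemming/tokenization of the `rouge_score` package, so values from this sketch will differ slightly.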
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.002
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
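The hyperparameters above can be collected in one place for reproduction. The field names below mirror `transformers.Seq2SeqTrainingArguments`; passing them as `Seq2SeqTrainingArguments(output_dir=..., **hparams)` is an assumption about how the run was launched, not something stated on this card:

```python
# Hyperparameters transcribed from the list above.
hparams = {
    "learning_rate": 2e-3,
    "per_device_train_batch_size": 64,
    "per_device_eval_batch_size": 64,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 20,
}
```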
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 335 | 2.3733 | 0.1898 | 0.059 | 0.1502 | 0.1501 | 18.9833 |
| 3.2662 | 2.0 | 670 | 2.2397 | 0.185 | 0.0609 | 0.1493 | 0.1492 | 18.962 |
| 2.392 | 3.0 | 1005 | 2.1809 | 0.191 | 0.0655 | 0.1545 | 0.1544 | 18.9607 |
| 2.392 | 4.0 | 1340 | 2.1512 | 0.1923 | 0.0675 | 0.1552 | 0.1551 | 18.972 |
| 2.1606 | 5.0 | 1675 | 2.1182 | 0.1938 | 0.0689 | 0.1571 | 0.1571 | 18.974 |
| 2.0276 | 6.0 | 2010 | 2.1242 | 0.1944 | 0.0671 | 0.1576 | 0.1575 | 18.9733 |
| 2.0276 | 7.0 | 2345 | 2.1003 | 0.1944 | 0.0683 | 0.1578 | 0.1578 | 18.9833 |
| 1.8856 | 8.0 | 2680 | 2.1301 | 0.1993 | 0.0718 | 0.1619 | 0.162 | 18.9907 |
| 1.7948 | 9.0 | 3015 | 2.1103 | 0.1978 | 0.0692 | 0.1605 | 0.1605 | 18.978 |
| 1.7948 | 10.0 | 3350 | 2.1220 | 0.1999 | 0.0716 | 0.1628 | 0.1627 | 18.9773 |
| 1.683 | 11.0 | 3685 | 2.1290 | 0.1981 | 0.0699 | 0.1604 | 0.1606 | 18.982 |
| 1.6151 | 12.0 | 4020 | 2.1430 | 0.2024 | 0.0727 | 0.1632 | 0.1632 | 18.9793 |
| 1.6151 | 13.0 | 4355 | 2.1486 | 0.1976 | 0.0721 | 0.1606 | 0.1607 | 18.9813 |
| 1.5263 | 14.0 | 4690 | 2.1857 | 0.2032 | 0.0737 | 0.1645 | 0.1645 | 18.9873 |
| 1.4647 | 15.0 | 5025 | 2.2031 | 0.2029 | 0.0719 | 0.1639 | 0.164 | 18.9873 |
| 1.4647 | 16.0 | 5360 | 2.2044 | 0.2043 | 0.0744 | 0.1659 | 0.1659 | 18.9853 |
| 1.3972 | 17.0 | 5695 | 2.2325 | 0.2031 | 0.0724 | 0.1637 | 0.1638 | 18.9867 |
| 1.3577 | 18.0 | 6030 | 2.2473 | 0.2031 | 0.0724 | 0.164 | 0.1639 | 18.98 |
| 1.3577 | 19.0 | 6365 | 2.2544 | 0.2029 | 0.0733 | 0.1638 | 0.164 | 18.984 |
| 1.3132 | 20.0 | 6700 | 2.2687 | 0.2023 | 0.0727 | 0.1638 | 0.1637 | 18.9793 |
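The training data itself is undocumented, but the step counts in the table bound its size: each epoch takes 335 optimizer steps at a batch size of 64. A small arithmetic sketch (assuming one device, no gradient accumulation, and that the final batch may be partial):

```python
steps_per_epoch = 335  # from the table: step 335 at epoch 1.0
batch_size = 64        # train_batch_size from the hyperparameters

# With a possibly-partial last batch, the dataset size lies in
# ((steps_per_epoch - 1) * batch_size, steps_per_epoch * batch_size].
lower = (steps_per_epoch - 1) * batch_size  # 21376
upper = steps_per_epoch * batch_size        # 21440
print(f"training set size in ({lower}, {upper}]")
```

Note also that validation loss bottoms out around epoch 7 (2.1003) and rises thereafter while ROUGE plateaus, suggesting the later epochs overfit.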
### Framework versions
- Transformers 4.39.1
- Pytorch 2.2.2+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2
## Model tree

- Model repository: Sheng-Yen/extract_long_text_unbalanced_smaller_6
- Base model: weny22/sum_model_t5_saved