---
base_model: google/mt5-base
library_name: transformers
license: apache-2.0
metrics:
- rouge
tags:
- generated_from_trainer
model-index:
- name: mt5-rouge-durga-2
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# mt5-rouge-durga-2

This model is a fine-tuned version of [google/mt5-base](https://huggingface.co/google/mt5-base) on an unknown dataset.
It achieves the following results on the evaluation set after the final epoch (epoch 30, step 2550):
- Loss: 0.0126
- Rouge1: 0.6270
- Rouge2: 0.6003
- Rougel: 0.6244
- Rougelsum: 0.6247
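
For readers unfamiliar with the metric, the sketch below shows roughly what a ROUGE-N score measures: the n-gram overlap between a generated text and a reference, combined into an F1 value. It is a simplified illustration only. The function names `ngrams` and `rouge_n_f1` are hypothetical, it assumes whitespace tokenization with no stemming, and it is not the scoring code that produced the numbers above (the card does not say which implementation or settings were used).

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(prediction, reference, n=1):
    """Simplified ROUGE-N F1: whitespace tokens, lowercased, no stemming."""
    pred = ngrams(prediction.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((pred & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge_n_f1("a b c d", "a b x y", n=1))
    print(rouge_n_f1("a b c d", "a b x y", n=2))
```

Rouge1 and Rouge2 correspond to `n=1` and `n=2` in this sketch; Rougel and Rougelsum are based on the longest common subsequence and are not covered here.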

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
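
To make the `linear` scheduler setting concrete, here is a minimal sketch of how the learning rate would decay over the run. It assumes 2550 total optimizer steps (the last Step value in the training results table) and zero warmup steps, since no warmup is listed on this card. The function name `linear_lr` is hypothetical and this is a re-derivation of the usual linear-decay rule, not code taken from the training script.

```python
BASE_LR = 3e-4       # learning_rate from the list above
TOTAL_STEPS = 2550   # assumption: final Step value in the training results table
WARMUP_STEPS = 0     # assumption: no warmup is listed on this card


def linear_lr(step, base_lr=BASE_LR, total=TOTAL_STEPS, warmup=WARMUP_STEPS):
    """Learning rate at a given step under linear warmup then linear decay to 0."""
    if step < warmup:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup)
    # Decay linearly from base_lr at the end of warmup to 0 at the final step.
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    for step in (0, 850, 1275, 2550):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 0.0003, reaches half that value at step 1275 (epoch 15), and hits zero at step 2550.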

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 4.989         | 1.0   | 85   | 2.8197          | 0.2164 | 0.0941 | 0.1882 | 0.1883    |
| 3.116         | 2.0   | 170  | 2.0798          | 0.3122 | 0.1588 | 0.2604 | 0.2604    |
| 2.8357        | 3.0   | 255  | 1.5681          | 0.3446 | 0.1935 | 0.2953 | 0.2955    |
| 1.7776        | 4.0   | 340  | 1.1806          | 0.3324 | 0.1952 | 0.2895 | 0.2904    |
| 1.1881        | 5.0   | 425  | 0.9407          | 0.3533 | 0.2228 | 0.3088 | 0.3091    |
| 1.8511        | 6.0   | 510  | 0.6826          | 0.3971 | 0.2700 | 0.3644 | 0.3636    |
| 1.7178        | 7.0   | 595  | 0.5128          | 0.4194 | 0.3120 | 0.3894 | 0.3891    |
| 1.2772        | 8.0   | 680  | 0.3878          | 0.4590 | 0.3619 | 0.4311 | 0.4302    |
| 1.3577        | 9.0   | 765  | 0.2709          | 0.4729 | 0.3881 | 0.4499 | 0.4497    |
| 0.8291        | 10.0  | 850  | 0.2005          | 0.5006 | 0.4276 | 0.4748 | 0.4747    |
| 0.6825        | 11.0  | 935  | 0.1616          | 0.5411 | 0.4732 | 0.5215 | 0.5224    |
| 0.5006        | 12.0  | 1020 | 0.1182          | 0.5348 | 0.4782 | 0.5200 | 0.5196    |
| 0.5193        | 13.0  | 1105 | 0.1027          | 0.5446 | 0.4910 | 0.5269 | 0.5286    |
| 0.3933        | 14.0  | 1190 | 0.0881          | 0.5685 | 0.5200 | 0.5535 | 0.5548    |
| 0.1584        | 15.0  | 1275 | 0.0708          | 0.5719 | 0.5327 | 0.5629 | 0.5645    |
| 0.3657        | 16.0  | 1360 | 0.0646          | 0.5763 | 0.5315 | 0.5648 | 0.5659    |
| 0.2731        | 17.0  | 1445 | 0.0525          | 0.5908 | 0.5500 | 0.5844 | 0.5844    |
| 0.3466        | 18.0  | 1530 | 0.0511          | 0.5971 | 0.5596 | 0.5873 | 0.5886    |
| 0.1892        | 19.0  | 1615 | 0.0384          | 0.6044 | 0.5675 | 0.5991 | 0.5995    |
| 0.1684        | 20.0  | 1700 | 0.0328          | 0.6066 | 0.5744 | 0.6046 | 0.6050    |
| 0.0691        | 21.0  | 1785 | 0.0295          | 0.6057 | 0.5726 | 0.6020 | 0.6027    |
| 0.0326        | 22.0  | 1870 | 0.0243          | 0.6167 | 0.5872 | 0.6138 | 0.6146    |
| 0.1872        | 23.0  | 1955 | 0.0195          | 0.6188 | 0.5899 | 0.6149 | 0.6160    |
| 0.1372        | 24.0  | 2040 | 0.0183          | 0.6253 | 0.5961 | 0.6227 | 0.6233    |
| 0.0621        | 25.0  | 2125 | 0.0166          | 0.6239 | 0.5957 | 0.6211 | 0.6225    |
| 0.2539        | 26.0  | 2210 | 0.0161          | 0.6217 | 0.5926 | 0.6191 | 0.6200    |
| 0.2532        | 27.0  | 2295 | 0.0166          | 0.6195 | 0.5910 | 0.6166 | 0.6173    |
| 0.1158        | 28.0  | 2380 | 0.0145          | 0.6223 | 0.5943 | 0.6196 | 0.6202    |
| 0.3496        | 29.0  | 2465 | 0.0132          | 0.6241 | 0.5957 | 0.6212 | 0.6217    |
| 0.059         | 30.0  | 2550 | 0.0126          | 0.6270 | 0.6003 | 0.6244 | 0.6247    |
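
The Step column also gives a rough bound on the size of the training set, which the card does not otherwise document. The sketch below works through that arithmetic. It assumes a single device, no gradient accumulation, and that the last partial batch is kept; none of these are stated on the card, and the helper name `example_count_range` is hypothetical.

```python
STEPS_PER_EPOCH = 85   # Step value at epoch 1.0 in the table above
TRAIN_BATCH_SIZE = 2   # train_batch_size from the hyperparameters
NUM_EPOCHS = 30        # num_epochs from the hyperparameters


def example_count_range(steps_per_epoch, batch_size):
    """Dataset sizes n for which ceil(n / batch_size) == steps_per_epoch."""
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


low, high = example_count_range(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE)
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS

print(low, high)     # 169 170
print(total_steps)   # 2550
```

Under those assumptions the training set holds 169 or 170 examples, and the 2550 total steps match the final row of the table.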


### Framework versions

- Transformers 4.44.2
- Pytorch 2.4.0+cu121
- Datasets 3.0.0
- Tokenizers 0.19.1