---
library_name: transformers
license: apache-2.0
base_model: google/mt5-base
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: mt5-rouge-durga-q1-clean
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# mt5-rouge-durga-q1-clean

This model is a fine-tuned version of [google/mt5-base](https://huggingface.co/google/mt5-base) on an unspecified dataset (the Trainer did not record a dataset name).
It achieves the following results on the evaluation set after the final epoch (epoch 30):
- Loss: 1.7819
- Rouge1: 0.3074
- Rouge2: 0.0953
- Rougel: 0.3026
- Rougelsum: 0.3008
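
To make the Rouge1 figure above easier to read, here is a minimal sketch of a unigram-overlap F1 score. It is a simplification: it assumes lowercased whitespace tokenization with no stemming, and the function name `rouge1_f1` and the example sentences are hypothetical. The card does not say which ROUGE implementation produced the reported numbers, so this sketch will not necessarily reproduce them.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between two lowercased, whitespace-tokenized strings.

    Simplified illustration only: no stemming, no special tokenization.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0

    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example pair: 5 of 6 unigrams match in each direction.
score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
print(round(score, 4))
```

Rouge2 applies the same idea to bigrams, while RougeL and RougeLsum are based on the longest common subsequence rather than n-gram counts.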

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 20
- eval_batch_size: 20
- seed: 42
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 30
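
As a rough illustration of what `lr_scheduler_type: linear` implies for this run, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (no warmup setting is listed above) and takes the total of 90 steps from the training results table; the function name `linear_lr` is hypothetical.

```python
def linear_lr(
    step: int,
    total_steps: int = 90,    # 30 epochs x 3 steps per epoch, per the results table
    base_lr: float = 3e-4,    # learning_rate listed above
    warmup_steps: int = 0,    # assumption: no warmup is listed in the card
) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / warmup_steps
    # Linear decay from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 45, 90):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 3e-4, is halved by step 45 (epoch 15), and reaches zero at step 90.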

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 15.8442       | 1.0   | 3    | 11.1246         | 0.0148 | 0.0015 | 0.0152 | 0.0151    |
| 13.0661       | 2.0   | 6    | 9.3553          | 0.0226 | 0.0052 | 0.0219 | 0.0217    |
| 11.7048       | 3.0   | 9    | 8.0317          | 0.0198 | 0.0029 | 0.0177 | 0.0190    |
| 8.87          | 4.0   | 12   | 7.1382          | 0.0461 | 0.0105 | 0.0423 | 0.0406    |
| 11.0893       | 5.0   | 15   | 6.7905          | 0.0611 | 0.0106 | 0.0512 | 0.0503    |
| 9.8787        | 6.0   | 18   | 6.5255          | 0.0900 | 0.0224 | 0.0800 | 0.0782    |
| 9.8189        | 7.0   | 21   | 6.7007          | 0.0944 | 0.0231 | 0.0876 | 0.0861    |
| 8.2022        | 8.0   | 24   | 6.2109          | 0.0953 | 0.0227 | 0.0899 | 0.0910    |
| 8.5899        | 9.0   | 27   | 5.9520          | 0.0965 | 0.0171 | 0.0897 | 0.0914    |
| 7.5305        | 10.0  | 30   | 5.5748          | 0.0855 | 0.0157 | 0.0841 | 0.0821    |
| 7.0381        | 11.0  | 33   | 5.2219          | 0.0622 | 0.0095 | 0.0592 | 0.0585    |
| 6.675         | 12.0  | 36   | 4.8006          | 0.0529 | 0.0048 | 0.0499 | 0.0489    |
| 7.4134        | 13.0  | 39   | 4.3795          | 0.0693 | 0.0079 | 0.0635 | 0.0610    |
| 5.8722        | 14.0  | 42   | 3.9322          | 0.1060 | 0.0128 | 0.1003 | 0.1009    |
| 4.5875        | 15.0  | 45   | 3.5017          | 0.1012 | 0.0069 | 0.0968 | 0.0968    |
| 5.3675        | 16.0  | 48   | 3.1927          | 0.0944 | 0.0020 | 0.0915 | 0.0913    |
| 4.2999        | 17.0  | 51   | 2.8956          | 0.0890 | 0.0091 | 0.0831 | 0.0849    |
| 4.3349        | 18.0  | 54   | 2.7138          | 0.1164 | 0.0074 | 0.1114 | 0.1128    |
| 3.9688        | 19.0  | 57   | 2.5350          | 0.1122 | 0.0    | 0.1122 | 0.1121    |
| 4.2931        | 20.0  | 60   | 2.4138          | 0.1122 | 0.0    | 0.1122 | 0.1121    |
| 3.8427        | 21.0  | 63   | 2.3127          | 0.1122 | 0.0    | 0.1122 | 0.1121    |
| 3.2991        | 22.0  | 66   | 2.2054          | 0.1122 | 0.0    | 0.1122 | 0.1121    |
| 3.1351        | 23.0  | 69   | 2.1069          | 0.1122 | 0.0    | 0.1122 | 0.1121    |
| 3.023         | 24.0  | 72   | 2.0208          | 0.1142 | 0.0    | 0.1140 | 0.1139    |
| 3.4366        | 25.0  | 75   | 1.9500          | 0.1793 | 0.0352 | 0.1713 | 0.1711    |
| 2.7941        | 26.0  | 78   | 1.9068          | 0.3104 | 0.0909 | 0.3016 | 0.3005    |
| 2.9454        | 27.0  | 81   | 1.8419          | 0.3086 | 0.0940 | 0.3009 | 0.2984    |
| 2.6117        | 28.0  | 84   | 1.8775          | 0.3135 | 0.0955 | 0.3086 | 0.3067    |
| 2.6785        | 29.0  | 87   | 1.7772          | 0.3020 | 0.0946 | 0.2987 | 0.2968    |
| 2.7523        | 30.0  | 90   | 1.7819          | 0.3074 | 0.0953 | 0.3026 | 0.3008    |
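
The headline results reported at the top of this card match the final row (epoch 30), which is not the best row on every metric. The sketch below, using values transcribed from the last five rows of the table, shows one way to compare checkpoints by validation loss and by Rouge1. The variable names are hypothetical, and whether checkpoints for these epochs were saved is not stated in the card.

```python
# (epoch, validation_loss, rouge1), transcribed from the last five table rows.
rows = [
    (26, 1.9068, 0.3104),
    (27, 1.8419, 0.3086),
    (28, 1.8775, 0.3135),
    (29, 1.7772, 0.3020),
    (30, 1.7819, 0.3074),
]

# Lower validation loss is better; higher Rouge1 is better.
best_by_loss = min(rows, key=lambda row: row[1])
best_by_rouge1 = max(rows, key=lambda row: row[2])

print("lowest validation loss:", best_by_loss)
print("highest Rouge1:", best_by_rouge1)
```

The two criteria pick different epochs here (29 by validation loss, 28 by Rouge1), and the differences among epochs 26 to 30 are small.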


### Framework versions

- Transformers 4.46.1
- Pytorch 2.5.0+cu121
- Datasets 3.0.2
- Tokenizers 0.20.1