---
library_name: transformers
license: cc-by-nc-4.0
base_model: facebook/mms-1b-all
tags:
- generated_from_trainer
metrics:
- wer
- bleu
- rouge
model-index:
- name: baseline_sim
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# baseline_sim

This model is a fine-tuned version of [facebook/mms-1b-all](https://huggingface.co/facebook/mms-1b-all) on an unspecified dataset.
After the final training step (31500), it achieves the following results on the evaluation set:
- Loss: 0.2816
- WER: 0.4111
- BLEU: 0.4443
- ROUGE:
  - rouge1: 0.5431378694774602
  - rouge2: 0.46319971488463374
  - rougeL: 0.543319325066259
  - rougeLsum: 0.5433415661075893
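
The word error rate above is reported as a fraction, so 0.4111 corresponds to roughly 41 word-level errors per 100 reference words. The card does not say which implementation or text normalization was used for evaluation; the following is a minimal sketch of the standard definition (substitutions, deletions, and insertions divided by the number of reference words), with `word_error_rate` as a hypothetical helper name.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because the denominator is the reference length, WER can exceed 1.0 when the hypothesis contains many insertions.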

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- optimizer: AdamW (`OptimizerNames.ADAMW_TORCH`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 100
- mixed_precision_training: Native AMP
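
The relationship between these values can be sketched in plain Python. The snippet below reproduces the effective batch size and the shape of a linear schedule with warmup (ramp from 0 to the peak rate, then linear decay to 0). It assumes a single training device, which is consistent with 4 × 8 = 32, and takes the total of 31500 optimizer steps from the last row of the training results table; the function names are illustrative and not part of the original training script.

```python
LEARNING_RATE = 1e-3     # learning_rate
WARMUP_STEPS = 100       # lr_scheduler_warmup_steps
TOTAL_STEPS = 31_500     # final step in the training results table
TRAIN_BATCH_SIZE = 4     # train_batch_size (per device)
GRAD_ACCUM_STEPS = 8     # gradient_accumulation_steps
NUM_DEVICES = 1          # assumption: single-device training


def effective_batch_size() -> int:
    """Number of examples contributing to each optimizer step."""
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        # Warmup: scale linearly from 0 up to the peak learning rate.
        return LEARNING_RATE * step / WARMUP_STEPS
    # Decay: scale linearly from the peak down to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for step in (0, 50, 100, 15_800, 31_500):
        print(step, linear_lr(step))
```

With a peak rate of 0.001 and only 100 warmup steps, the rate reaches its maximum well within the first epoch (316 steps) and then decays for the rest of training.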

### Training results

| Training Loss | Epoch   | Step  | Validation Loss | WER    | BLEU   | ROUGE |
|:-------------:|:-------:|:-----:|:---------------:|:------:|:------:|:-----------------------------------------------------------------------------------------------------------------------------:|
| 2.1176        | 1.0     | 316   | 0.4477          | 0.5727 | 0.2807 | {'rouge1': 0.4701052724367798, 'rouge2': 0.3611529347993484, 'rougeL': 0.4691794826862792, 'rougeLsum': 0.4696052978313858}   |
| 0.602         | 2.0     | 632   | 0.3814          | 0.5405 | 0.3115 | {'rouge1': 0.4776896230695047, 'rouge2': 0.3698515651670966, 'rougeL': 0.47769636639646806, 'rougeLsum': 0.47713757408112845} |
| 0.5628        | 3.0     | 948   | 0.3781          | 0.5454 | 0.3022 | {'rouge1': 0.48165162767814085, 'rouge2': 0.377016908076784, 'rougeL': 0.48139639759906566, 'rougeLsum': 0.4809234187863734}  |
| 0.5369        | 4.0     | 1264  | 0.3828          | 0.5438 | 0.3298 | {'rouge1': 0.4799128679538609, 'rouge2': 0.37693506986258707, 'rougeL': 0.47900138317043967, 'rougeLsum': 0.4793741993097963} |
| 0.5164        | 5.0     | 1580  | 0.3567          | 0.5279 | 0.3232 | {'rouge1': 0.4841810746249126, 'rouge2': 0.3799552710722608, 'rougeL': 0.48354014412182156, 'rougeLsum': 0.4837832822948419}  |
| 0.5051        | 6.0     | 1896  | 0.3456          | 0.5034 | 0.3388 | {'rouge1': 0.4924306786696943, 'rouge2': 0.3935040295524592, 'rougeL': 0.4922117663849014, 'rougeLsum': 0.49201510795127246}  |
| 0.4927        | 7.0     | 2212  | 0.3493          | 0.5045 | 0.3382 | {'rouge1': 0.481375776908068, 'rouge2': 0.3787311988901129, 'rougeL': 0.4814765220458964, 'rougeLsum': 0.4809969574804486}    |
| 0.4832        | 8.0     | 2528  | 0.3303          | 0.4996 | 0.3445 | {'rouge1': 0.4946803662569613, 'rouge2': 0.3943683867902757, 'rougeL': 0.4946726262643269, 'rougeLsum': 0.4942021611683006}   |
| 0.473         | 9.0     | 2844  | 0.3151          | 0.4942 | 0.3514 | {'rouge1': 0.49671763976588457, 'rouge2': 0.3996629185182432, 'rougeL': 0.4965625704625888, 'rougeLsum': 0.4966950774550053}  |
| 0.4657        | 10.0    | 3160  | 0.3257          | 0.5028 | 0.3391 | {'rouge1': 0.49622226435779726, 'rouge2': 0.39534933906387604, 'rougeL': 0.495871123456577, 'rougeLsum': 0.49539946997954604} |
| 0.4582        | 11.0    | 3476  | 0.3315          | 0.5046 | 0.3428 | {'rouge1': 0.504731063678411, 'rouge2': 0.405773684536023, 'rougeL': 0.5047963279635649, 'rougeLsum': 0.5046296588492237}     |
| 0.4487        | 12.0    | 3792  | 0.3150          | 0.5126 | 0.3508 | {'rouge1': 0.5036259628111639, 'rouge2': 0.405868420995181, 'rougeL': 0.5037565267864342, 'rougeLsum': 0.5037138735732095}    |
| 0.4396        | 13.0    | 4108  | 0.3273          | 0.5028 | 0.3387 | {'rouge1': 0.4964672848959724, 'rouge2': 0.3963505715751717, 'rougeL': 0.4963496083100021, 'rougeLsum': 0.4965166016782854}   |
| 0.4332        | 14.0    | 4424  | 0.3081          | 0.4974 | 0.3664 | {'rouge1': 0.5044876388028294, 'rouge2': 0.4094930750944624, 'rougeL': 0.5039906295330501, 'rougeLsum': 0.5040206457239715}   |
| 0.434         | 15.0    | 4740  | 0.3221          | 0.5141 | 0.3525 | {'rouge1': 0.5125316853503548, 'rouge2': 0.41730706374512155, 'rougeL': 0.5124824721377286, 'rougeLsum': 0.5121343836452501}  |
| 0.4216        | 16.0    | 5056  | 0.3077          | 0.4797 | 0.3680 | {'rouge1': 0.5026941615068741, 'rouge2': 0.4068122664788285, 'rougeL': 0.5023756815131657, 'rougeLsum': 0.502476875893668}    |
| 0.4197        | 17.0    | 5372  | 0.3211          | 0.5029 | 0.3512 | {'rouge1': 0.5042143560595778, 'rouge2': 0.4075605298525744, 'rougeL': 0.504339551075053, 'rougeLsum': 0.5036754244293081}    |
| 0.4151        | 18.0    | 5688  | 0.3083          | 0.4852 | 0.3652 | {'rouge1': 0.5052730002189119, 'rouge2': 0.4085356600159926, 'rougeL': 0.504541614028222, 'rougeLsum': 0.5050471863733861}    |
| 0.4102        | 19.0    | 6004  | 0.3056          | 0.4853 | 0.3608 | {'rouge1': 0.5103746863007577, 'rouge2': 0.41713234947961464, 'rougeL': 0.5102496875412461, 'rougeLsum': 0.5105914668816096}  |
| 0.4065        | 20.0    | 6320  | 0.3060          | 0.4839 | 0.3660 | {'rouge1': 0.5066886248379499, 'rouge2': 0.41147147644412757, 'rougeL': 0.5062731205155445, 'rougeLsum': 0.5070894969158737}  |
| 0.3967        | 21.0    | 6636  | 0.2942          | 0.4668 | 0.3789 | {'rouge1': 0.5164350415868035, 'rouge2': 0.4220216839870954, 'rougeL': 0.5157510807347886, 'rougeLsum': 0.5159435077481445}   |
| 0.3894        | 22.0    | 6952  | 0.3059          | 0.4761 | 0.3680 | {'rouge1': 0.5202884008337207, 'rouge2': 0.426535380001694, 'rougeL': 0.5206388797087906, 'rougeLsum': 0.5201838917014147}    |
| 0.3894        | 23.0    | 7268  | 0.3161          | 0.4725 | 0.3731 | {'rouge1': 0.5170556350779272, 'rouge2': 0.4231271450499444, 'rougeL': 0.5168371609085032, 'rougeLsum': 0.5169744637718943}   |
| 0.3805        | 24.0    | 7584  | 0.3009          | 0.4732 | 0.3750 | {'rouge1': 0.5178798159838369, 'rouge2': 0.42420807830308693, 'rougeL': 0.5180429072561925, 'rougeLsum': 0.5171848326036026}  |
| 0.3771        | 25.0    | 7900  | 0.2973          | 0.4628 | 0.3875 | {'rouge1': 0.509987472688173, 'rouge2': 0.41428786280897695, 'rougeL': 0.509655390996804, 'rougeLsum': 0.5092013564139832}    |
| 0.3725        | 26.0    | 8216  | 0.2909          | 0.4683 | 0.3812 | {'rouge1': 0.5195869162050247, 'rouge2': 0.42739115664150606, 'rougeL': 0.5193552654170152, 'rougeLsum': 0.5193362883002752}  |
| 0.3661        | 27.0    | 8532  | 0.2919          | 0.4746 | 0.3797 | {'rouge1': 0.5202540587146447, 'rouge2': 0.42611553477197534, 'rougeL': 0.51976188567395, 'rougeLsum': 0.5195430592632275}    |
| 0.3651        | 28.0    | 8848  | 0.2964          | 0.4673 | 0.3816 | {'rouge1': 0.5159625646618995, 'rouge2': 0.421718677430831, 'rougeL': 0.5161539617871874, 'rougeLsum': 0.5149555250925026}    |
| 0.361         | 29.0    | 9164  | 0.3011          | 0.4685 | 0.3871 | {'rouge1': 0.5172066961977984, 'rouge2': 0.42508057717961883, 'rougeL': 0.5176461585751073, 'rougeLsum': 0.5166992656813065}  |
| 0.3537        | 30.0    | 9480  | 0.3022          | 0.4702 | 0.3809 | {'rouge1': 0.5142183010804419, 'rouge2': 0.4212281072788994, 'rougeL': 0.513672623868419, 'rougeLsum': 0.5134193270515841}    |
| 0.35          | 31.0    | 9796  | 0.2891          | 0.4574 | 0.3929 | {'rouge1': 0.5229580414263679, 'rouge2': 0.4346077325386656, 'rougeL': 0.5231434277189111, 'rougeLsum': 0.5231289950159839}   |
| 0.3497        | 32.0    | 10112 | 0.3190          | 0.4756 | 0.3779 | {'rouge1': 0.5240420964547278, 'rouge2': 0.43225173940743766, 'rougeL': 0.5240484200825326, 'rougeLsum': 0.5234499322159889}  |
| 0.3467        | 33.0    | 10428 | 0.3029          | 0.4714 | 0.3846 | {'rouge1': 0.5247309726491365, 'rouge2': 0.43484716342491325, 'rougeL': 0.523668806551089, 'rougeLsum': 0.5236143671656388}   |
| 0.3435        | 34.0    | 10744 | 0.3015          | 0.4644 | 0.3933 | {'rouge1': 0.5220928802347181, 'rouge2': 0.4323880918345855, 'rougeL': 0.5218215283702583, 'rougeLsum': 0.5215037571207151}   |
| 0.3383        | 35.0    | 11060 | 0.2958          | 0.4671 | 0.3884 | {'rouge1': 0.5201593846029072, 'rouge2': 0.4303305500919477, 'rougeL': 0.5202170582834349, 'rougeLsum': 0.5204392868791524}   |
| 0.3378        | 36.0    | 11376 | 0.2865          | 0.4505 | 0.3995 | {'rouge1': 0.5254980092622998, 'rouge2': 0.43868249419528177, 'rougeL': 0.5259994525174937, 'rougeLsum': 0.5255468410663021}  |
| 0.3339        | 37.0    | 11692 | 0.3019          | 0.4649 | 0.3864 | {'rouge1': 0.5289082934497475, 'rouge2': 0.4403604394064534, 'rougeL': 0.529104255127312, 'rougeLsum': 0.5289829740684285}    |
| 0.33          | 38.0    | 12008 | 0.2887          | 0.4553 | 0.3959 | {'rouge1': 0.5241924754733059, 'rouge2': 0.4337710132156277, 'rougeL': 0.5245062958600637, 'rougeLsum': 0.5237140450300832}   |
| 0.3252        | 39.0    | 12324 | 0.2923          | 0.4487 | 0.3982 | {'rouge1': 0.5282486990597959, 'rouge2': 0.43712499012325473, 'rougeL': 0.5281320402300653, 'rougeLsum': 0.5282969843722276}  |
| 0.3231        | 40.0    | 12640 | 0.2895          | 0.4526 | 0.3990 | {'rouge1': 0.5180806375921845, 'rouge2': 0.4274962499823852, 'rougeL': 0.5174285655272781, 'rougeLsum': 0.5180470787161557}   |
| 0.3178        | 41.0    | 12956 | 0.2827          | 0.4456 | 0.4046 | {'rouge1': 0.5288551004286756, 'rouge2': 0.43937417432884296, 'rougeL': 0.5289680502478359, 'rougeLsum': 0.528493928899491}   |
| 0.3169        | 42.0    | 13272 | 0.2892          | 0.4478 | 0.4027 | {'rouge1': 0.5333235906541813, 'rouge2': 0.4443603162781552, 'rougeL': 0.5330034010136978, 'rougeLsum': 0.5326406825150651}   |
| 0.31          | 43.0    | 13588 | 0.2802          | 0.4389 | 0.4129 | {'rouge1': 0.5304714101198026, 'rouge2': 0.44173185307676477, 'rougeL': 0.5300954189016169, 'rougeLsum': 0.5291506939812323}  |
| 0.3073        | 44.0    | 13904 | 0.2869          | 0.4371 | 0.4120 | {'rouge1': 0.5308041781407598, 'rouge2': 0.44432884401915645, 'rougeL': 0.5314521664266045, 'rougeLsum': 0.5307981451360451}  |
| 0.3071        | 45.0    | 14220 | 0.2814          | 0.4369 | 0.4120 | {'rouge1': 0.5342890753829816, 'rouge2': 0.44614556754179713, 'rougeL': 0.5335327097934472, 'rougeLsum': 0.5336625526854244}  |
| 0.3051        | 46.0    | 14536 | 0.3069          | 0.4511 | 0.4040 | {'rouge1': 0.5334566908733793, 'rouge2': 0.4464639465223492, 'rougeL': 0.5332841123240422, 'rougeLsum': 0.5339599203398355}   |
| 0.3076        | 47.0    | 14852 | 0.2819          | 0.4366 | 0.4157 | {'rouge1': 0.530656860856802, 'rouge2': 0.4408999090752492, 'rougeL': 0.5304515790394504, 'rougeLsum': 0.5301366492296686}    |
| 0.3014        | 48.0    | 15168 | 0.2807          | 0.4285 | 0.4192 | {'rouge1': 0.5361600419352455, 'rouge2': 0.4503578843363222, 'rougeL': 0.5359299885307938, 'rougeLsum': 0.5357384533775356}   |
| 0.2972        | 49.0    | 15484 | 0.2824          | 0.4358 | 0.4177 | {'rouge1': 0.530603352756585, 'rouge2': 0.4422960994397888, 'rougeL': 0.5301851986487092, 'rougeLsum': 0.5301276798933865}    |
| 0.2961        | 50.0    | 15800 | 0.2763          | 0.4345 | 0.4183 | {'rouge1': 0.5346699053721147, 'rouge2': 0.4484006985044071, 'rougeL': 0.5346650096835568, 'rougeLsum': 0.5346366210123671}   |
| 0.2901        | 51.0    | 16116 | 0.2807          | 0.4288 | 0.4229 | {'rouge1': 0.53348594544053, 'rouge2': 0.4455804499881205, 'rougeL': 0.5340921809742756, 'rougeLsum': 0.5333074099689908}     |
| 0.2894        | 52.0    | 16432 | 0.2793          | 0.4283 | 0.4206 | {'rouge1': 0.5335896965768283, 'rouge2': 0.4442361617178019, 'rougeL': 0.5330441366886094, 'rougeLsum': 0.5329383884271198}   |
| 0.2888        | 53.0    | 16748 | 0.2843          | 0.4282 | 0.4220 | {'rouge1': 0.5343661644986778, 'rouge2': 0.44887566681721847, 'rougeL': 0.5339094634205507, 'rougeLsum': 0.5337050579468707}  |
| 0.2829        | 54.0    | 17064 | 0.2835          | 0.4304 | 0.4220 | {'rouge1': 0.5348240478926541, 'rouge2': 0.4467007676505266, 'rougeL': 0.5342014008290759, 'rougeLsum': 0.53434981500928}     |
| 0.2832        | 55.0    | 17380 | 0.2822          | 0.4263 | 0.4242 | {'rouge1': 0.5339275785839641, 'rouge2': 0.4446465029673905, 'rougeL': 0.5333339620013073, 'rougeLsum': 0.5333133823561067}   |
| 0.2805        | 56.0    | 17696 | 0.2815          | 0.4232 | 0.4287 | {'rouge1': 0.5344781277148793, 'rouge2': 0.4476924581277197, 'rougeL': 0.534592162739483, 'rougeLsum': 0.5344700824900488}    |
| 0.2775        | 57.0    | 18012 | 0.2910          | 0.4372 | 0.4169 | {'rouge1': 0.5322912089641271, 'rouge2': 0.44439357593730966, 'rougeL': 0.53175473951791, 'rougeLsum': 0.5319678508912925}    |
| 0.274         | 58.0    | 18328 | 0.2769          | 0.4250 | 0.4266 | {'rouge1': 0.5327890209405353, 'rouge2': 0.4435068957908703, 'rougeL': 0.5324932585023235, 'rougeLsum': 0.5323035263916269}   |
| 0.2744        | 59.0    | 18644 | 0.2888          | 0.4347 | 0.4165 | {'rouge1': 0.5369185846025778, 'rouge2': 0.45081294950642925, 'rougeL': 0.5368148447616717, 'rougeLsum': 0.5363699979834334}  |
| 0.2693        | 60.0    | 18960 | 0.2833          | 0.4206 | 0.4267 | {'rouge1': 0.5379121286350527, 'rouge2': 0.44993082399298157, 'rougeL': 0.5374364350211912, 'rougeLsum': 0.5372883986121286}  |
| 0.2651        | 61.0    | 19276 | 0.2825          | 0.4232 | 0.4270 | {'rouge1': 0.5376823504804068, 'rouge2': 0.45156174386371273, 'rougeL': 0.5376195038173543, 'rougeLsum': 0.5376032731449271}  |
| 0.2668        | 62.0    | 19592 | 0.2811          | 0.4247 | 0.4238 | {'rouge1': 0.5369572261952279, 'rouge2': 0.45145007189396835, 'rougeL': 0.5369176891616814, 'rougeLsum': 0.5369281955458396}  |
| 0.2682        | 63.0    | 19908 | 0.2876          | 0.4292 | 0.4208 | {'rouge1': 0.5360702279479654, 'rouge2': 0.44952019776430263, 'rougeL': 0.5361599378420561, 'rougeLsum': 0.5360933388300235}  |
| 0.2638        | 64.0    | 20224 | 0.2850          | 0.4234 | 0.4280 | {'rouge1': 0.5355398000726144, 'rouge2': 0.44988599153311215, 'rougeL': 0.5357454617923272, 'rougeLsum': 0.5352237022672879}  |
| 0.2579        | 65.0    | 20540 | 0.2838          | 0.4209 | 0.4291 | {'rouge1': 0.5378115732530713, 'rouge2': 0.4534258803053122, 'rougeL': 0.5380832421302035, 'rougeLsum': 0.5372210890221913}   |
| 0.2603        | 66.0    | 20856 | 0.2890          | 0.4266 | 0.4257 | {'rouge1': 0.5381010514134678, 'rouge2': 0.45331667387209773, 'rougeL': 0.5375791745511427, 'rougeLsum': 0.5374021723492373}  |
| 0.255         | 67.0    | 21172 | 0.2818          | 0.4183 | 0.4350 | {'rouge1': 0.5355017433705689, 'rouge2': 0.44989734580071916, 'rougeL': 0.535482801756279, 'rougeLsum': 0.5354934130826885}   |
| 0.2551        | 68.0    | 21488 | 0.2810          | 0.4179 | 0.4357 | {'rouge1': 0.5375696553271616, 'rouge2': 0.45360577667106594, 'rougeL': 0.5377789946365092, 'rougeLsum': 0.5369677497904943}  |
| 0.251         | 69.0    | 21804 | 0.2834          | 0.4218 | 0.4304 | {'rouge1': 0.5396386043900199, 'rouge2': 0.4572424091851801, 'rougeL': 0.5393294726698683, 'rougeLsum': 0.5391327261552158}   |
| 0.2471        | 70.0    | 22120 | 0.2845          | 0.4200 | 0.4328 | {'rouge1': 0.5354877901006788, 'rouge2': 0.45082461339724056, 'rougeL': 0.5353444001146976, 'rougeLsum': 0.5354203896768015}  |
| 0.2525        | 71.0    | 22436 | 0.2854          | 0.4185 | 0.4327 | {'rouge1': 0.5379771980227901, 'rouge2': 0.4539120377352185, 'rougeL': 0.5374557940884827, 'rougeLsum': 0.5375123638442493}   |
| 0.2477        | 72.0    | 22752 | 0.2787          | 0.4172 | 0.4375 | {'rouge1': 0.5378203059744184, 'rouge2': 0.45272792564037556, 'rougeL': 0.5380627369669113, 'rougeLsum': 0.537467174358347}   |
| 0.2436        | 73.0    | 23068 | 0.2852          | 0.4161 | 0.4360 | {'rouge1': 0.5383278930625939, 'rouge2': 0.4520666490326796, 'rougeL': 0.5386092750984035, 'rougeLsum': 0.5376854664486965}   |
| 0.2416        | 74.0    | 23384 | 0.2915          | 0.4215 | 0.4325 | {'rouge1': 0.536873865355633, 'rouge2': 0.45262116911172134, 'rougeL': 0.5370586331180425, 'rougeLsum': 0.5367586836795812}   |
| 0.2436        | 75.0    | 23700 | 0.2893          | 0.4233 | 0.4281 | {'rouge1': 0.5377257941968341, 'rouge2': 0.45517921900071145, 'rougeL': 0.5379972695360247, 'rougeLsum': 0.5384993527592379}  |
| 0.2452        | 76.0    | 24016 | 0.2817          | 0.4145 | 0.4397 | {'rouge1': 0.5418625595871899, 'rouge2': 0.46056568946973386, 'rougeL': 0.541468446578519, 'rougeLsum': 0.5414941095256001}   |
| 0.2409        | 77.0    | 24332 | 0.2808          | 0.4143 | 0.4423 | {'rouge1': 0.5392257245009713, 'rouge2': 0.45607066336076896, 'rougeL': 0.538899825555043, 'rougeLsum': 0.5389811116142382}   |
| 0.2373        | 78.0    | 24648 | 0.2841          | 0.4188 | 0.4362 | {'rouge1': 0.5359340139841655, 'rouge2': 0.45263639310304327, 'rougeL': 0.5360656594382994, 'rougeLsum': 0.5360069674357852}  |
| 0.2393        | 79.0    | 24964 | 0.2809          | 0.4161 | 0.4386 | {'rouge1': 0.5391525866209272, 'rouge2': 0.4550841817969568, 'rougeL': 0.5389089291692388, 'rougeLsum': 0.5386840103505839}   |
| 0.2337        | 80.0    | 25280 | 0.2884          | 0.4242 | 0.4298 | {'rouge1': 0.5418567120863942, 'rouge2': 0.4566567449838128, 'rougeL': 0.5411095603319156, 'rougeLsum': 0.5402954951078773}   |
| 0.2334        | 81.0    | 25596 | 0.2824          | 0.4127 | 0.4392 | {'rouge1': 0.5418509341442581, 'rouge2': 0.45784578336087745, 'rougeL': 0.5426102015735883, 'rougeLsum': 0.5417528727173104}  |
| 0.2299        | 82.0    | 25912 | 0.2852          | 0.4165 | 0.4379 | {'rouge1': 0.5416079064420798, 'rouge2': 0.45834119455660505, 'rougeL': 0.5411153340448218, 'rougeLsum': 0.5408759317756201}  |
| 0.2277        | 83.0    | 26228 | 0.2890          | 0.4229 | 0.4330 | {'rouge1': 0.5394241274368741, 'rouge2': 0.45594996931284404, 'rougeL': 0.5396120164435794, 'rougeLsum': 0.5397634735014274}  |
| 0.2313        | 84.0    | 26544 | 0.2895          | 0.4245 | 0.4283 | {'rouge1': 0.5453446541322129, 'rouge2': 0.46280284024839646, 'rougeL': 0.5453884134746472, 'rougeLsum': 0.5452681697277351}  |
| 0.2315        | 85.0    | 26860 | 0.2792          | 0.4154 | 0.4396 | {'rouge1': 0.5429093525548536, 'rouge2': 0.4611084955713472, 'rougeL': 0.5423965358300354, 'rougeLsum': 0.5425649599783315}   |
| 0.2257        | 86.0    | 27176 | 0.2938          | 0.4263 | 0.4275 | {'rouge1': 0.5426098354732698, 'rouge2': 0.45980507697840733, 'rougeL': 0.5420677423797466, 'rougeLsum': 0.5422784837208043}  |
| 0.2269        | 87.0    | 27492 | 0.2832          | 0.4141 | 0.4417 | {'rouge1': 0.5407861890081991, 'rouge2': 0.45768316589935093, 'rougeL': 0.5399885330793596, 'rougeLsum': 0.5402799940195395}  |
| 0.2235        | 88.0    | 27808 | 0.2838          | 0.4139 | 0.4415 | {'rouge1': 0.5418634614310525, 'rouge2': 0.4584064085262245, 'rougeL': 0.5417885182640119, 'rougeLsum': 0.5413254286379844}   |
| 0.2274        | 89.0    | 28124 | 0.2850          | 0.4160 | 0.4371 | {'rouge1': 0.5405937575001661, 'rouge2': 0.458051805733929, 'rougeL': 0.5405110723843589, 'rougeLsum': 0.5404947581314099}    |
| 0.2242        | 90.0    | 28440 | 0.2797          | 0.4128 | 0.4418 | {'rouge1': 0.5429184092787075, 'rouge2': 0.45977757613442993, 'rougeL': 0.5427116656838797, 'rougeLsum': 0.5427156050781381}  |
| 0.2213        | 91.0    | 28756 | 0.2821          | 0.4143 | 0.4404 | {'rouge1': 0.5426471664268708, 'rouge2': 0.4593780623003551, 'rougeL': 0.5418104012736673, 'rougeLsum': 0.5418287085307663}   |
| 0.2199        | 92.0    | 29072 | 0.2800          | 0.4117 | 0.4438 | {'rouge1': 0.5440449456366141, 'rouge2': 0.4601436725607881, 'rougeL': 0.5434666475262009, 'rougeLsum': 0.5432628993810478}   |
| 0.2164        | 93.0    | 29388 | 0.2823          | 0.4138 | 0.4405 | {'rouge1': 0.5433433476910297, 'rouge2': 0.4619011422041214, 'rougeL': 0.5433923373560177, 'rougeLsum': 0.5431214493986625}   |
| 0.2208        | 94.0    | 29704 | 0.2805          | 0.4105 | 0.4449 | {'rouge1': 0.5433842079822675, 'rouge2': 0.4625397714923867, 'rougeL': 0.5436312942590825, 'rougeLsum': 0.5433888917507208}   |
| 0.2189        | 95.0    | 30020 | 0.2808          | 0.4104 | 0.4446 | {'rouge1': 0.5433769767494554, 'rouge2': 0.4621007632660318, 'rougeL': 0.5428793761331363, 'rougeLsum': 0.54301615334437}     |
| 0.2173        | 96.0    | 30336 | 0.2809          | 0.4104 | 0.4441 | {'rouge1': 0.5424258876287353, 'rouge2': 0.4612827742689153, 'rougeL': 0.5424412108802839, 'rougeLsum': 0.5428323083881738}   |
| 0.2156        | 97.0    | 30652 | 0.2815          | 0.4106 | 0.4434 | {'rouge1': 0.5442330389106769, 'rouge2': 0.4629085413716164, 'rougeL': 0.5439738249654837, 'rougeLsum': 0.5441096927446344}   |
| 0.2158        | 98.0    | 30968 | 0.2815          | 0.4103 | 0.4453 | {'rouge1': 0.5430795464214497, 'rouge2': 0.46218161676383207, 'rougeL': 0.5431238378491201, 'rougeLsum': 0.5427355598729996}  |
| 0.2119        | 99.0    | 31284 | 0.2812          | 0.4104 | 0.4445 | {'rouge1': 0.5438763456161022, 'rouge2': 0.462454778529061, 'rougeL': 0.543345705025007, 'rougeLsum': 0.5432414638117005}     |
| 0.2095        | 99.6846 | 31500 | 0.2816          | 0.4111 | 0.4443 | {'rouge1': 0.5431378694774602, 'rouge2': 0.46319971488463374, 'rougeL': 0.543319325066259, 'rougeLsum': 0.5433415661075893}   |
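
Because the ROUGE column stores a Python dict literal in each cell, the rows are easier to compare once parsed. The sketch below copies three rows from the table (epochs 50, 98, and the final step) and parses them with the standard library; `parse_row` and the field names are hypothetical helpers, not part of the training code.

```python
import ast

# Three rows copied verbatim from the table above (whitespace trimmed).
ROWS = """\
| 0.2961 | 50.0 | 15800 | 0.2763 | 0.4345 | 0.4183 | {'rouge1': 0.5346699053721147, 'rouge2': 0.4484006985044071, 'rougeL': 0.5346650096835568, 'rougeLsum': 0.5346366210123671} |
| 0.2158 | 98.0 | 30968 | 0.2815 | 0.4103 | 0.4453 | {'rouge1': 0.5430795464214497, 'rouge2': 0.46218161676383207, 'rougeL': 0.5431238378491201, 'rougeLsum': 0.5427355598729996} |
| 0.2095 | 99.6846 | 31500 | 0.2816 | 0.4111 | 0.4443 | {'rouge1': 0.5431378694774602, 'rouge2': 0.46319971488463374, 'rougeL': 0.543319325066259, 'rougeLsum': 0.5433415661075893} |
"""


def parse_row(line: str) -> dict:
    """Turn one Markdown table row into a dict of typed values."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    return {
        "train_loss": float(cells[0]),
        "epoch": float(cells[1]),
        "step": int(cells[2]),
        "val_loss": float(cells[3]),
        "wer": float(cells[4]),
        "bleu": float(cells[5]),
        # The ROUGE cell is a dict literal; literal_eval parses it safely.
        "rouge": ast.literal_eval(cells[6]),
    }


rows = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
best_wer = min(rows, key=lambda r: r["wer"])
best_loss = min(rows, key=lambda r: r["val_loss"])

if __name__ == "__main__":
    print("lowest WER:", best_wer["epoch"], best_wer["wer"])
    print("lowest validation loss:", best_loss["epoch"], best_loss["val_loss"])
```

In the full table, these same rows hold the lowest WER (0.4103 at epoch 98) and the lowest validation loss (0.2763 at epoch 50), so the final checkpoint reported at the top of this card is close to, but not exactly, the best on either metric.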


### Framework versions

- Transformers 4.49.0
- PyTorch 2.6.0+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0