f5d3ae17b1930dbf87f1b7283f9ed30c

This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

  • Loss: 1.2291
  • Data Size: 1.0
  • Epoch Runtime: 15.0556
  • Bleu: 14.7465

Model description

More information needed

Intended uses & limitations

More information needed
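
Although the card does not document intended uses, the base model and dataset suggest book-domain translation. The sketch below shows one plausible way to run inference with the Transformers `pipeline` API; the language direction in the prompt is an assumption, since opus_books contains many language pairs and the card does not state which one was used for fine-tuning.

```python
# Illustrative only: the "translate English to French" prefix is an
# assumption -- the card does not say which opus_books language pair
# this checkpoint was trained on. Downloads the ~0.3B-parameter model.
from transformers import pipeline

translator = pipeline(
    "translation",
    model="contemmcm/f5d3ae17b1930dbf87f1b7283f9ed30c",
)

result = translator("translate English to French: The book is on the table.")
print(result[0]["translation_text"])
```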

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
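
The hyperparameters above can be expressed as a `Seq2SeqTrainingArguments` configuration. This is a sketch, not the author's actual training script; the `output_dir` is a placeholder, and the multi-GPU total batch size of 32 comes from running the per-device size of 8 on 4 GPUs (e.g. via `torchrun --nproc_per_node=4`, an assumed launch setup).

```python
# Config fragment mirroring the reported hyperparameters (sketch).
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="t5-base-opus-books",   # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=8,     # x4 GPUs = total 32
    per_device_eval_batch_size=8,      # x4 GPUs = total 32
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```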

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:----:|
| No log | 0 | 0 | 2.9686 | 0 | 1.7897 | 1.7611 |
| No log | 1 | 35 | 2.9335 | 0.0078 | 2.3796 | 1.8142 |
| No log | 2 | 70 | 2.8721 | 0.0156 | 2.0602 | 1.9906 |
| No log | 3 | 105 | 2.7471 | 0.0312 | 2.8569 | 2.2344 |
| No log | 4 | 140 | 2.6188 | 0.0625 | 3.2800 | 2.4158 |
| No log | 5 | 175 | 2.4610 | 0.125 | 4.1849 | 2.7923 |
| No log | 6 | 210 | 2.2644 | 0.25 | 5.1989 | 3.8902 |
| No log | 7 | 245 | 2.0473 | 0.5 | 8.8507 | 4.9557 |
| 0.4917 | 8.0 | 280 | 1.8653 | 1.0 | 14.1202 | 6.9420 |
| 2.2497 | 9.0 | 315 | 1.7499 | 1.0 | 13.9608 | 7.6192 |
| 2.0342 | 10.0 | 350 | 1.6686 | 1.0 | 12.5722 | 8.9947 |
| 2.0342 | 11.0 | 385 | 1.5980 | 1.0 | 13.6303 | 9.2455 |
| 1.8555 | 12.0 | 420 | 1.5371 | 1.0 | 13.5858 | 10.1462 |
| 1.7107 | 13.0 | 455 | 1.4986 | 1.0 | 10.9217 | 10.5372 |
| 1.7107 | 14.0 | 490 | 1.4559 | 1.0 | 10.9373 | 9.8687 |
| 1.5995 | 15.0 | 525 | 1.4161 | 1.0 | 10.3431 | 10.7604 |
| 1.53 | 16.0 | 560 | 1.3893 | 1.0 | 10.7269 | 11.3155 |
| 1.53 | 17.0 | 595 | 1.3668 | 1.0 | 12.1763 | 11.5311 |
| 1.4222 | 18.0 | 630 | 1.3408 | 1.0 | 11.1285 | 11.4685 |
| 1.3365 | 19.0 | 665 | 1.3257 | 1.0 | 13.1371 | 12.1346 |
| 1.283 | 20.0 | 700 | 1.3018 | 1.0 | 13.3854 | 12.1569 |
| 1.283 | 21.0 | 735 | 1.2920 | 1.0 | 14.0139 | 12.5417 |
| 1.2155 | 22.0 | 770 | 1.2809 | 1.0 | 13.5094 | 12.6887 |
| 1.1635 | 23.0 | 805 | 1.2769 | 1.0 | 11.9000 | 13.0757 |
| 1.1635 | 24.0 | 840 | 1.2598 | 1.0 | 14.3637 | 13.3410 |
| 1.0967 | 25.0 | 875 | 1.2532 | 1.0 | 14.3873 | 13.5607 |
| 1.0632 | 26.0 | 910 | 1.2479 | 1.0 | 14.0800 | 13.8649 |
| 1.0632 | 27.0 | 945 | 1.2404 | 1.0 | 14.7500 | 13.7716 |
| 1.0084 | 28.0 | 980 | 1.2475 | 1.0 | 13.1077 | 13.9105 |
| 0.9661 | 29.0 | 1015 | 1.2323 | 1.0 | 15.0026 | 13.9146 |
| 0.9381 | 30.0 | 1050 | 1.2244 | 1.0 | 14.5107 | 14.0155 |
| 0.9381 | 31.0 | 1085 | 1.2323 | 1.0 | 14.6993 | 14.1804 |
| 0.8814 | 32.0 | 1120 | 1.2260 | 1.0 | 13.6416 | 14.1333 |
| 0.8562 | 33.0 | 1155 | 1.2231 | 1.0 | 13.8906 | 14.8269 |
| 0.8562 | 34.0 | 1190 | 1.2253 | 1.0 | 15.0576 | 14.2483 |
| 0.8088 | 35.0 | 1225 | 1.2243 | 1.0 | 13.6468 | 14.5773 |
| 0.7953 | 36.0 | 1260 | 1.2268 | 1.0 | 13.1516 | 14.5130 |
| 0.7953 | 37.0 | 1295 | 1.2291 | 1.0 | 15.0556 | 14.7465 |

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
Model size: 0.3B parameters (Safetensors, F32 tensors)

Model tree for contemmcm/f5d3ae17b1930dbf87f1b7283f9ed30c: finetuned from google-t5/t5-base.