4fcd43a957da124a3f81d795864c253f

This model is a fine-tuned version of google-t5/t5-base on the it-nl (Italian–Dutch) configuration of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

  • Loss: 2.1210
  • Data Size: 1.0
  • Epoch Runtime: 16.9702
  • Bleu: 1.7228

Model description

google-t5/t5-base fine-tuned for translation on the opus_books it-nl pair; no further description is provided.

Intended uses & limitations

More information needed
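
As a starting point, the checkpoint can be loaded like any other seq2seq model on the Hub. Below is a minimal usage sketch: the repository id is taken from this page, while the task prefix and the Italian-to-Dutch direction are assumptions to verify against the training setup.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/4fcd43a957da124a3f81d795864c253f"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# The T5-style task prefix and the it -> nl direction are assumptions;
# check the training script for the exact prompt format.
text = "translate Italian to Dutch: Il gatto dorme sul divano."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```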

Training and evaluation data

The model was fine-tuned and evaluated on the it-nl (Italian–Dutch) pair of the Helsinki-NLP/opus_books dataset; details of the train/validation split behind the scores above are not documented.
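
For reference, the language pair named above can be loaded with the datasets library; a minimal sketch (opus_books ships a single train split, so any validation split used here would have been carved out separately):

```python
from datasets import load_dataset

# it-nl configuration of the OPUS Books parallel corpus
dataset = load_dataset("Helsinki-NLP/opus_books", "it-nl")

# Each example holds a {"it": ..., "nl": ...} translation pair.
example = dataset["train"][0]
print(example["translation"]["it"])
print(example["translation"]["nl"])
```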

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
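
For illustration, these settings map roughly onto Seq2SeqTrainingArguments as sketched below; output_dir, eval_strategy, and predict_with_generate are assumptions not listed above, and the per-device batch size of 8 reaches the total of 32 via the 4-GPU setup.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-base-opus-books-it-nl",  # hypothetical name
    learning_rate=5e-05,
    per_device_train_batch_size=8,   # x 4 GPUs = total batch size 32
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    eval_strategy="epoch",           # assumed from the per-epoch results below
    predict_with_generate=True,      # assumed, required to compute BLEU
)
```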

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:----:|
| No log | 0 | 0 | 7.2964 | 0 | 2.0150 | 0.0378 |
| No log | 1 | 58 | 7.0523 | 0.0078 | 3.0691 | 0.0380 |
| No log | 2 | 116 | 6.3568 | 0.0156 | 2.5926 | 0.0366 |
| No log | 3 | 174 | 4.3858 | 0.0312 | 3.1325 | 0.0350 |
| No log | 4 | 232 | 3.5530 | 0.0625 | 3.7164 | 0.0352 |
| No log | 5 | 290 | 3.3370 | 0.125 | 5.2073 | 0.1166 |
| 0.3724 | 6 | 348 | 3.1684 | 0.25 | 6.8893 | 0.1737 |
| 0.4801 | 7 | 406 | 3.0062 | 0.5 | 10.0026 | 0.2319 |
| 2.2931 | 8 | 464 | 2.8510 | 1.0 | 17.0363 | 0.2895 |
| 3.0606 | 9 | 522 | 2.7520 | 1.0 | 15.7885 | 0.3249 |
| 2.9447 | 10 | 580 | 2.6823 | 1.0 | 16.9531 | 0.4105 |
| 2.8883 | 11 | 638 | 2.6218 | 1.0 | 16.3062 | 0.5238 |
| 2.8155 | 12 | 696 | 2.5736 | 1.0 | 16.9113 | 0.5936 |
| 2.7154 | 13 | 754 | 2.5301 | 1.0 | 16.3177 | 0.7069 |
| 2.6636 | 14 | 812 | 2.4914 | 1.0 | 16.8766 | 0.7077 |
| 2.6134 | 15 | 870 | 2.4603 | 1.0 | 15.5421 | 0.7273 |
| 2.5753 | 16 | 928 | 2.4343 | 1.0 | 15.5822 | 0.7412 |
| 2.5487 | 17 | 986 | 2.4107 | 1.0 | 15.7972 | 0.7548 |
| 2.5071 | 18 | 1044 | 2.3833 | 1.0 | 16.0719 | 0.8996 |
| 2.4588 | 19 | 1102 | 2.3650 | 1.0 | 15.5916 | 0.8939 |
| 2.4187 | 20 | 1160 | 2.3417 | 1.0 | 15.4360 | 0.9182 |
| 2.3972 | 21 | 1218 | 2.3288 | 1.0 | 15.4669 | 0.9530 |
| 2.3644 | 22 | 1276 | 2.3072 | 1.0 | 15.5347 | 1.0255 |
| 2.3354 | 23 | 1334 | 2.2915 | 1.0 | 17.0580 | 1.0244 |
| 2.327 | 24 | 1392 | 2.2810 | 1.0 | 16.3253 | 1.0626 |
| 2.2743 | 25 | 1450 | 2.2695 | 1.0 | 16.6860 | 1.0378 |
| 2.2549 | 26 | 1508 | 2.2519 | 1.0 | 17.0812 | 1.1046 |
| 2.2211 | 27 | 1566 | 2.2391 | 1.0 | 16.1982 | 1.2396 |
| 2.2059 | 28 | 1624 | 2.2291 | 1.0 | 16.7995 | 1.2611 |
| 2.1911 | 29 | 1682 | 2.2251 | 1.0 | 16.3083 | 1.3159 |
| 2.169 | 30 | 1740 | 2.2053 | 1.0 | 16.8139 | 1.3979 |
| 2.1508 | 31 | 1798 | 2.1978 | 1.0 | 17.9502 | 1.4516 |
| 2.1105 | 32 | 1856 | 2.1978 | 1.0 | 20.0904 | 1.4484 |
| 2.0923 | 33 | 1914 | 2.1861 | 1.0 | 17.5856 | 1.4988 |
| 2.0768 | 34 | 1972 | 2.1782 | 1.0 | 16.6692 | 1.4526 |
| 2.0528 | 35 | 2030 | 2.1783 | 1.0 | 16.8417 | 1.4341 |
| 2.041 | 36 | 2088 | 2.1587 | 1.0 | 18.4354 | 1.5145 |
| 2.0259 | 37 | 2146 | 2.1591 | 1.0 | 17.1102 | 1.5224 |
| 1.9919 | 38 | 2204 | 2.1592 | 1.0 | 16.8403 | 1.5453 |
| 1.9706 | 39 | 2262 | 2.1536 | 1.0 | 16.6296 | 1.6359 |
| 1.9672 | 40 | 2320 | 2.1446 | 1.0 | 16.9772 | 1.6035 |
| 1.9359 | 41 | 2378 | 2.1397 | 1.0 | 16.9985 | 1.6362 |
| 1.9259 | 42 | 2436 | 2.1358 | 1.0 | 16.4695 | 1.6112 |
| 1.9155 | 43 | 2494 | 2.1339 | 1.0 | 16.9258 | 1.6644 |
| 1.8914 | 44 | 2552 | 2.1327 | 1.0 | 16.7596 | 1.6480 |
| 1.87 | 45 | 2610 | 2.1242 | 1.0 | 16.6838 | 1.6303 |
| 1.8576 | 46 | 2668 | 2.1288 | 1.0 | 16.4163 | 1.6069 |
| 1.844 | 47 | 2726 | 2.1282 | 1.0 | 16.3194 | 1.7287 |
| 1.8363 | 48 | 2784 | 2.1217 | 1.0 | 16.9935 | 1.6565 |
| 1.816 | 49 | 2842 | 2.1185 | 1.0 | 17.2204 | 1.7248 |
| 1.7916 | 50 | 2900 | 2.1210 | 1.0 | 16.9702 | 1.7228 |
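
The Bleu column is presumably sacreBLEU over generated translations (an assumption; the card does not name the metric implementation). A minimal compute_metrics sketch in that style, reusing the tokenizer from the usage example above:

```python
import numpy as np
import evaluate

sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    # Labels use -100 as ignore-index padding; swap it back before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = sacrebleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    return {"bleu": result["score"]}
```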

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1