0e8ce411bdb8af1151d05312cfd14286

This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books [fi-pl] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.2659
  • Data Size: 1.0 (full training set)
  • Epoch Runtime: 22.6192 s
  • Bleu: 0.5000
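
Since the card is otherwise sparse, a minimal inference sketch may help. It assumes the checkpoint is public under contemmcm/0e8ce411bdb8af1151d05312cfd14286 (the repo id this card belongs to); the "translate Finnish to Polish:" task prefix is an assumption, since the preprocessing is not documented here.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/0e8ce411bdb8af1151d05312cfd14286"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 checkpoints are normally prompted with a task prefix; the exact prefix
# depends on how the fine-tuning script preprocessed the data (assumption).
text = "translate Finnish to Polish: Hyvää huomenta."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```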

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
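
The preprocessing is undocumented, but the source corpus is named above; a minimal loading sketch (the 90/10 train/eval split and seed are assumptions, since opus_books ships only a train split):

```python
from datasets import load_dataset

# Load the Finnish-Polish portion of OPUS Books.
dataset = load_dataset("Helsinki-NLP/opus_books", "fi-pl")

# opus_books has a single "train" split; an evaluation set must be carved
# out manually (split fraction and seed here are illustrative assumptions).
splits = dataset["train"].train_test_split(test_size=0.1, seed=42)
print(splits["train"][0])  # {'id': ..., 'translation': {'fi': ..., 'pl': ...}}
```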

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a training-arguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
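
A sketch of how these settings map onto transformers' Seq2SeqTrainingArguments; the batch sizes and optimizer settings are read off the list above, while the output directory name and the predict_with_generate flag are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

# Launch under `torchrun --nproc_per_node=4` to match the 4-GPU,
# total-batch-size-32 setup reported above.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-base-opus-books-fi-pl",  # hypothetical name
    learning_rate=5e-5,
    per_device_train_batch_size=8,  # x4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,   # x4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,  # needed to compute BLEU at eval time (assumption)
)
```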

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | Bleu |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-----------------:|:----:|
| No log | 0 | 0 | 3.7828 | 0 | 2.0980 | 0.0648 |
| No log | 1 | 70 | 3.6667 | 0.0078 | 2.3552 | 0.0625 |
| No log | 2 | 140 | 3.4363 | 0.0156 | 2.9979 | 0.0473 |
| No log | 3 | 210 | 3.3102 | 0.0312 | 3.3752 | 0.0606 |
| No log | 4 | 280 | 3.1056 | 0.0625 | 3.9195 | 0.1106 |
| No log | 5 | 350 | 2.9603 | 0.125 | 5.1352 | 0.1619 |
| No log | 6 | 420 | 2.8461 | 0.25 | 7.4230 | 0.2130 |
| 0.4877 | 7 | 490 | 2.7202 | 0.5 | 11.7368 | 0.1086 |
| 2.8953 | 8 | 560 | 2.6146 | 1.0 | 19.1842 | 0.1549 |
| 2.8149 | 9 | 630 | 2.5536 | 1.0 | 18.4865 | 0.1561 |
| 2.7168 | 10 | 700 | 2.5105 | 1.0 | 18.1600 | 0.2418 |
| 2.6733 | 11 | 770 | 2.4728 | 1.0 | 18.9439 | 0.2245 |
| 2.6435 | 12 | 840 | 2.4363 | 1.0 | 17.7156 | 0.2762 |
| 2.5711 | 13 | 910 | 2.4144 | 1.0 | 18.0877 | 0.2782 |
| 2.5489 | 14 | 980 | 2.3951 | 1.0 | 18.9879 | 0.2732 |
| 2.5005 | 15 | 1050 | 2.3725 | 1.0 | 19.8824 | 0.3534 |
| 2.4655 | 16 | 1120 | 2.3559 | 1.0 | 19.1626 | 0.3408 |
| 2.4327 | 17 | 1190 | 2.3393 | 1.0 | 23.7882 | 0.3929 |
| 2.4006 | 18 | 1260 | 2.3283 | 1.0 | 18.2179 | 0.3648 |
| 2.3874 | 19 | 1330 | 2.3206 | 1.0 | 18.5401 | 0.3680 |
| 2.3523 | 20 | 1400 | 2.3182 | 1.0 | 19.6670 | 0.3782 |
| 2.3153 | 21 | 1470 | 2.3023 | 1.0 | 19.0961 | 0.4020 |
| 2.3201 | 22 | 1540 | 2.2951 | 1.0 | 23.1245 | 0.4138 |
| 2.2745 | 23 | 1610 | 2.2863 | 1.0 | 21.4893 | 0.3832 |
| 2.2601 | 24 | 1680 | 2.2843 | 1.0 | 20.8304 | 0.3954 |
| 2.2317 | 25 | 1750 | 2.2787 | 1.0 | 20.6040 | 0.4217 |
| 2.205 | 26 | 1820 | 2.2704 | 1.0 | 18.7773 | 0.4089 |
| 2.2114 | 27 | 1890 | 2.2668 | 1.0 | 17.9859 | 0.3746 |
| 2.1741 | 28 | 1960 | 2.2660 | 1.0 | 18.2620 | 0.3602 |
| 2.1549 | 29 | 2030 | 2.2597 | 1.0 | 21.2368 | 0.4359 |
| 2.1288 | 30 | 2100 | 2.2599 | 1.0 | 18.8151 | 0.4055 |
| 2.1125 | 31 | 2170 | 2.2589 | 1.0 | 19.2003 | 0.4461 |
| 2.1311 | 32 | 2240 | 2.2618 | 1.0 | 20.1962 | 0.4371 |
| 2.0864 | 33 | 2310 | 2.2626 | 1.0 | 19.6470 | 0.4133 |
| 2.0582 | 34 | 2380 | 2.2598 | 1.0 | 19.9388 | 0.4607 |
| 2.0529 | 35 | 2450 | 2.2564 | 1.0 | 19.9569 | 0.4630 |
| 2.0324 | 36 | 2520 | 2.2644 | 1.0 | 19.5040 | 0.4655 |
| 2.0196 | 37 | 2590 | 2.2605 | 1.0 | 19.8486 | 0.4757 |
| 2.0079 | 38 | 2660 | 2.2615 | 1.0 | 18.6603 | 0.4800 |
| 1.9872 | 39 | 2730 | 2.2537 | 1.0 | 18.4687 | 0.4825 |
| 1.9719 | 40 | 2800 | 2.2624 | 1.0 | 25.1243 | 0.5026 |
| 1.9544 | 41 | 2870 | 2.2596 | 1.0 | 21.2633 | 0.4623 |
| 1.956 | 42 | 2940 | 2.2690 | 1.0 | 20.1632 | 0.4609 |
| 1.9242 | 43 | 3010 | 2.2659 | 1.0 | 22.6192 | 0.5000 |
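
The Bleu values appear to be on a 0-1 scale. Below is a minimal sketch of scoring predictions with the evaluate library's bleu metric, which uses that scale; whether the training run computed BLEU this exact way is an assumption:

```python
import evaluate

# `evaluate`'s bleu metric reports scores in [0, 1], consistent with the
# table above (e.g. 0.5000 at the final epoch).
bleu = evaluate.load("bleu")
predictions = ["Dzień dobry."]   # model outputs (illustrative)
references = [["Dzień dobry."]]  # gold Polish references (illustrative)
print(bleu.compute(predictions=predictions, references=references)["bleu"])
```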

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1