e9793706acad19a2f1b2de93b4bd05ad

This model is a fine-tuned version of google-t5/t5-small on the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

  • Loss: 1.2333
  • Data Size: 1.0
  • Epoch Runtime: 62.4257
  • Bleu: 6.7287
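The reported Bleu value is a corpus-level BLEU score on the evaluation set. For reference, here is a minimal, self-contained sketch of corpus BLEU (uniform weights up to 4-grams, single reference, no smoothing); this is an illustration of the metric, not the exact scorer used during training, which is typically sacrebleu via the evaluation loop:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(references, hypotheses, max_n=4):
    """Simplified corpus BLEU: clipped n-gram precision (orders 1..max_n),
    geometric mean, brevity penalty. Returns a score in [0, 100]."""
    clipped = [0] * max_n   # clipped n-gram matches per order
    totals = [0] * max_n    # hypothesis n-gram counts per order
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_len += len(ref)
        hyp_len += len(hyp)
        for n in range(1, max_n + 1):
            ref_counts = ngrams(ref, n)
            hyp_counts = ngrams(hyp, n)
            clipped[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    if min(clipped) == 0:   # no smoothing: any empty order gives BLEU 0
        return 0.0
    log_prec = sum(math.log(c / t) for c, t in zip(clipped, totals)) / max_n
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * bp * math.exp(log_prec)
```

A perfect hypothesis scores 100; any mismatch lowers the clipped precisions and, with it, the score.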

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
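The total batch sizes follow directly from the per-device settings: 8 examples per device across 4 devices, with no gradient accumulation, gives 32. A sketch of that arithmetic (the helper name is hypothetical):

```python
def effective_batch_size(per_device, num_devices, grad_accum_steps=1):
    """Effective global batch size in multi-GPU training: each optimizer
    step consumes per_device examples on each of num_devices devices,
    accumulated over grad_accum_steps forward/backward passes."""
    return per_device * num_devices * grad_accum_steps

# train_batch_size=8, num_devices=4 -> total_train_batch_size=32
assert effective_batch_size(8, 4) == 32
```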

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|---------------|-------|-------|-----------------|-----------|---------------|--------|
| No log        | 0     | 0     | 3.5153          | 0         | 6.0879        | 0.1499 |
| No log        | 1     | 447   | 3.3034          | 0.0078    | 6.7053        | 0.1761 |
| 0.0534        | 2     | 894   | 2.9168          | 0.0156    | 6.3933        | 0.3345 |
| 0.0695        | 3     | 1341  | 2.6587          | 0.0312    | 7.3390        | 0.2405 |
| 0.1077        | 4     | 1788  | 2.5219          | 0.0625    | 9.7589        | 0.5060 |
| 0.1872        | 5     | 2235  | 2.3968          | 0.125     | 12.3938       | 0.4165 |
| 2.5361        | 6     | 2682  | 2.2921          | 0.25      | 20.1647       | 0.4993 |
| 2.3725        | 7     | 3129  | 2.1755          | 0.5       | 33.7213       | 0.7969 |
| 2.2278        | 8.0   | 3576  | 2.0506          | 1.0       | 65.3310       | 1.2631 |
| 2.1392        | 9.0   | 4023  | 1.9710          | 1.0       | 59.8108       | 1.8236 |
| 2.0917        | 10.0  | 4470  | 1.9135          | 1.0       | 63.5706       | 2.0827 |
| 2.0302        | 11.0  | 4917  | 1.8624          | 1.0       | 60.9909       | 2.1980 |
| 2.0009        | 12.0  | 5364  | 1.8201          | 1.0       | 58.3167       | 2.3737 |
| 1.9478        | 13.0  | 5811  | 1.7800          | 1.0       | 59.5104       | 2.6749 |
| 1.9062        | 14.0  | 6258  | 1.7443          | 1.0       | 62.0392       | 2.8530 |
| 1.8946        | 15.0  | 6705  | 1.7109          | 1.0       | 62.7633       | 2.9931 |
| 1.8473        | 16.0  | 7152  | 1.6837          | 1.0       | 67.1382       | 3.2108 |
| 1.8293        | 17.0  | 7599  | 1.6539          | 1.0       | 63.2569       | 3.2521 |
| 1.8043        | 18.0  | 8046  | 1.6283          | 1.0       | 64.8657       | 3.5135 |
| 1.7746        | 19.0  | 8493  | 1.6055          | 1.0       | 65.3130       | 3.6802 |
| 1.7476        | 20.0  | 8940  | 1.5824          | 1.0       | 63.0923       | 3.8136 |
| 1.7504        | 21.0  | 9387  | 1.5615          | 1.0       | 63.1112       | 3.9703 |
| 1.6967        | 22.0  | 9834  | 1.5403          | 1.0       | 62.8212       | 4.1350 |
| 1.675         | 23.0  | 10281 | 1.5224          | 1.0       | 61.3662       | 4.2455 |
| 1.6563        | 24.0  | 10728 | 1.5053          | 1.0       | 60.7117       | 4.4245 |
| 1.6528        | 25.0  | 11175 | 1.4848          | 1.0       | 65.6816       | 4.5583 |
| 1.6143        | 26.0  | 11622 | 1.4692          | 1.0       | 59.1256       | 4.6835 |
| 1.6241        | 27.0  | 12069 | 1.4557          | 1.0       | 61.9651       | 4.8235 |
| 1.5818        | 28.0  | 12516 | 1.4404          | 1.0       | 65.2822       | 4.9559 |
| 1.5555        | 29.0  | 12963 | 1.4248          | 1.0       | 62.0607       | 5.0545 |
| 1.5487        | 30.0  | 13410 | 1.4118          | 1.0       | 60.2353       | 5.2027 |
| 1.5477        | 31.0  | 13857 | 1.4000          | 1.0       | 60.2108       | 5.2571 |
| 1.5129        | 32.0  | 14304 | 1.3893          | 1.0       | 60.5813       | 5.3978 |
| 1.5345        | 33.0  | 14751 | 1.3763          | 1.0       | 61.8097       | 5.5671 |
| 1.4995        | 34.0  | 15198 | 1.3655          | 1.0       | 59.2048       | 5.5855 |
| 1.4859        | 35.0  | 15645 | 1.3556          | 1.0       | 60.9427       | 5.6993 |
| 1.4635        | 36.0  | 16092 | 1.3441          | 1.0       | 58.0727       | 5.7860 |
| 1.4432        | 37.0  | 16539 | 1.3368          | 1.0       | 58.5465       | 5.9151 |
| 1.4568        | 38.0  | 16986 | 1.3253          | 1.0       | 64.0433       | 5.9920 |
| 1.4298        | 39.0  | 17433 | 1.3160          | 1.0       | 59.5569       | 6.0769 |
| 1.3959        | 40.0  | 17880 | 1.3069          | 1.0       | 62.9847       | 6.1072 |
| 1.393         | 41.0  | 18327 | 1.2998          | 1.0       | 59.5837       | 6.2188 |
| 1.3913        | 42.0  | 18774 | 1.2923          | 1.0       | 60.4817       | 6.2844 |
| 1.4034        | 43.0  | 19221 | 1.2855          | 1.0       | 61.9362       | 6.3534 |
| 1.37          | 44.0  | 19668 | 1.2743          | 1.0       | 60.6214       | 6.4056 |
| 1.3536        | 45.0  | 20115 | 1.2665          | 1.0       | 63.8938       | 6.4811 |
| 1.3514        | 46.0  | 20562 | 1.2609          | 1.0       | 61.7827       | 6.5167 |
| 1.359         | 47.0  | 21009 | 1.2558          | 1.0       | 53.8058       | 6.5776 |
| 1.3249        | 48.0  | 21456 | 1.2452          | 1.0       | 58.4070       | 6.6900 |
| 1.3388        | 49.0  | 21903 | 1.2413          | 1.0       | 61.2955       | 6.7241 |
| 1.3113        | 50.0  | 22350 | 1.2333          | 1.0       | 62.4257       | 6.7287 |
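The Data Size column appears to follow a doubling schedule: the fraction of the training set starts near 1/128 at epoch 1, doubles each epoch, and reaches the full dataset at epoch 8. A sketch of that apparent schedule (this is an inference from the table, not a confirmed training detail):

```python
def data_fraction(epoch, full_at=8):
    """Fraction of the training set used at a given epoch, assuming the
    doubling schedule min(1, 2**(epoch - full_at)) suggested by the
    Data Size column, with 0 at the initial evaluation-only epoch 0."""
    if epoch == 0:
        return 0.0
    return min(1.0, 2.0 ** (epoch - full_at))

# epoch 1 -> 1/128 = 0.0078125, matching the table's 0.0078
assert data_fraction(1) == 0.0078125
```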

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
Model size: 0.1B parameters (F32 tensors, Safetensors format)