d26cf7cdd3361a406b5a8847290fd5e4

This model is a fine-tuned version of google-t5/t5-base on the es-nl (Spanish-Dutch) configuration of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set (a brief usage sketch follows the list):

  • Loss: 1.3742
  • Data Size: 1.0 (fraction of the training set used)
  • Epoch Runtime: 196.8758 seconds
  • Bleu: 5.5411
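
As a quick illustration, the checkpoint loads like any other T5 seq2seq model. This is a minimal sketch: the repo id is the one shown for this card, and the "translate Spanish to Dutch:" task prefix is an assumption, since the exact prefix depends on how the training script was configured.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/d26cf7cdd3361a406b5a8847290fd5e4"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 checkpoints usually expect a task prefix; the prefix used during this
# run is not documented, so this one is an assumption.
inputs = tokenizer(
    "translate Spanish to Dutch: ¿Dónde está la biblioteca?",
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```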

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
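
The dataset itself is named in the summary above. As a sketch, the es-nl pair can be loaded with the datasets library (opus_books ships a single train split, so any evaluation split used for this run was presumably carved out separately):

```python
from datasets import load_dataset

# "es-nl" is the language-pair configuration named in the model summary.
books = load_dataset("Helsinki-NLP/opus_books", "es-nl")
print(books["train"][0]["translation"])  # e.g. {"es": "...", "nl": "..."}
```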

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them to Seq2SeqTrainingArguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
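
For reference, here is a minimal sketch of how these values map onto Seq2SeqTrainingArguments. The output_dir is a placeholder and predict_with_generate is an assumption (generation is required to compute BLEU during evaluation); the per-device batch sizes of 8 across 4 GPUs give the listed totals of 32.

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="t5-base-opus-books-es-nl",  # placeholder name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # x 4 GPUs = total_train_batch_size 32
    per_device_eval_batch_size=8,    # x 4 GPUs = total_eval_batch_size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # assumption: needed for BLEU
)
```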

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime (s) | Bleu   |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-----------------:|:------:|
| No log        | 0     | 0     | 4.6979          | 0         | 14.6178           | 0.2630 |
| No log        | 1     | 806   | 3.6159          | 0.0078    | 16.9507           | 0.2817 |
| No log        | 2     | 1612  | 3.3672          | 0.0156    | 23.6813           | 0.5189 |
| No log        | 3     | 2418  | 3.1733          | 0.0312    | 24.9777           | 0.7731 |
| 0.1146        | 4     | 3224  | 2.9904          | 0.0625    | 29.4321           | 0.8201 |
| 3.1453        | 5     | 4030  | 2.8128          | 0.125     | 39.6603           | 1.0469 |
| 2.8842        | 6     | 4836  | 2.6172          | 0.25      | 66.1918           | 1.4022 |
| 2.6571        | 7     | 5642  | 2.4012          | 0.5       | 111.1433          | 1.7130 |
| 2.4164        | 8     | 6448  | 2.1537          | 1.0       | 199.0575          | 2.2937 |
| 2.2202        | 9     | 7254  | 2.0124          | 1.0       | 254.5317          | 2.6837 |
| 2.1099        | 10    | 8060  | 1.9145          | 1.0       | 188.3039          | 3.0100 |
| 2.0312        | 11    | 8866  | 1.8371          | 1.0       | 203.0337          | 3.2214 |
| 1.9358        | 12    | 9672  | 1.7809          | 1.0       | 196.8787          | 3.4941 |
| 1.8529        | 13    | 10478 | 1.7283          | 1.0       | 199.6227          | 3.6844 |
| 1.8137        | 14    | 11284 | 1.6859          | 1.0       | 198.4747          | 3.8336 |
| 1.7481        | 15    | 12090 | 1.6494          | 1.0       | 209.4039          | 4.0058 |
| 1.7189        | 16    | 12896 | 1.6176          | 1.0       | 206.3659          | 4.1225 |
| 1.6658        | 17    | 13702 | 1.5916          | 1.0       | 197.9211          | 4.2932 |
| 1.6082        | 18    | 14508 | 1.5679          | 1.0       | 184.3646          | 4.3829 |
| 1.617         | 19    | 15314 | 1.5486          | 1.0       | 192.4458          | 4.4728 |
| 1.5614        | 20    | 16120 | 1.5328          | 1.0       | 192.4261          | 4.5663 |
| 1.5586        | 21    | 16926 | 1.5142          | 1.0       | 194.4707          | 4.6566 |
| 1.5214        | 22    | 17732 | 1.4990          | 1.0       | 186.9885          | 4.7185 |
| 1.4982        | 23    | 18538 | 1.4831          | 1.0       | 192.4599          | 4.8176 |
| 1.4638        | 24    | 19344 | 1.4710          | 1.0       | 192.0883          | 4.8803 |
| 1.4256        | 25    | 20150 | 1.4590          | 1.0       | 180.2322          | 4.9197 |
| 1.4012        | 26    | 20956 | 1.4495          | 1.0       | 189.2897          | 4.9845 |
| 1.381         | 27    | 21762 | 1.4387          | 1.0       | 190.0461          | 4.9851 |
| 1.3463        | 28    | 22568 | 1.4314          | 1.0       | 188.5894          | 5.0749 |
| 1.3294        | 29    | 23374 | 1.4214          | 1.0       | 188.9541          | 5.1148 |
| 1.307         | 30    | 24180 | 1.4162          | 1.0       | 190.2317          | 5.1473 |
| 1.3011        | 31    | 24986 | 1.4068          | 1.0       | 194.5933          | 5.1761 |
| 1.2742        | 32    | 25792 | 1.4034          | 1.0       | 194.3170          | 5.1862 |
| 1.2704        | 33    | 26598 | 1.3986          | 1.0       | 185.8186          | 5.2994 |
| 1.234         | 34    | 27404 | 1.3983          | 1.0       | 184.7349          | 5.2958 |
| 1.2286        | 35    | 28210 | 1.3884          | 1.0       | 199.1204          | 5.2684 |
| 1.2136        | 36    | 29016 | 1.3850          | 1.0       | 200.2956          | 5.3466 |
| 1.2069        | 37    | 29822 | 1.3850          | 1.0       | 201.4939          | 5.3720 |
| 1.1825        | 38    | 30628 | 1.3770          | 1.0       | 181.7980          | 5.4159 |
| 1.1792        | 39    | 31434 | 1.3748          | 1.0       | 190.0694          | 5.3824 |
| 1.143         | 40    | 32240 | 1.3784          | 1.0       | 191.3387          | 5.4233 |
| 1.1568        | 41    | 33046 | 1.3711          | 1.0       | 195.9231          | 5.3870 |
| 1.1281        | 42    | 33852 | 1.3752          | 1.0       | 190.9018          | 5.4556 |
| 1.1068        | 43    | 34658 | 1.3685          | 1.0       | 192.1316          | 5.3978 |
| 1.0783        | 44    | 35464 | 1.3740          | 1.0       | 195.1085          | 5.5002 |
| 1.0909        | 45    | 36270 | 1.3764          | 1.0       | 190.8353          | 5.4630 |
| 1.0735        | 46    | 37076 | 1.3695          | 1.0       | 199.1337          | 5.4816 |
| 1.0387        | 47    | 37882 | 1.3649          | 1.0       | 196.5281          | 5.4680 |
| 1.0487        | 48    | 38688 | 1.3667          | 1.0       | 190.7721          | 5.5345 |
| 1.0383        | 49    | 39494 | 1.3725          | 1.0       | 193.5127          | 5.5066 |
| 1.0199        | 50    | 40300 | 1.3742          | 1.0       | 196.8758          | 5.5411 |

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1