b420da0d4015d59dc273e790e147248c

This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books [es-fr] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.2265
  • Data Size: 1.0
  • Epoch Runtime: 342.0698
  • Bleu: 12.4692

Model description

More information needed

Intended uses & limitations

More information needed
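The card itself gives no usage details, but since the model is a T5 checkpoint fine-tuned on the es-fr split of opus_books, a minimal inference sketch would use the standard Transformers translation pipeline. Everything below is an assumption based on that standard setup (not documented by this card): the checkpoint ID comes from this repo, and whether a T5 task prefix is required depends on how training was configured.

```python
MODEL_ID = "contemmcm/b420da0d4015d59dc273e790e147248c"  # this repo

if __name__ == "__main__":
    # Imported inside the guard so the sketch can be read without transformers installed.
    from transformers import pipeline

    # Generic translation task string; a specific prefix may be needed
    # depending on the training configuration (assumption).
    translator = pipeline("translation_es_to_fr", model=MODEL_ID)
    print(translator("Hola, ¿cómo estás?")[0]["translation_text"])
```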

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
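The total batch sizes listed above follow directly from the per-device settings and the number of GPUs:

```python
# Effective (total) batch sizes from the per-device hyperparameters above.
train_batch_size = 8   # per device
eval_batch_size = 8    # per device
num_devices = 4        # multi-GPU

total_train_batch_size = train_batch_size * num_devices
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)  # 32 32
```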

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | Bleu    |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:-------:|
| No log        | 0     | 0     | 2.5940          | 0         | 25.4424       | 5.6836  |
| No log        | 1     | 1407  | 2.3556          | 0.0078    | 27.9662       | 7.2555  |
| No log        | 2     | 2814  | 2.1227          | 0.0156    | 31.0816       | 6.9116  |
| 0.0588        | 3     | 4221  | 1.9865          | 0.0312    | 37.5414       | 6.4007  |
| 2.1763        | 4     | 5628  | 1.8951          | 0.0625    | 45.2317       | 5.8319  |
| 2.0593        | 5     | 7035  | 1.7950          | 0.125     | 64.8266       | 6.6318  |
| 1.8833        | 6     | 8442  | 1.6827          | 0.25      | 103.5913      | 7.5779  |
| 1.793         | 7     | 9849  | 1.5686          | 0.5       | 185.0851      | 8.4030  |
| 1.6179        | 8     | 11256 | 1.4512          | 1.0       | 341.8349      | 9.5631  |
| 1.5134        | 9     | 12663 | 1.3847          | 1.0       | 371.6036      | 10.3144 |
| 1.4902        | 10    | 14070 | 1.3443          | 1.0       | 347.1290      | 10.8170 |
| 1.3874        | 11    | 15477 | 1.3189          | 1.0       | 387.9105      | 11.0464 |
| 1.3334        | 12    | 16884 | 1.2923          | 1.0       | 342.7037      | 11.4063 |
| 1.3085        | 13    | 18291 | 1.2760          | 1.0       | 352.7639      | 11.5290 |
| 1.2906        | 14    | 19698 | 1.2660          | 1.0       | 369.5020      | 11.6876 |
| 1.2321        | 15    | 21105 | 1.2499          | 1.0       | 341.5451      | 11.7774 |
| 1.217         | 16    | 22512 | 1.2437          | 1.0       | 348.1670      | 12.0046 |
| 1.1782        | 17    | 23919 | 1.2357          | 1.0       | 354.7813      | 11.9847 |
| 1.1637        | 18    | 25326 | 1.2354          | 1.0       | 324.6648      | 11.9945 |
| 1.1299        | 19    | 26733 | 1.2271          | 1.0       | 349.8055      | 12.0771 |
| 1.1066        | 20    | 28140 | 1.2211          | 1.0       | 341.9852      | 12.2001 |
| 1.0848        | 21    | 29547 | 1.2183          | 1.0       | 369.7122      | 12.1532 |
| 1.0629        | 22    | 30954 | 1.2167          | 1.0       | 352.1940      | 12.3542 |
| 1.0584        | 23    | 32361 | 1.2174          | 1.0       | 334.0109      | 12.4420 |
| 1.0196        | 24    | 33768 | 1.2174          | 1.0       | 362.3988      | 12.3138 |
| 1.0138        | 25    | 35175 | 1.2188          | 1.0       | 415.4663      | 12.2772 |
| 0.9972        | 26    | 36582 | 1.2161          | 1.0       | 342.5789      | 12.3754 |
| 0.9592        | 27    | 37989 | 1.2182          | 1.0       | 361.4101      | 12.3343 |
| 0.9458        | 28    | 39396 | 1.2189          | 1.0       | 338.6802      | 12.5021 |
| 0.9623        | 29    | 40803 | 1.2195          | 1.0       | 400.4501      | 12.4644 |
| 0.9322        | 30    | 42210 | 1.2265          | 1.0       | 342.0698      | 12.4692 |

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1