112fec9ed42e51060f900d8f894eb446

This model is a fine-tuned version of google-t5/t5-large on the Helsinki-NLP/opus_books [de-pt] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6703
  • Data Size: 1.0
  • Epoch Runtime: 21.2101
  • Bleu: 7.4410
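
The card gives no usage snippet; a minimal inference sketch follows, assuming the checkpoint is published under the repo id shown at the bottom of the card and that the usual T5 task-prefix style was used for the de-pt pair (the prefix wording and the example sentence are assumptions, not taken from the training script):

```python
# Sketch: translating German to Portuguese with the fine-tuned checkpoint.
# Repo id taken from the card; the task prefix below is an assumption.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/112fec9ed42e51060f900d8f894eb446"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "translate German to Portuguese: Das Buch liegt auf dem Tisch."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Given the modest BLEU (7.44), translations from this checkpoint should be treated as rough drafts rather than production output.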

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
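
The list above maps onto `Seq2SeqTrainingArguments` roughly as follows. This is a reconstructed config fragment, not the author's actual script: the `output_dir` name is a placeholder, `predict_with_generate` is an assumption (it is normally required to compute BLEU during evaluation), and the per-device batch sizes times the 4 GPUs give the listed totals of 32:

```python
# Sketch: the card's hyperparameters as a transformers training config.
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="out",               # placeholder, not from the card
    learning_rate=5e-05,
    per_device_train_batch_size=8,  # x 4 GPUs -> total_train_batch_size 32
    per_device_eval_batch_size=8,   # x 4 GPUs -> total_eval_batch_size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,     # assumption: needed for BLEU at eval time
)
```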

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---------------|-------|------|-----------------|-----------|---------------|--------|
| No log        | 0     | 0    | 2.9596          | 0         | 1.7629        | 1.0230 |
| No log        | 1     | 27   | 2.8965          | 0.0078    | 2.3567        | 1.1778 |
| No log        | 2     | 54   | 2.7601          | 0.0156    | 3.8914        | 1.4456 |
| No log        | 3     | 81   | 2.6535          | 0.0312    | 5.1172        | 1.7848 |
| No log        | 4     | 108  | 2.5314          | 0.0625    | 6.3784        | 1.8423 |
| No log        | 5     | 135  | 2.3265          | 0.125     | 8.5023        | 2.5052 |
| No log        | 6     | 162  | 2.1458          | 0.25      | 11.4743       | 3.1025 |
| No log        | 7     | 189  | 2.0311          | 0.5       | 15.1548       | 3.1577 |
| 0.5013        | 8.0   | 216  | 1.9043          | 1.0       | 20.2497       | 4.2677 |
| 0.5013        | 9.0   | 243  | 1.8183          | 1.0       | 18.0056       | 5.2331 |
| 1.9772        | 10.0  | 270  | 1.7607          | 1.0       | 18.4100       | 5.7080 |
| 1.9772        | 11.0  | 297  | 1.7316          | 1.0       | 19.6296       | 5.9091 |
| 1.7436        | 12.0  | 324  | 1.7052          | 1.0       | 18.8718       | 6.4142 |
| 1.5745        | 13.0  | 351  | 1.6868          | 1.0       | 19.1276       | 6.2425 |
| 1.5745        | 14.0  | 378  | 1.6810          | 1.0       | 19.4471       | 6.6954 |
| 1.4445        | 15.0  | 405  | 1.6673          | 1.0       | 19.4834       | 6.7889 |
| 1.4445        | 16.0  | 432  | 1.6606          | 1.0       | 20.6860       | 6.6549 |
| 1.3433        | 17.0  | 459  | 1.6665          | 1.0       | 22.0664       | 6.6621 |
| 1.3433        | 18.0  | 486  | 1.6586          | 1.0       | 20.9229       | 6.8744 |
| 1.2301        | 19.0  | 513  | 1.6651          | 1.0       | 21.5009       | 7.0757 |
| 1.2301        | 20.0  | 540  | 1.6635          | 1.0       | 21.4612       | 6.7474 |
| 1.1481        | 21.0  | 567  | 1.6693          | 1.0       | 20.9774       | 7.3536 |
| 1.1481        | 22.0  | 594  | 1.6703          | 1.0       | 21.2101       | 7.4410 |

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
Model weights

  • Format: Safetensors
  • Model size: 0.8B params
  • Tensor type: F32

Model tree: contemmcm/112fec9ed42e51060f900d8f894eb446, finetuned from google-t5/t5-large.