a1a6aff4ce2d90f17a750fdcaf097651

This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books [de-pt] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.9747
  • Data Size: 1.0
  • Epoch Runtime: 10.4799
  • Bleu: 5.6590
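The Bleu figure above is presumably a sacrebleu-style score on a 0–100 scale, as emitted by the training framework. For readers unfamiliar with the metric, a minimal unsmoothed corpus BLEU (on a 0–1 scale) can be sketched as follows — this is an illustration of the metric, not the implementation used to produce the number above:

```python
import math
from collections import Counter

def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU with uniform n-gram weights and no smoothing.

    Illustrative only: real evaluations typically use sacrebleu, which adds
    standardized tokenization and reports scores scaled to 0-100.
    """
    matches = [0] * max_n  # clipped n-gram matches, per order
    totals = [0] * max_n   # candidate n-gram counts, per order
    hyp_len = 0
    ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tokens = hyp.split()
        ref_tokens = ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_ngrams = Counter(tuple(hyp_tokens[i:i + n])
                                 for i in range(len(hyp_tokens) - n + 1))
            ref_ngrams = Counter(tuple(ref_tokens[i:i + n])
                                 for i in range(len(ref_tokens) - n + 1))
            # Counter intersection gives clipped counts (precision numerator).
            matches[n - 1] += sum((hyp_ngrams & ref_ngrams).values())
            totals[n - 1] += max(len(hyp_tokens) - n + 1, 0)
    if min(matches) == 0:
        return 0.0  # any zero precision makes the geometric mean zero
    log_prec = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: punish hypotheses shorter than the references.
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return bp * math.exp(log_prec)

# A perfect match scores 1.0; a fully disjoint pair scores 0.0.
print(corpus_bleu(["the cat sat on the mat"], ["the cat sat on the mat"]))  # 1.0
```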

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
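The total batch sizes above are derived values rather than independent settings: the per-device batch size multiplied by the number of GPUs. A quick sanity check:

```python
# Effective batch sizes implied by the hyperparameters above
# (per-device batch size x number of devices under multi-GPU training).
train_batch_size = 8   # per device
eval_batch_size = 8    # per device
num_devices = 4

total_train_batch_size = train_batch_size * num_devices
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)  # 32 32
```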

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:----:|
| No log | 0 | 0 | 3.5795 | 0 | 1.5110 | 0.8069 |
| No log | 1 | 27 | 3.5288 | 0.0078 | 2.0713 | 0.8193 |
| No log | 2 | 54 | 3.4226 | 0.0156 | 1.8171 | 0.8765 |
| No log | 3 | 81 | 3.3137 | 0.0312 | 2.1961 | 0.8726 |
| No log | 4 | 108 | 3.2241 | 0.0625 | 2.6187 | 0.9015 |
| No log | 5 | 135 | 3.0794 | 0.125 | 3.5629 | 0.9034 |
| No log | 6 | 162 | 2.8764 | 0.25 | 4.5319 | 1.6239 |
| No log | 7 | 189 | 2.6789 | 0.5 | 7.1745 | 1.8214 |
| 0.6673 | 8.0 | 216 | 2.5209 | 1.0 | 11.8647 | 2.1250 |
| 0.6673 | 9.0 | 243 | 2.4311 | 1.0 | 12.2110 | 2.9570 |
| 2.7346 | 10.0 | 270 | 2.3574 | 1.0 | 11.9172 | 3.0805 |
| 2.7346 | 11.0 | 297 | 2.3070 | 1.0 | 13.7312 | 3.3673 |
| 2.5374 | 12.0 | 324 | 2.2623 | 1.0 | 12.1588 | 3.6163 |
| 2.3816 | 13.0 | 351 | 2.2264 | 1.0 | 12.0713 | 3.4099 |
| 2.3816 | 14.0 | 378 | 2.1950 | 1.0 | 10.7154 | 3.7960 |
| 2.2706 | 15.0 | 405 | 2.1645 | 1.0 | 9.9418 | 4.1388 |
| 2.2706 | 16.0 | 432 | 2.1396 | 1.0 | 11.5808 | 4.2158 |
| 2.1735 | 17.0 | 459 | 2.1212 | 1.0 | 12.5215 | 4.3524 |
| 2.1735 | 18.0 | 486 | 2.1062 | 1.0 | 12.6577 | 4.3225 |
| 2.0687 | 19.0 | 513 | 2.0859 | 1.0 | 13.0955 | 4.5394 |
| 2.0687 | 20.0 | 540 | 2.0751 | 1.0 | 14.9231 | 4.4761 |
| 1.9922 | 21.0 | 567 | 2.0550 | 1.0 | 11.4518 | 4.8162 |
| 1.9922 | 22.0 | 594 | 2.0431 | 1.0 | 13.6035 | 4.8632 |
| 1.9275 | 23.0 | 621 | 2.0416 | 1.0 | 11.6460 | 4.6598 |
| 1.9275 | 24.0 | 648 | 2.0323 | 1.0 | 12.5673 | 4.7531 |
| 1.8617 | 25.0 | 675 | 2.0227 | 1.0 | 9.9415 | 4.9116 |
| 1.8078 | 26.0 | 702 | 2.0161 | 1.0 | 9.5311 | 4.7579 |
| 1.8078 | 27.0 | 729 | 2.0159 | 1.0 | 11.9414 | 4.7400 |
| 1.7452 | 28.0 | 756 | 1.9923 | 1.0 | 11.1788 | 5.1620 |
| 1.7452 | 29.0 | 783 | 1.9901 | 1.0 | 11.7848 | 5.1908 |
| 1.6985 | 30.0 | 810 | 1.9933 | 1.0 | 11.6818 | 5.2161 |
| 1.6985 | 31.0 | 837 | 1.9735 | 1.0 | 11.0877 | 5.2446 |
| 1.6343 | 32.0 | 864 | 1.9759 | 1.0 | 12.3549 | 5.2742 |
| 1.6343 | 33.0 | 891 | 1.9756 | 1.0 | 13.1103 | 5.0968 |
| 1.5958 | 34.0 | 918 | 1.9697 | 1.0 | 11.9184 | 5.3222 |
| 1.5958 | 35.0 | 945 | 1.9672 | 1.0 | 12.3234 | 5.3265 |
| 1.5497 | 36.0 | 972 | 1.9727 | 1.0 | 12.7994 | 5.2308 |
| 1.5497 | 37.0 | 999 | 1.9713 | 1.0 | 10.8372 | 5.2037 |
| 1.5086 | 38.0 | 1026 | 1.9620 | 1.0 | 11.8187 | 5.2829 |
| 1.4591 | 39.0 | 1053 | 1.9762 | 1.0 | 12.3488 | 5.2213 |
| 1.4591 | 40.0 | 1080 | 1.9759 | 1.0 | 12.8681 | 5.4821 |
| 1.4181 | 41.0 | 1107 | 1.9682 | 1.0 | 11.7268 | 5.5189 |
| 1.4181 | 42.0 | 1134 | 1.9747 | 1.0 | 10.4799 | 5.6590 |
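The Data Size column in the table above suggests a data ramp-up: the fraction of the training set doubles each epoch (1/128, 1/64, …) until the full dataset is reached at epoch 8. Assuming that pattern (inferred from the table, not documented in the card), the schedule can be sketched as:

```python
def data_fraction(epoch):
    """Fraction of the training set used at a given epoch (epoch >= 1).

    Inferred from the Data Size column: the fraction doubles each epoch,
    starting at 2**-7 (about 0.0078) and capped at 1.0 from epoch 8 on.
    """
    return min(1.0, 2.0 ** (epoch - 8))

# Rounded to 4 places this reproduces the table's Data Size column:
# 0.0078, 0.0156, 0.0312, 0.0625, 0.125, 0.25, 0.5, 1.0, 1.0, ...
print([round(data_fraction(e), 4) for e in range(1, 10)])
```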

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
Model size

  • 0.3B parameters
  • Tensor type: F32
  • Format: Safetensors