8dd8934fb108f95a911b5e26447ebb59

This model is a fine-tuned version of google/umt5-base on the Helsinki-NLP/opus_books [es-it] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.4803
  • Data Size: 1.0 (fraction of the training set in use)
  • Epoch Runtime: 166.0883 s
  • Bleu: 5.5138
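
A minimal inference sketch, assuming the checkpoint is published under the repository id from the original card (contemmcm/8dd8934fb108f95a911b5e26447ebb59). Depending on how inputs were preprocessed during fine-tuning, a source-side task prefix may or may not be required:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Repository id taken from the card; treat it as an assumption and adjust
# if the checkpoint lives elsewhere.
repo_id = "contemmcm/8dd8934fb108f95a911b5e26447ebb59"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

# Spanish -> Italian. If the model was trained with a task prefix,
# prepend it here (e.g. "translate Spanish to Italian: ").
inputs = tokenizer("La vida es sueño.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```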

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
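
The card does name the training corpus: Helsinki-NLP/opus_books with the es-it configuration. A minimal loading sketch (the split used as the evaluation set above is not documented):

```python
from datasets import load_dataset

# opus_books ships a single "train" split; the evaluation set for this
# card would have been carved out separately (exact split unknown).
dataset = load_dataset("Helsinki-NLP/opus_books", "es-it")
print(dataset["train"][0])
# e.g. {'id': '0', 'translation': {'es': '...', 'it': '...'}}
```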

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: constant
  • num_epochs: 50
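
A sketch of how these settings map onto Seq2SeqTrainingArguments; output_dir is a placeholder, and the multi-GPU launch (torchrun/accelerate) is assumed rather than documented:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-es-it",  # placeholder name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # x 4 GPUs = total train batch size 32
    per_device_eval_batch_size=8,    # x 4 GPUs = total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # needed to compute BLEU at eval time
)
```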

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | Bleu |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-----------------:|:----:|
| No log | 0 | 0 | 11.8845 | 0 | 13.7090 | 0.2804 |
| No log | 1 | 721 | 11.6500 | 0.0078 | 16.0695 | 0.2535 |
| No log | 2 | 1442 | 11.2228 | 0.0156 | 17.5692 | 0.2833 |
| 0.3237 | 3 | 2163 | 10.3332 | 0.0312 | 21.3931 | 0.2913 |
| 1.0012 | 4 | 2884 | 8.5047 | 0.0625 | 26.5733 | 0.3455 |
| 8.5143 | 5 | 3605 | 4.9032 | 0.125 | 36.3316 | 1.2113 |
| 4.7327 | 6 | 4326 | 3.3371 | 0.25 | 53.9259 | 2.4238 |
| 3.8862 | 7 | 5047 | 3.0048 | 0.5 | 92.5966 | 3.2504 |
| 3.4960 | 8 | 5768 | 2.8315 | 1.0 | 167.7369 | 3.7882 |
| 3.2634 | 9 | 6489 | 2.7450 | 1.0 | 167.3899 | 4.1156 |
| 3.1612 | 10 | 7210 | 2.6941 | 1.0 | 169.2978 | 4.3139 |
| 3.0244 | 11 | 7931 | 2.6593 | 1.0 | 167.3909 | 4.4523 |
| 2.9753 | 12 | 8652 | 2.6200 | 1.0 | 167.1947 | 4.6066 |
| 2.9187 | 13 | 9373 | 2.6001 | 1.0 | 167.9997 | 4.7045 |
| 2.8032 | 14 | 10094 | 2.5692 | 1.0 | 166.7811 | 4.8149 |
| 2.7953 | 15 | 10815 | 2.5495 | 1.0 | 167.2644 | 4.8632 |
| 2.7074 | 16 | 11536 | 2.5413 | 1.0 | 171.8496 | 4.9945 |
| 2.6703 | 17 | 12257 | 2.5240 | 1.0 | 169.8680 | 5.0501 |
| 2.6470 | 18 | 12978 | 2.5165 | 1.0 | 168.5366 | 5.1315 |
| 2.5870 | 19 | 13699 | 2.5175 | 1.0 | 168.9415 | 5.1387 |
| 2.5565 | 20 | 14420 | 2.4971 | 1.0 | 168.2197 | 5.2374 |
| 2.5245 | 21 | 15141 | 2.4939 | 1.0 | 167.6366 | 5.2264 |
| 2.4804 | 22 | 15862 | 2.4823 | 1.0 | 167.6528 | 5.2776 |
| 2.3955 | 23 | 16583 | 2.4925 | 1.0 | 170.3214 | 5.3348 |
| 2.3861 | 24 | 17304 | 2.4809 | 1.0 | 169.3975 | 5.3402 |
| 2.3862 | 25 | 18025 | 2.4828 | 1.0 | 168.1509 | 5.3585 |
| 2.3627 | 26 | 18746 | 2.4789 | 1.0 | 167.9541 | 5.4241 |
| 2.3147 | 27 | 19467 | 2.4747 | 1.0 | 168.6779 | 5.4486 |
| 2.2717 | 28 | 20188 | 2.4815 | 1.0 | 169.7344 | 5.4859 |
| 2.2451 | 29 | 20909 | 2.4765 | 1.0 | 168.6457 | 5.4487 |
| 2.2161 | 30 | 21630 | 2.4774 | 1.0 | 166.5616 | 5.5141 |
| 2.2003 | 31 | 22351 | 2.4803 | 1.0 | 166.0883 | 5.5138 |
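
The Bleu column is a corpus-level BLEU score on the evaluation set. The exact metric implementation is not stated in the card; a typical setup with the evaluate library looks like this:

```python
import evaluate

# sacreBLEU takes plain-string predictions and a list of reference lists.
bleu = evaluate.load("sacrebleu")
predictions = ["Il gatto è sul tavolo."]
references = [["Il gatto è sopra il tavolo."]]
result = bleu.compute(predictions=predictions, references=references)
print(result["score"])  # corpus BLEU, same scale as the table above
```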

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1