a48ec1a2b8d953467a138c847f76b924

This model is a fine-tuned version of google/long-t5-tglobal-xl on the Helsinki-NLP/opus_books [es-fr] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.2650
  • Data Size: 1.0
  • Epoch Runtime: 1037.6208
  • Bleu: 11.4604
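A minimal usage sketch for this checkpoint, assuming the standard transformers seq2seq API. The generation settings (beam search, `max_new_tokens`) are illustrative defaults, not values taken from this card, and depending on how the fine-tuning was run a T5-style source prefix may be required:

```python
# Hedged usage sketch: translating Spanish to French with this checkpoint.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

REPO_ID = "contemmcm/a48ec1a2b8d953467a138c847f76b924"

def load_translator(repo_id=REPO_ID):
    """Download the tokenizer and model weights from the Hub (~3 GB in F32)."""
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)
    return tokenizer, model

def translate(sentences, tokenizer, model, max_new_tokens=128):
    """Beam-search translation of a batch of Spanish sentences.

    Note: if the fine-tuning script used a source_prefix (e.g.
    "translate Spanish to French: "), it must be prepended to each
    sentence here; the card does not say either way.
    """
    inputs = tokenizer(sentences, return_tensors="pt",
                       padding=True, truncation=True)
    outputs = model.generate(**inputs, num_beams=4,
                             max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)

# Example (downloads the weights on first use):
# tokenizer, model = load_translator()
# print(translate(["El viejo faro iluminaba la costa."], tokenizer, model))
```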

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: constant
  • num_epochs: 50
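The distributed settings combine arithmetically; a small sketch (the per-epoch step count is read off the results table below, everything else from the list above):

```python
# Effective batch size under multi-GPU data parallelism: each of the
# 4 devices processes its own batch of 8 per optimizer step.
per_device_train_batch_size = 8
num_devices = 4
total_train_batch_size = per_device_train_batch_size * num_devices
print(total_train_batch_size)  # 32, matching total_train_batch_size above

# At full data size the results table advances 1407 steps per epoch,
# so roughly 1407 * 32 = 45,024 training examples are seen each epoch.
steps_per_epoch = 1407
examples_per_epoch = steps_per_epoch * total_train_batch_size
print(examples_per_epoch)  # 45024
```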

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:-------:|
| No log | 0 | 0 | 3.6018 | 0 | 57.4989 | 0.5910 |
| No log | 1 | 1407 | 2.2685 | 0.0078 | 63.8219 | 6.5966 |
| No log | 2 | 2814 | 2.1213 | 0.0156 | 72.9406 | 4.5586 |
| 0.0658 | 3 | 4221 | 1.9937 | 0.0312 | 87.6035 | 4.2299 |
| 2.336 | 4 | 5628 | 1.8831 | 0.0625 | 111.4291 | 5.0885 |
| 2.1408 | 5 | 7035 | 1.7577 | 0.125 | 156.6329 | 6.0891 |
| 1.897 | 6 | 8442 | 1.6166 | 0.25 | 246.9786 | 7.0618 |
| 1.7252 | 7 | 9849 | 1.4812 | 0.5 | 508.5712 | 8.2444 |
| 1.4902 | 8.0 | 11256 | 1.3508 | 1.0 | 1041.5764 | 9.3247 |
| 1.3339 | 9.0 | 12663 | 1.2828 | 1.0 | 1038.6518 | 9.9676 |
| 1.2421 | 10.0 | 14070 | 1.2406 | 1.0 | 915.8454 | 10.6155 |
| 1.0995 | 11.0 | 15477 | 1.2240 | 1.0 | 796.3828 | 10.8912 |
| 1.0029 | 12.0 | 16884 | 1.2129 | 1.0 | 1040.0637 | 11.1459 |
| 0.9351 | 13.0 | 18291 | 1.2243 | 1.0 | 1039.4535 | 11.2782 |
| 0.8686 | 14.0 | 19698 | 1.2328 | 1.0 | 1038.5060 | 11.3408 |
| 0.7856 | 15.0 | 21105 | 1.2455 | 1.0 | 1037.4875 | 11.4542 |
| 0.7209 | 16.0 | 22512 | 1.2650 | 1.0 | 1037.6208 | 11.4604 |
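The Bleu column is a BLEU score on the 0-100 scale. As a self-contained illustration of what it measures, here is a sentence-level BLEU with uniform 4-gram weights and a brevity penalty, assuming whitespace tokenization; the scores in the table were almost certainly computed with a corpus-level implementation such as sacrebleu, so this sketch is for intuition only:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams occurring in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypothesis, reference, max_n=4):
    """Sentence-level BLEU (0-100): geometric mean of clipped n-gram
    precisions times a brevity penalty. Whitespace tokenization only."""
    hyp, ref = hypothesis.split(), reference.split()
    log_precision = 0.0
    for n in range(1, max_n + 1):
        # Counter intersection clips each n-gram's count to the reference.
        matches = sum((ngrams(hyp, n) & ngrams(ref, n)).values())
        total = sum(ngrams(hyp, n).values())
        if matches == 0 or total == 0:
            return 0.0  # no smoothing: any empty precision zeroes the score
        log_precision += math.log(matches / total) / max_n
    brevity = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return 100.0 * brevity * math.exp(log_precision)

# bleu("le chat est sur le tapis", "le chat est sur le tapis") -> 100.0
```

An exact match scores 100, a hypothesis sharing no unigrams with the reference scores 0, and partial overlap lands in between, which is why corpus scores in the low teens (as in the final epochs above) still indicate substantial n-gram overlap.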

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
Model size: 0.7B params (F32, Safetensors)
