1bd08fa5da0ba99f11bbb3204e38e87a

This model is a fine-tuned version of google/long-t5-tglobal-xl on the fr-sv (French-Swedish) configuration of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set (values from the last logged epoch, epoch 20, step 1500):

Loss: 1.9529
Data Size: 1.0
Epoch Runtime: 53.4214
Bleu: 2.2952

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: constant
num_epochs: 50
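
The total batch sizes listed above follow from the per-device values and the device count. A minimal sketch of that arithmetic, assuming no gradient accumulation (the card lists none); the dictionary layout and the helper name `effective_batch_size` are illustrative and not taken from the training script:

```python
# Hyperparameters copied from the list above.
hparams = {
    "learning_rate": 5e-05,
    "train_batch_size": 8,   # per device
    "eval_batch_size": 8,    # per device
    "num_devices": 4,
    "seed": 42,
    "lr_scheduler_type": "constant",
    "num_epochs": 50,
}


def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Examples processed per step across all devices.

    grad_accum_steps=1 is an assumption: the card reports no accumulation.
    """
    return per_device * num_devices * grad_accum_steps


total_train = effective_batch_size(hparams["train_batch_size"], hparams["num_devices"])
total_eval = effective_batch_size(hparams["eval_batch_size"], hparams["num_devices"])
print(total_train, total_eval)
```

Both values come out to 32, matching total_train_batch_size and total_eval_batch_size in the list.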

Training results

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Bleu
No log	0	0	3.3643	0	3.8097	0.0669
No log	1	75	2.9546	0.0078	4.6622	0.1271
No log	2	150	2.6190	0.0156	8.4840	0.4786
No log	3	225	2.5353	0.0312	14.4059	0.8589
No log	4	300	2.4672	0.0625	21.4861	1.0446
No log	5	375	2.4066	0.125	23.6793	0.8828
No log	6	450	2.3163	0.25	25.7230	0.8248
No log	7	525	2.2116	0.5	38.9069	1.0445
2.4037	8.0	600	2.1012	1.0	61.8496	1.4186
2.2503	9.0	675	2.0348	1.0	49.8702	1.4885
2.1004	10.0	750	1.9838	1.0	51.3868	1.5857
1.9788	11.0	825	1.9570	1.0	54.0241	1.7988
1.8625	12.0	900	1.9237	1.0	55.7399	1.9374
1.7561	13.0	975	1.9067	1.0	50.0868	1.9793
1.6706	14.0	1050	1.9007	1.0	53.6020	2.0393
1.5622	15.0	1125	1.9039	1.0	50.3454	2.0784
1.4761	16.0	1200	1.8982	1.0	54.7772	2.1199
1.3992	17.0	1275	1.9283	1.0	51.3246	2.2065
1.3337	18.0	1350	1.9218	1.0	53.9708	2.2737
1.2445	19.0	1425	1.9272	1.0	51.1263	2.2898
1.1893	20.0	1500	1.9529	1.0	53.4214	2.2952
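
Validation loss and BLEU do not peak at the same epoch in the table above. A short sketch that locates each optimum, using only the rows copied from the table; the tuple layout and variable names are illustrative:

```python
# (epoch, step, validation_loss, bleu) copied from the training results table.
rows = [
    (0, 0, 3.3643, 0.0669),
    (1, 75, 2.9546, 0.1271),
    (2, 150, 2.6190, 0.4786),
    (3, 225, 2.5353, 0.8589),
    (4, 300, 2.4672, 1.0446),
    (5, 375, 2.4066, 0.8828),
    (6, 450, 2.3163, 0.8248),
    (7, 525, 2.2116, 1.0445),
    (8, 600, 2.1012, 1.4186),
    (9, 675, 2.0348, 1.4885),
    (10, 750, 1.9838, 1.5857),
    (11, 825, 1.9570, 1.7988),
    (12, 900, 1.9237, 1.9374),
    (13, 975, 1.9067, 1.9793),
    (14, 1050, 1.9007, 2.0393),
    (15, 1125, 1.9039, 2.0784),
    (16, 1200, 1.8982, 2.1199),
    (17, 1275, 1.9283, 2.2065),
    (18, 1350, 1.9218, 2.2737),
    (19, 1425, 1.9272, 2.2898),
    (20, 1500, 1.9529, 2.2952),
]

# Lowest validation loss and highest BLEU, each with its epoch.
best_loss = min(rows, key=lambda r: r[2])
best_bleu = max(rows, key=lambda r: r[3])

print(f"lowest validation loss: {best_loss[2]} at epoch {best_loss[0]}")
print(f"highest BLEU: {best_bleu[3]} at epoch {best_bleu[0]}")
```

On these rows, the lowest validation loss (1.8982) falls at epoch 16, while the highest BLEU (2.2952) falls at epoch 20, so which checkpoint counts as best depends on the selection metric.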

Framework versions

Transformers 4.57.0
Pytorch 2.8.0+cu128
Datasets 4.2.0
Tokenizers 0.22.1

Safetensors

Model size: 0.7B params
Tensor type: F32

Model tree for contemmcm/1bd08fa5da0ba99f11bbb3204e38e87a

Base model: google/long-t5-tglobal-xl