b15b6753225b2dd26cb377a67098d3b8

This model is a fine-tuned version of google/long-t5-local-large on the Italian-Portuguese (it-pt) pair of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set (a minimal usage sketch follows the list):

  • Loss: 6.5205
  • Data Size: 1.0
  • Epoch Runtime: 19.4832
  • Bleu: 0.3750
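Since the card does not yet include a usage example, the following is a minimal inference sketch. It assumes the model is published under the repo id shown on this page (contemmcm/b15b6753225b2dd26cb377a67098d3b8) and loads through the standard transformers seq2seq API; the Italian example sentence and the absence of a task prefix are assumptions (T5-style models are sometimes trained with a prefix such as "translate Italian to Portuguese: ", but this card does not say whether one was used).

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Repo id taken from this card's page; adjust if the model lives elsewhere.
model_id = "contemmcm/b15b6753225b2dd26cb377a67098d3b8"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Assumed: raw Italian input with no task prefix (the card documents none).
text = "Il gatto dorme sul divano."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```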

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
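As a pointer while fuller documentation is pending, the dataset named above can be loaded as sketched below. Note that opus_books ships only a train split, so the train/eval split actually used for this model is not known from the card; the 90/10 split and seed here are purely illustrative assumptions.

```python
from datasets import load_dataset

# Italian-Portuguese pairs from OPUS Books, as named in this card.
ds = load_dataset("Helsinki-NLP/opus_books", "it-pt")

# opus_books has a single "train" split; the eval split used for this
# model is undocumented, so this 90/10 split is an assumption.
splits = ds["train"].train_test_split(test_size=0.1, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]
print(train_ds[0]["translation"])  # {'it': '...', 'pt': '...'}
```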

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
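For reference, this sketch maps the hyperparameters above onto transformers' Seq2SeqTrainingArguments. The output_dir is a placeholder, and the 4-GPU distributed launch (which yields the total batch size of 32) is handled by the launcher rather than by these arguments.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-opus-books-it-pt",  # hypothetical placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # x 4 GPUs = total train batch size 32
    per_device_eval_batch_size=8,    # x 4 GPUs = total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
)
```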

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0    | 237.5808        | 0         | 1.7565        | 0.0047 |
| No log        | 1     | 29   | 219.4165        | 0.0078    | 2.4461        | 0.0186 |
| No log        | 2     | 58   | 206.7268        | 0.0156    | 3.7988        | 0.0138 |
| No log        | 3     | 87   | 195.5772        | 0.0312    | 5.1655        | 0.0133 |
| No log        | 4     | 116  | 175.3125        | 0.0625    | 6.7078        | 0.0181 |
| No log        | 5     | 145  | 143.0906        | 0.125     | 9.3192        | 0.0199 |
| 17.0186       | 6     | 174  | 94.6019         | 0.25      | 11.9287       | 0.0097 |
| 17.0186       | 7     | 203  | 42.3782         | 0.5       | 14.1822       | 0.0037 |
| 17.0186       | 8.0   | 232  | 20.2820         | 1.0       | 21.5922       | 0.0067 |
| 35.4455       | 9.0   | 261  | 15.4973         | 1.0       | 19.8251       | 0.0189 |
| 35.4455       | 10.0  | 290  | 13.6906         | 1.0       | 20.5325       | 0.0165 |
| 22.7957       | 11.0  | 319  | 12.3677         | 1.0       | 20.1372       | 0.0335 |
| 22.7957       | 12.0  | 348  | 10.9040         | 1.0       | 19.5501       | 0.0352 |
| 18.4549       | 13.0  | 377  | 11.0217         | 1.0       | 20.4877       | 0.0181 |
| 15.8757       | 14.0  | 406  | 9.5505          | 1.0       | 19.9176       | 0.0425 |
| 15.8757       | 15.0  | 435  | 9.7615          | 1.0       | 20.4831       | 0.0280 |
| 14.4684       | 16.0  | 464  | 9.1320          | 1.0       | 19.5799       | 0.0413 |
| 14.4684       | 17.0  | 493  | 8.4326          | 1.0       | 19.8343       | 0.0903 |
| 13.1074       | 18.0  | 522  | 8.3413          | 1.0       | 20.3576       | 0.1056 |
| 12.2344       | 19.0  | 551  | 8.3180          | 1.0       | 19.6270       | 0.0970 |
| 12.2344       | 20.0  | 580  | 7.9973          | 1.0       | 20.2316       | 0.0740 |
| 11.3743       | 21.0  | 609  | 7.5731          | 1.0       | 19.8370       | 0.2492 |
| 11.3743       | 22.0  | 638  | 7.3466          | 1.0       | 19.5481       | 0.2408 |
| 10.6951       | 23.0  | 667  | 7.3575          | 1.0       | 19.7752       | 0.1385 |
| 10.6951       | 24.0  | 696  | 6.5999          | 1.0       | 19.4145       | 0.3983 |
| 10.0869       | 25.0  | 725  | 6.3966          | 1.0       | 19.8224       | 0.3704 |
| 9.5201        | 26.0  | 754  | 6.4392          | 1.0       | 19.5388       | 0.3759 |
| 9.5201        | 27.0  | 783  | 6.4110          | 1.0       | 20.3876       | 0.3772 |
| 8.9994        | 28.0  | 812  | 6.5906          | 1.0       | 19.3776       | 0.3860 |
| 8.9994        | 29.0  | 841  | 6.5205          | 1.0       | 19.4832       | 0.3750 |
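The card does not state the scale of the Bleu column (sacrebleu reports scores on 0-100, while some pipelines report a 0-1 fraction). To reproduce the metric, a typical setup uses the evaluate library's sacrebleu wrapper, sketched here with hypothetical strings.

```python
import evaluate

bleu = evaluate.load("sacrebleu")

predictions = ["O gato dorme no sofá."]            # hypothetical model outputs
references = [["O gato está dormindo no sofá."]]   # hypothetical gold Portuguese
result = bleu.compute(predictions=predictions, references=references)
print(result["score"])  # sacrebleu's BLEU, on a 0-100 scale
```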

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1