6d24a5eb768a198ae73b54fa54072735

This model is a fine-tuned version of google/long-t5-local-large on the es-nl configuration of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

  • Loss: 1.9430
  • Data Size: 1.0
  • Epoch Runtime: 349.3459
  • Bleu: 1.4024
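
The card does not include a usage snippet; the following is a minimal inference sketch using the Transformers library. The checkpoint id is taken from the hosting page, and the translation direction (Spanish to Dutch) is inferred from the es-nl dataset name rather than stated in the card.

```python
# Minimal inference sketch. Assumptions: the hub checkpoint id below and
# an es -> nl translation direction (inferred from the dataset name).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/6d24a5eb768a198ae73b54fa54072735"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Encode a Spanish sentence and generate a candidate Dutch translation.
inputs = tokenizer("La casa es azul.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```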

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a code sketch mirroring them follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
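
The training script itself is not published with this card. As a reading aid, the sketch below maps the listed hyperparameters onto transformers.Seq2SeqTrainingArguments; the output_dir name is hypothetical, and dataset loading, preprocessing, and the Trainer call are omitted.

```python
# Sketch only: maps the hyperparameters above onto Seq2SeqTrainingArguments.
# The actual training script is not included in this card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-large-opus-books-es-nl",  # hypothetical name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # x 4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,    # x 4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",             # AdamW defaults: betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # generate text at eval time so BLEU can be scored
)
```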

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0     | 218.2100        | 0         | 25.0722       | 0.0029 |
| No log        | 1     | 806   | 135.8632        | 0.0078    | 28.4186       | 0.0026 |
| No log        | 2     | 1612  | 72.6377         | 0.0156    | 32.3281       | 0.0012 |
| No log        | 3     | 2418  | 25.0309         | 0.0312    | 38.3946       | 0.0007 |
| 2.2782        | 4     | 3224  | 14.2780         | 0.0625    | 48.6758       | 0.0033 |
| 18.3293       | 5     | 4030  | 10.1719         | 0.125     | 69.3327       | 0.0069 |
| 12.7345       | 6     | 4836  | 7.7438          | 0.25      | 108.8321      | 0.0237 |
| 8.7676        | 7     | 5642  | 5.8010          | 0.5       | 189.2463      | 0.0223 |
| 5.9844        | 8     | 6448  | 4.0580          | 1.0       | 348.8722      | 0.0502 |
| 4.6451        | 9     | 7254  | 3.4634          | 1.0       | 352.2134      | 0.0946 |
| 4.0376        | 10    | 8060  | 3.0997          | 1.0       | 353.1616      | 0.1436 |
| 3.6333        | 11    | 8866  | 2.8809          | 1.0       | 350.2432      | 0.1828 |
| 3.4008        | 12    | 9672  | 2.7696          | 1.0       | 351.4368      | 0.1934 |
| 3.2335        | 13    | 10478 | 2.6920          | 1.0       | 350.9128      | 0.2562 |
| 3.0756        | 14    | 11284 | 2.6157          | 1.0       | 352.3415      | 0.2855 |
| 2.9579        | 15    | 12090 | 2.5355          | 1.0       | 351.0503      | 0.3491 |
| 2.8798        | 16    | 12896 | 2.5249          | 1.0       | 351.3586      | 0.3560 |
| 2.7897        | 17    | 13702 | 2.4580          | 1.0       | 350.1344      | 0.4317 |
| 2.7262        | 18    | 14508 | 2.4061          | 1.0       | 351.1663      | 0.4623 |
| 2.6962        | 19    | 15314 | 2.3797          | 1.0       | 349.7022      | 0.5310 |
| 2.6126        | 20    | 16120 | 2.3499          | 1.0       | 351.0222      | 0.5647 |
| 2.5806        | 21    | 16926 | 2.3145          | 1.0       | 351.7137      | 0.5577 |
| 2.5476        | 22    | 17732 | 2.2874          | 1.0       | 351.7715      | 0.5803 |
| 2.5206        | 23    | 18538 | 2.2583          | 1.0       | 353.4597      | 0.6327 |
| 2.4558        | 24    | 19344 | 2.2414          | 1.0       | 353.0918      | 0.6507 |
| 2.4214        | 25    | 20150 | 2.2215          | 1.0       | 352.0976      | 0.6475 |
| 2.3758        | 26    | 20956 | 2.1898          | 1.0       | 348.2238      | 0.7510 |
| 2.3366        | 27    | 21762 | 2.1735          | 1.0       | 349.3065      | 0.7297 |
| 2.3167        | 28    | 22568 | 2.1639          | 1.0       | 349.5059      | 0.7622 |
| 2.2714        | 29    | 23374 | 2.1394          | 1.0       | 351.9686      | 0.7805 |
| 2.2556        | 30    | 24180 | 2.1266          | 1.0       | 352.5284      | 0.8544 |
| 2.2305        | 31    | 24986 | 2.1105          | 1.0       | 355.9961      | 0.8285 |
| 2.1928        | 32    | 25792 | 2.1064          | 1.0       | 355.1299      | 0.8994 |
| 2.1717        | 33    | 26598 | 2.0856          | 1.0       | 351.6526      | 0.9018 |
| 2.1266        | 34    | 27404 | 2.0632          | 1.0       | 354.7243      | 0.9453 |
| 2.1371        | 35    | 28210 | 2.0498          | 1.0       | 352.0551      | 0.9594 |
| 2.0909        | 36    | 29016 | 2.0523          | 1.0       | 350.2846      | 0.9577 |
| 2.0623        | 37    | 29822 | 2.0360          | 1.0       | 351.6120      | 1.0107 |
| 2.0601        | 38    | 30628 | 2.0212          | 1.0       | 352.5911      | 1.0315 |
| 2.0236        | 39    | 31434 | 2.0179          | 1.0       | 349.9444      | 1.0745 |
| 1.9903        | 40    | 32240 | 1.9996          | 1.0       | 351.4016      | 1.1236 |
| 1.9811        | 41    | 33046 | 1.9928          | 1.0       | 351.0448      | 1.1127 |
| 1.9425        | 42    | 33852 | 1.9838          | 1.0       | 351.3833      | 1.1870 |
| 1.92          | 43    | 34658 | 1.9741          | 1.0       | 349.1373      | 1.2144 |
| 1.8854        | 44    | 35464 | 1.9738          | 1.0       | 352.3969      | 1.2406 |
| 1.8909        | 45    | 36270 | 1.9711          | 1.0       | 353.0046      | 1.2970 |
| 1.8646        | 46    | 37076 | 1.9539          | 1.0       | 349.8124      | 1.3181 |
| 1.8158        | 47    | 37882 | 1.9544          | 1.0       | 349.7610      | 1.3287 |
| 1.8049        | 48    | 38688 | 1.9410          | 1.0       | 346.4040      | 1.3900 |
| 1.7911        | 49    | 39494 | 1.9471          | 1.0       | 347.3523      | 1.4239 |
| 1.7596        | 50    | 40300 | 1.9430          | 1.0       | 349.3459      | 1.4024 |
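
The BLEU column climbs from roughly zero to about 1.4 as the data fraction ramps up to 1.0 and validation loss falls. The card does not say which BLEU implementation produced these numbers; a common choice in Transformers pipelines is sacreBLEU via the evaluate library, sketched below with toy data.

```python
# Sketch of BLEU scoring with the `evaluate` library (sacreBLEU backend).
# The card does not confirm this exact implementation was used; toy data only.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Het huis is blauw."]    # decoded model outputs
references = [["Het huis is blauw."]]   # one list of reference strings per prediction
result = bleu.compute(predictions=predictions, references=references)
print(result["score"])  # corpus-level BLEU on a 0-100 scale
```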

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
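
To check a local environment against the versions listed above, a quick sketch (assuming all four packages are installed) is:

```python
# Print installed versions to compare against the pinned list above.
import transformers, torch, datasets, tokenizers

print(transformers.__version__)  # expected: 4.57.0
print(torch.__version__)         # expected: 2.8.0+cu128
print(datasets.__version__)      # expected: 4.2.0
print(tokenizers.__version__)    # expected: 0.22.1
```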