2ceaea9e502b528e9538edb1cb9746aa

This model is a fine-tuned version of google/long-t5-local-large on the Helsinki-NLP/opus_books [it-sv] dataset. It achieves the following results on the evaluation set:

  • Loss: 3.1414
  • Data Size: 1.0
  • Epoch Runtime: 38.1692
  • Bleu: 0.2150
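
The card ships without a usage snippet, so here is a minimal inference sketch. It assumes the checkpoint is available under the repository id contemmcm/2ceaea9e502b528e9538edb1cb9746aa and that fine-tuning used no task prefix (the card does not say); the Italian example sentence is hypothetical.

```python
# pip install transformers torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/2ceaea9e502b528e9538edb1cb9746aa"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical Italian source sentence; the model was tuned on the
# opus_books it-sv pair, so the output should be Swedish.
text = "Il sole tramontava dietro le colline."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If the training script did prepend a prefix (e.g. "translate Italian to Swedish: ", as in the standard T5 recipes), the same prefix must be added to `text` at inference time.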

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
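
For reproducibility, the hyperparameters above map onto `Seq2SeqTrainingArguments` roughly as sketched below. The `output_dir` is a placeholder and the actual training script is not published, so treat this as an approximation rather than the authors' configuration.

```python
from transformers import Seq2SeqTrainingArguments

# Per-device batch size 8 on 4 GPUs yields the reported total batch size
# of 32; the multi-GPU setup itself comes from the launcher (e.g. torchrun).
training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-large-opus-books-it-sv",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,  # needed to compute BLEU during evaluation
)
```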

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0    | 221.5325        | 0         | 3.0682        | 0.0017 |
| No log        | 1     | 74   | 198.9404        | 0.0078    | 4.5653        | 0.0030 |
| No log        | 2     | 148  | 168.5888        | 0.0156    | 5.5722        | 0.0038 |
| 6.7009        | 3     | 222  | 143.3611        | 0.0312    | 7.2210        | 0.0039 |
| 6.7009        | 4     | 296  | 113.6552        | 0.0625    | 9.3702        | 0.0033 |
| 9.5048        | 5     | 370  | 64.5368         | 0.125     | 12.2335       | 0.0015 |
| 9.5048        | 6     | 444  | 28.8386         | 0.25      | 16.2880       | 0.0032 |
| 13.9289       | 7     | 518  | 16.8321         | 0.5       | 23.9353       | 0.0021 |
| 16.6125       | 8     | 592  | 11.4163         | 1.0       | 39.7476       | 0.0030 |
| 16.3322       | 9     | 666  | 9.6635          | 1.0       | 37.4544       | 0.0030 |
| 14.8335       | 10    | 740  | 9.1268          | 1.0       | 37.3580       | 0.0034 |
| 12.4568       | 11    | 814  | 8.0397          | 1.0       | 37.2919       | 0.0093 |
| 11.6423       | 12    | 888  | 7.3618          | 1.0       | 37.3146       | 0.0065 |
| 10.6014       | 13    | 962  | 7.2995          | 1.0       | 37.3232       | 0.0111 |
| 10.1262       | 14    | 1036 | 6.5569          | 1.0       | 37.7075       | 0.0083 |
| 9.2616        | 15    | 1110 | 6.1059          | 1.0       | 37.4836       | 0.0434 |
| 8.889         | 16    | 1184 | 5.7652          | 1.0       | 37.4712       | 0.0298 |
| 8.3487        | 17    | 1258 | 5.4665          | 1.0       | 37.5588       | 0.0884 |
| 8.0774        | 18    | 1332 | 5.2253          | 1.0       | 37.0616       | 0.0377 |
| 7.5902        | 19    | 1406 | 5.3379          | 1.0       | 37.2366       | 0.0450 |
| 7.404         | 20    | 1480 | 5.1511          | 1.0       | 37.1286       | 0.0331 |
| 7.0           | 21    | 1554 | 4.8344          | 1.0       | 37.0656       | 0.0798 |
| 6.8283        | 22    | 1628 | 4.8185          | 1.0       | 37.7522       | 0.0704 |
| 6.5355        | 23    | 1702 | 4.6294          | 1.0       | 37.0545       | 0.0906 |
| 6.4027        | 24    | 1776 | 4.4492          | 1.0       | 36.9281       | 0.0587 |
| 6.0659        | 25    | 1850 | 4.3216          | 1.0       | 37.2367       | 0.1275 |
| 5.9587        | 26    | 1924 | 4.2734          | 1.0       | 37.1520       | 0.1169 |
| 5.8082        | 27    | 1998 | 4.1049          | 1.0       | 37.4815       | 0.1064 |
| 5.6503        | 28    | 2072 | 4.0866          | 1.0       | 37.3501       | 0.1118 |
| 5.4901        | 29    | 2146 | 4.0389          | 1.0       | 37.5989       | 0.0751 |
| 5.342         | 30    | 2220 | 4.0093          | 1.0       | 37.1668       | 0.1192 |
| 5.2011        | 31    | 2294 | 3.8163          | 1.0       | 37.9941       | 0.1822 |
| 5.1137        | 32    | 2368 | 3.7344          | 1.0       | 38.1650       | 0.1568 |
| 5.0031        | 33    | 2442 | 3.8307          | 1.0       | 37.3691       | 0.0590 |
| 4.8702        | 34    | 2516 | 3.6911          | 1.0       | 37.8558       | 0.1390 |
| 4.8229        | 35    | 2590 | 3.7435          | 1.0       | 37.3599       | 0.1256 |
| 4.6841        | 36    | 2664 | 3.6212          | 1.0       | 38.0017       | 0.0937 |
| 4.607         | 37    | 2738 | 3.5849          | 1.0       | 37.0052       | 0.0807 |
| 4.4539        | 38    | 2812 | 3.6441          | 1.0       | 36.9756       | 0.0517 |
| 4.446         | 39    | 2886 | 3.5075          | 1.0       | 38.1396       | 0.0910 |
| 4.3089        | 40    | 2960 | 3.5159          | 1.0       | 36.8978       | 0.1166 |
| 4.2739        | 41    | 3034 | 3.3696          | 1.0       | 37.1386       | 0.1391 |
| 4.1601        | 42    | 3108 | 3.4174          | 1.0       | 38.0813       | 0.0957 |
| 4.0882        | 43    | 3182 | 3.3701          | 1.0       | 37.5271       | 0.1303 |
| 4.0148        | 44    | 3256 | 3.2427          | 1.0       | 37.6244       | 0.1631 |
| 3.9753        | 45    | 3330 | 3.2444          | 1.0       | 37.7280       | 0.1849 |
| 3.9001        | 46    | 3404 | 3.3593          | 1.0       | 37.2999       | 0.1094 |
| 3.8799        | 47    | 3478 | 3.2341          | 1.0       | 37.1080       | 0.1838 |
| 3.7997        | 48    | 3552 | 3.1679          | 1.0       | 37.5732       | 0.2302 |
| 3.7542        | 49    | 3626 | 3.1729          | 1.0       | 37.7116       | 0.1509 |
| 3.686         | 50    | 3700 | 3.1414          | 1.0       | 38.1692       | 0.2150 |
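
The Bleu column is not attributed to a specific implementation; a common choice is the sacrebleu metric via the `evaluate` library, sketched below with hypothetical sentences. Note that sacrebleu reports scores on a 0-100 scale, so whether the card's values are raw (0-1) or on that scale is not stated here.

```python
# pip install evaluate sacrebleu
import evaluate

bleu = evaluate.load("sacrebleu")

predictions = ["Solen gick ner bakom kullarna."]   # model outputs (hypothetical)
references = [["Solen gick ned bakom kullarna."]]  # gold Swedish (hypothetical)

result = bleu.compute(predictions=predictions, references=references)
print(result["score"])
```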

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1

Model size

  • 0.8B params (F32, Safetensors)