0e8ce411bdb8af1151d05312cfd14286

This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books [fi-pl] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.2659
  • Data Size: 1.0 (full training set)
  • Epoch Runtime: 22.6192 s
  • Bleu: 0.5000
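
Since the card is otherwise sparse, a minimal inference sketch may help. It assumes the checkpoint is public under contemmcm/0e8ce411bdb8af1151d05312cfd14286 (the repo id this card belongs to); the "translate Finnish to Polish:" task prefix is an assumption, since the preprocessing is not documented here.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/0e8ce411bdb8af1151d05312cfd14286"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 checkpoints are normally prompted with a task prefix; the exact prefix
# depends on how the fine-tuning script preprocessed the data (assumption).
text = "translate Finnish to Polish: Hyvää huomenta."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```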

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
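
The preprocessing is undocumented, but the source corpus is named above; a minimal loading sketch (the 90/10 train/eval split and seed are assumptions, since opus_books ships only a train split):

```python
from datasets import load_dataset

# Load the Finnish-Polish portion of OPUS Books.
dataset = load_dataset("Helsinki-NLP/opus_books", "fi-pl")

# opus_books has a single "train" split; an evaluation set must be carved
# out manually (split fraction and seed here are illustrative assumptions).
splits = dataset["train"].train_test_split(test_size=0.1, seed=42)
print(splits["train"][0])  # {'id': ..., 'translation': {'fi': ..., 'pl': ...}}
```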

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a training-arguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
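
A sketch of how these settings map onto transformers' Seq2SeqTrainingArguments; the batch sizes and optimizer settings are read off the list above, while the output directory name and the predict_with_generate flag are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

# Launch under `torchrun --nproc_per_node=4` to match the 4-GPU,
# total-batch-size-32 setup reported above.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-base-opus-books-fi-pl",  # hypothetical name
    learning_rate=5e-5,
    per_device_train_batch_size=8,  # x4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,   # x4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,  # needed to compute BLEU at eval time (assumption)
)
```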

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | Bleu |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-----------------:|:----:|
| No log | 0 | 0 | 3.7828 | 0 | 2.0980 | 0.0648 |
| No log | 1 | 70 | 3.6667 | 0.0078 | 2.3552 | 0.0625 |
| No log | 2 | 140 | 3.4363 | 0.0156 | 2.9979 | 0.0473 |
| No log | 3 | 210 | 3.3102 | 0.0312 | 3.3752 | 0.0606 |
| No log | 4 | 280 | 3.1056 | 0.0625 | 3.9195 | 0.1106 |
| No log | 5 | 350 | 2.9603 | 0.125 | 5.1352 | 0.1619 |
| No log | 6 | 420 | 2.8461 | 0.25 | 7.4230 | 0.2130 |
| 0.4877 | 7 | 490 | 2.7202 | 0.5 | 11.7368 | 0.1086 |
| 2.8953 | 8 | 560 | 2.6146 | 1.0 | 19.1842 | 0.1549 |
| 2.8149 | 9 | 630 | 2.5536 | 1.0 | 18.4865 | 0.1561 |
| 2.7168 | 10 | 700 | 2.5105 | 1.0 | 18.1600 | 0.2418 |
| 2.6733 | 11 | 770 | 2.4728 | 1.0 | 18.9439 | 0.2245 |
| 2.6435 | 12 | 840 | 2.4363 | 1.0 | 17.7156 | 0.2762 |
| 2.5711 | 13 | 910 | 2.4144 | 1.0 | 18.0877 | 0.2782 |
| 2.5489 | 14 | 980 | 2.3951 | 1.0 | 18.9879 | 0.2732 |
| 2.5005 | 15 | 1050 | 2.3725 | 1.0 | 19.8824 | 0.3534 |
| 2.4655 | 16 | 1120 | 2.3559 | 1.0 | 19.1626 | 0.3408 |
| 2.4327 | 17 | 1190 | 2.3393 | 1.0 | 23.7882 | 0.3929 |
| 2.4006 | 18 | 1260 | 2.3283 | 1.0 | 18.2179 | 0.3648 |
| 2.3874 | 19 | 1330 | 2.3206 | 1.0 | 18.5401 | 0.3680 |
| 2.3523 | 20 | 1400 | 2.3182 | 1.0 | 19.6670 | 0.3782 |
| 2.3153 | 21 | 1470 | 2.3023 | 1.0 | 19.0961 | 0.4020 |
| 2.3201 | 22 | 1540 | 2.2951 | 1.0 | 23.1245 | 0.4138 |
| 2.2745 | 23 | 1610 | 2.2863 | 1.0 | 21.4893 | 0.3832 |
| 2.2601 | 24 | 1680 | 2.2843 | 1.0 | 20.8304 | 0.3954 |
| 2.2317 | 25 | 1750 | 2.2787 | 1.0 | 20.6040 | 0.4217 |
| 2.205 | 26 | 1820 | 2.2704 | 1.0 | 18.7773 | 0.4089 |
| 2.2114 | 27 | 1890 | 2.2668 | 1.0 | 17.9859 | 0.3746 |
| 2.1741 | 28 | 1960 | 2.2660 | 1.0 | 18.2620 | 0.3602 |
| 2.1549 | 29 | 2030 | 2.2597 | 1.0 | 21.2368 | 0.4359 |
| 2.1288 | 30 | 2100 | 2.2599 | 1.0 | 18.8151 | 0.4055 |
| 2.1125 | 31 | 2170 | 2.2589 | 1.0 | 19.2003 | 0.4461 |
| 2.1311 | 32 | 2240 | 2.2618 | 1.0 | 20.1962 | 0.4371 |
| 2.0864 | 33 | 2310 | 2.2626 | 1.0 | 19.6470 | 0.4133 |
| 2.0582 | 34 | 2380 | 2.2598 | 1.0 | 19.9388 | 0.4607 |
| 2.0529 | 35 | 2450 | 2.2564 | 1.0 | 19.9569 | 0.4630 |
| 2.0324 | 36 | 2520 | 2.2644 | 1.0 | 19.5040 | 0.4655 |
| 2.0196 | 37 | 2590 | 2.2605 | 1.0 | 19.8486 | 0.4757 |
| 2.0079 | 38 | 2660 | 2.2615 | 1.0 | 18.6603 | 0.4800 |
| 1.9872 | 39 | 2730 | 2.2537 | 1.0 | 18.4687 | 0.4825 |
| 1.9719 | 40 | 2800 | 2.2624 | 1.0 | 25.1243 | 0.5026 |
| 1.9544 | 41 | 2870 | 2.2596 | 1.0 | 21.2633 | 0.4623 |
| 1.956 | 42 | 2940 | 2.2690 | 1.0 | 20.1632 | 0.4609 |
| 1.9242 | 43 | 3010 | 2.2659 | 1.0 | 22.6192 | 0.5000 |
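
The Bleu values appear to be on a 0-1 scale. Below is a minimal sketch of scoring predictions with the evaluate library's bleu metric, which uses that scale; whether the training run computed BLEU this exact way is an assumption:

```python
import evaluate

# `evaluate`'s bleu metric reports scores in [0, 1], consistent with the
# table above (e.g. 0.5000 at the final epoch).
bleu = evaluate.load("bleu")
predictions = ["Dzień dobry."]   # model outputs (illustrative)
references = [["Dzień dobry."]]  # gold Polish references (illustrative)
print(bleu.compute(predictions=predictions, references=references)["bleu"])
```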

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1