7132eac5f8a3d007b76b88db764976c7

This model is a fine-tuned version of google/mt5-base on the Helsinki-NLP/opus_books [de-nl] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.0901
  • Data Size: 1.0 (fraction of the training set used)
  • Epoch Runtime: 81.3317 seconds
  • BLEU: 7.8459
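
The BLEU score above comes from the evaluation loop (most cards compute it with sacrebleu). As a rough illustration of what the metric measures — not the exact implementation used here — the following is a simplified sentence-level sketch: the geometric mean of modified n-gram precisions (n = 1..4) times a brevity penalty, scaled to 0–100.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU on whitespace-tokenized strings."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Modified precision: clip each n-gram count by its count in the reference.
        overlap = sum((cand_counts & ref_counts).values())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: only penalize candidates shorter than the reference.
    bp = math.exp(min(0.0, 1 - len(ref) / len(cand)))
    return 100 * bp * math.exp(log_avg)
```

Production scoring should still use sacrebleu, which adds its own tokenization and corpus-level aggregation; this sketch only conveys the shape of the metric behind the 7.8459 figure.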

Model description

More information needed

Intended uses & limitations

More information needed
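
Pending fuller documentation, a minimal inference sketch (assumptions: the checkpoint id is taken from the card title, and no task prefix is prepended — whether the fine-tune expects a prompt such as a "translate German to Dutch:" prefix depends on how the training data was preprocessed, which is not documented here):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Checkpoint id assumed from the card title; swap in a local path if needed.
model_id = "contemmcm/7132eac5f8a3d007b76b88db764976c7"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# German source sentence; the model was fine-tuned for de -> nl translation.
inputs = tokenizer("Der Hund läuft im Park.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```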

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
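
The total batch sizes above follow from the per-device batch size and the device count; a quick sanity check (gradient accumulation assumed to be 1, since the card does not list it):

```python
# Effective batch size under multi-GPU data parallelism:
# per-device batch size x number of devices x gradient accumulation steps.
train_batch_size = 8              # per device, from the card
num_devices = 4                   # from the card
gradient_accumulation_steps = 1   # assumption: not listed on the card

total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
print(total_train_batch_size)  # matches the card's total_train_batch_size of 32
```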

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | BLEU |
|--------------:|------:|-----:|----------------:|----------:|------------------:|-----:|
| No log | 0 | 0 | 15.2536 | 0 | 7.2985 | 0.0211 |
| No log | 1 | 390 | 15.6662 | 0.0078 | 8.9328 | 0.0211 |
| No log | 2 | 780 | 13.8073 | 0.0156 | 8.9606 | 0.0190 |
| No log | 3 | 1170 | 13.4585 | 0.0312 | 11.2179 | 0.0221 |
| No log | 4 | 1560 | 11.7099 | 0.0625 | 14.1549 | 0.0137 |
| 0.7843 | 5 | 1950 | 9.2046 | 0.125 | 20.1830 | 0.0089 |
| 1.4287 | 6 | 2340 | 4.9232 | 0.25 | 29.3691 | 0.0543 |
| 4.054 | 7 | 2730 | 2.9184 | 0.5 | 46.5163 | 2.9863 |
| 3.1519 | 8 | 3120 | 2.5208 | 1.0 | 85.5254 | 4.8923 |
| 2.9129 | 9 | 3510 | 2.4149 | 1.0 | 86.1225 | 5.4005 |
| 2.7849 | 10 | 3900 | 2.3566 | 1.0 | 83.9788 | 5.6694 |
| 2.6958 | 11 | 4290 | 2.3041 | 1.0 | 82.6600 | 5.9535 |
| 2.5942 | 12 | 4680 | 2.2601 | 1.0 | 83.0294 | 6.2272 |
| 2.5384 | 13 | 5070 | 2.2363 | 1.0 | 82.6532 | 6.3636 |
| 2.4667 | 14 | 5460 | 2.2089 | 1.0 | 82.2912 | 6.5239 |
| 2.3524 | 15 | 5850 | 2.1965 | 1.0 | 82.5349 | 6.6398 |
| 2.3166 | 16 | 6240 | 2.1625 | 1.0 | 81.8239 | 6.7495 |
| 2.2953 | 17 | 6630 | 2.1555 | 1.0 | 81.9901 | 6.8783 |
| 2.2332 | 18 | 7020 | 2.1440 | 1.0 | 83.2302 | 7.0288 |
| 2.1893 | 19 | 7410 | 2.1239 | 1.0 | 83.5916 | 7.1203 |
| 2.1559 | 20 | 7800 | 2.1173 | 1.0 | 81.5685 | 7.1722 |
| 2.1161 | 21 | 8190 | 2.1141 | 1.0 | 81.6963 | 7.3281 |
| 2.043 | 22 | 8580 | 2.1068 | 1.0 | 82.8431 | 7.4121 |
| 2.0294 | 23 | 8970 | 2.1009 | 1.0 | 81.9595 | 7.4750 |
| 1.982 | 24 | 9360 | 2.0951 | 1.0 | 80.4355 | 7.5181 |
| 1.9439 | 25 | 9750 | 2.0982 | 1.0 | 81.8319 | 7.5364 |
| 1.9049 | 26 | 10140 | 2.1002 | 1.0 | 82.3029 | 7.6096 |
| 1.8619 | 27 | 10530 | 2.0860 | 1.0 | 81.8496 | 7.6852 |
| 1.8604 | 28 | 10920 | 2.0850 | 1.0 | 81.8561 | 7.7166 |
| 1.8383 | 29 | 11310 | 2.0821 | 1.0 | 81.0283 | 7.7468 |
| 1.796 | 30 | 11700 | 2.0823 | 1.0 | 82.0984 | 7.7893 |
| 1.7357 | 31 | 12090 | 2.0848 | 1.0 | 81.7833 | 7.8345 |
| 1.707 | 32 | 12480 | 2.0905 | 1.0 | 82.0953 | 7.8734 |
| 1.7016 | 33 | 12870 | 2.0901 | 1.0 | 81.3317 | 7.8459 |

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1