e14dfac9d43f01218619eac3fca7ca9c

This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books [de-fr] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4196
  • Data Size: 1.0
  • Epoch Runtime: 205.8362
  • Bleu: 9.3789
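As a rough usage sketch (the Hub repo id and the T5 task prefix below are assumptions; the card does not document inference usage or how inputs were formatted during training), the checkpoint can be loaded with the `transformers` pipeline API:

```python
MODEL_ID = "contemmcm/e14dfac9d43f01218619eac3fca7ca9c"  # assumed Hub repo id

def build_input(text: str) -> str:
    # T5 is a text-to-text model; translation fine-tunes commonly prepend
    # a task prefix. The exact prefix used in training is an assumption.
    return "translate German to French: " + text

def translate(text: str) -> str:
    # Lazy import so the prefix helper stays usable without transformers.
    from transformers import pipeline
    translator = pipeline("translation", model=MODEL_ID)
    return translator(build_input(text))[0]["translation_text"]
```

For example, `translate("Das Buch liegt auf dem Tisch.")` downloads the checkpoint on first use and returns the French translation string.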

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
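The hyperparameters above map onto a `Seq2SeqTrainingArguments` configuration roughly as follows. This is a config-fragment sketch, not the training script actually used: `output_dir` is a placeholder, and only the values listed above are taken from the card.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch reproducing the listed hyperparameters; output_dir is hypothetical.
args = Seq2SeqTrainingArguments(
    output_dir="t5-base-opus-books-de-fr",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=50,
)
# With 4 devices (multi-GPU), the effective batch size is 8 * 4 = 32 for
# both training and evaluation, matching the totals listed above.
```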

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0     | 2.3367          | 0         | 16.6961       | 4.0341 |
| No log        | 1     | 872   | 2.2019          | 0.0078    | 17.1484       | 4.6135 |
| No log        | 2     | 1744  | 2.1132          | 0.0156    | 18.7268       | 4.8180 |
| 0.0377        | 3     | 2616  | 2.0425          | 0.0312    | 21.9222       | 4.8406 |
| 0.1431        | 4     | 3488  | 1.9730          | 0.0625    | 28.4409       | 5.2560 |
| 2.1816        | 5     | 4360  | 1.9060          | 0.125     | 40.2414       | 5.1929 |
| 2.0629        | 6     | 5232  | 1.8339          | 0.25      | 62.2581       | 5.8377 |
| 1.9233        | 7     | 6104  | 1.7445          | 0.5       | 109.0056      | 6.3272 |
| 1.8188        | 8     | 6976  | 1.6562          | 1.0       | 203.7760      | 6.9404 |
| 1.7321        | 9     | 7848  | 1.6029          | 1.0       | 205.9968      | 7.5055 |
| 1.6903        | 10    | 8720  | 1.5617          | 1.0       | 215.5350      | 7.8362 |
| 1.626         | 11    | 9592  | 1.5304          | 1.0       | 211.5992      | 7.9350 |
| 1.5501        | 12    | 10464 | 1.5096          | 1.0       | 217.1961      | 8.2531 |
| 1.5287        | 13    | 11336 | 1.4904          | 1.0       | 216.2585      | 8.4558 |
| 1.4598        | 14    | 12208 | 1.4758          | 1.0       | 210.5595      | 8.5746 |
| 1.4347        | 15    | 13080 | 1.4656          | 1.0       | 203.0477      | 8.5464 |
| 1.4006        | 16    | 13952 | 1.4533          | 1.0       | 198.8969      | 8.9832 |
| 1.3355        | 17    | 14824 | 1.4463          | 1.0       | 206.0515      | 9.0535 |
| 1.3292        | 18    | 15696 | 1.4374          | 1.0       | 205.1314      | 8.9567 |
| 1.3087        | 19    | 16568 | 1.4313          | 1.0       | 212.4399      | 9.1311 |
| 1.2654        | 20    | 17440 | 1.4274          | 1.0       | 200.9783      | 9.2130 |
| 1.2788        | 21    | 18312 | 1.4221          | 1.0       | 202.8125      | 9.2160 |
| 1.241         | 22    | 19184 | 1.4193          | 1.0       | 208.9738      | 9.3522 |
| 1.1823        | 23    | 20056 | 1.4198          | 1.0       | 207.6572      | 9.2450 |
| 1.1673        | 24    | 20928 | 1.4205          | 1.0       | 202.4762      | 9.2619 |
| 1.1537        | 25    | 21800 | 1.4252          | 1.0       | 211.7916      | 9.3895 |
| 1.1538        | 26    | 22672 | 1.4196          | 1.0       | 205.8362      | 9.3789 |
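A quick consistency check on the table (arithmetic only; the dataset size is inferred from the table, not stated by the card): the step counter advances by a fixed 872 optimizer steps per epoch, which at the effective batch size of 32 corresponds to roughly 27.9k training examples processed per epoch.

```python
# Steps per epoch inferred from the table (e.g. step 6976 at epoch 8).
steps_per_epoch = 6976 // 8      # matches the constant per-epoch increment
effective_batch = 8 * 4          # per-device train batch * num_devices
examples_per_epoch = steps_per_epoch * effective_batch
print(steps_per_epoch, effective_batch, examples_per_epoch)  # 872 32 27904
```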

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
Model details

  • Model size: 0.3B params (Safetensors)
  • Tensor type: F32
