# ea1838e5ab58a6875f95363307574c71
This model is a fine-tuned version of google/umt5-base on the fr-nl configuration of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set (a brief usage sketch follows the list):
- Loss: 1.7244
- Data Size: 1.0
- Epoch Runtime: 227.0987
- Bleu: 11.3478
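
As a quick orientation, here is a minimal inference sketch. It assumes the checkpoint is hosted in this repository (contemmcm/ea1838e5ab58a6875f95363307574c71) and that the model translates raw French input to Dutch without a task prefix; the exact preprocessing used during fine-tuning is not documented in this card.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumption: this repository id hosts the fine-tuned checkpoint.
model_id = "contemmcm/ea1838e5ab58a6875f95363307574c71"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Assumption: the model was trained on plain fr -> nl pairs, so no task prefix is added.
inputs = tokenizer("Le chat dort sur le canapé.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```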
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
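
Beyond the dataset name, the train/eval split used for this model is not documented. A hedged sketch of loading the fr-nl configuration with the datasets library is shown below; the 90/10 split is purely illustrative.

```python
from datasets import load_dataset

# Assumption: the fr-nl configuration of opus_books was used; the actual
# train/eval split for this model is not documented, so a 90/10 split is
# shown only for illustration.
raw = load_dataset("Helsinki-NLP/opus_books", "fr-nl")
splits = raw["train"].train_test_split(test_size=0.1, seed=42)

print(splits["train"][0]["translation"])  # {'fr': '...', 'nl': '...'}
```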
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
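
The list above can be mapped onto the transformers Trainer API roughly as follows. This is a reconstruction sketch, not the original training script: the output_dir and the predict_with_generate flag are assumptions, while the per-device batch size of 8 across 4 GPUs reproduces the total batch size of 32 reported above.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the hyperparameters listed above; output_dir is an assumed name.
training_args = Seq2SeqTrainingArguments(
    output_dir="umt5-base-opus-books-fr-nl",  # assumption
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # 4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,    # 4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # assumption: needed to compute BLEU during evaluation
)
```

Launched with 4 processes (e.g. via torchrun or accelerate), these per-device sizes match the total train/eval batch size of 32 listed above.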
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0.0 | 0 | 11.8501 | 0 | 18.7319 | 0.2531 |
| No log | 1.0 | 1000 | 11.4087 | 0.0078 | 20.4448 | 0.2484 |
| No log | 2.0 | 2000 | 10.4134 | 0.0156 | 22.6671 | 0.2688 |
| No log | 3.0 | 3000 | 9.1732 | 0.0312 | 26.5250 | 0.3036 |
| 0.4887 | 4.0 | 4000 | 6.6378 | 0.0625 | 34.0548 | 0.3509 |
| 5.6661 | 5.0 | 5000 | 3.6067 | 0.125 | 46.7000 | 5.2304 |
| 0.2394 | 6.0 | 6000 | 2.7806 | 0.25 | 72.6359 | 4.5796 |
| 0.2916 | 7.0 | 7000 | 2.4648 | 0.5 | 124.0546 | 6.0043 |
| 2.8366 | 8.0 | 8000 | 2.2451 | 1.0 | 228.4455 | 7.1941 |
| 2.6234 | 9.0 | 9000 | 2.1357 | 1.0 | 227.8906 | 7.8593 |
| 2.474 | 10.0 | 10000 | 2.0542 | 1.0 | 227.9562 | 8.3096 |
| 2.383 | 11.0 | 11000 | 1.9929 | 1.0 | 231.0268 | 8.7678 |
| 2.2796 | 12.0 | 12000 | 1.9503 | 1.0 | 228.0446 | 9.0394 |
| 2.2007 | 13.0 | 13000 | 1.9234 | 1.0 | 229.5092 | 9.3699 |
| 2.121 | 14.0 | 14000 | 1.8852 | 1.0 | 235.1929 | 9.6404 |
| 2.0367 | 15.0 | 15000 | 1.8570 | 1.0 | 235.1429 | 9.8445 |
| 2.017 | 16.0 | 16000 | 1.8366 | 1.0 | 232.0961 | 9.9998 |
| 1.9158 | 17.0 | 17000 | 1.8234 | 1.0 | 230.5636 | 10.1654 |
| 1.9261 | 18.0 | 18000 | 1.8024 | 1.0 | 232.6199 | 10.2223 |
| 1.8678 | 19.0 | 19000 | 1.7901 | 1.0 | 231.7953 | 10.3140 |
| 1.8087 | 20.0 | 20000 | 1.7833 | 1.0 | 231.0188 | 10.4957 |
| 1.7884 | 21.0 | 21000 | 1.7631 | 1.0 | 235.1509 | 10.6346 |
| 1.7503 | 22.0 | 22000 | 1.7554 | 1.0 | 232.4435 | 10.6599 |
| 1.7083 | 23.0 | 23000 | 1.7517 | 1.0 | 231.4336 | 10.7470 |
| 1.662 | 24.0 | 24000 | 1.7433 | 1.0 | 232.1989 | 10.7994 |
| 1.675 | 25.0 | 25000 | 1.7371 | 1.0 | 232.1738 | 10.9347 |
| 1.6014 | 26.0 | 26000 | 1.7306 | 1.0 | 229.3544 | 11.0011 |
| 1.5773 | 27.0 | 27000 | 1.7357 | 1.0 | 231.6688 | 11.0416 |
| 1.5321 | 28.0 | 28000 | 1.7332 | 1.0 | 230.5063 | 11.0622 |
| 1.5308 | 29.0 | 29000 | 1.7259 | 1.0 | 229.9135 | 11.1018 |
| 1.48 | 30.0 | 30000 | 1.7305 | 1.0 | 231.0126 | 11.0951 |
| 1.4994 | 31.0 | 31000 | 1.7169 | 1.0 | 229.2552 | 11.2487 |
| 1.4482 | 32.0 | 32000 | 1.7227 | 1.0 | 231.7054 | 11.1848 |
| 1.4089 | 33.0 | 33000 | 1.7237 | 1.0 | 228.0267 | 11.2684 |
| 1.3907 | 34.0 | 34000 | 1.7234 | 1.0 | 227.4852 | 11.2950 |
| 1.3779 | 35.0 | 35000 | 1.7244 | 1.0 | 227.0987 | 11.3478 |
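
The Bleu column above was presumably computed on the held-out evaluation set. A generic way to score generated translations with the sacreBLEU implementation in the evaluate library is sketched below; whether this exact configuration matches the one used during training is an assumption.

```python
import evaluate

# sacreBLEU via the evaluate library (scores are on a 0-100 scale).
bleu = evaluate.load("sacrebleu")

predictions = ["De kat slaapt op de bank."]    # model outputs (illustrative)
references = [["De kat slaapt op de sofa."]]   # one or more references per prediction

result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))
```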
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1