c72031b135eef954fc0610e6d8bc219e
This model is a fine-tuned version of google/umt5-small on the Helsinki-NLP/opus_books [en-ru] dataset. It achieves the following results on the evaluation set:
- Loss: 2.1574
- Data Size: 1.0
- Epoch Runtime: 71.1753
- Bleu: 7.5301
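For a rough sense of scale, the evaluation loss can be converted to a perplexity. This is a minimal sketch that assumes the reported loss is the mean per-token cross-entropy in nats, the usual convention for sequence-to-sequence models in Transformers; the card itself does not state this.

```python
import math

# Evaluation loss reported above.
# Assumption: mean per-token cross-entropy using the natural logarithm.
eval_loss = 2.1574

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(eval_loss)

print(f"perplexity: {perplexity:.2f}")
```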
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50
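The batch-size and step figures on this card can be cross-checked with simple arithmetic. The sketch below assumes no gradient accumulation, since none is listed among the hyperparameters; the final step count of 21850 is taken from the last row of the training results table.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 8   # per-device batch size
num_devices = 4
num_epochs = 50

# Assumption: no gradient accumulation (none is listed on this card).
gradient_accumulation_steps = 1

# Effective batch size per optimizer step across all devices.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
print(total_train_batch_size)  # 32

# The results table logs 21850 optimizer steps after 50 epochs.
final_step = 21850
steps_per_epoch = final_step // num_epochs
print(steps_per_epoch)  # 437

# Sequences covered by one 437-step epoch at this batch size
# (an upper bound if the last batch of the epoch is partial).
sequences_per_epoch = steps_per_epoch * total_train_batch_size
print(sequences_per_epoch)  # 13984
```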
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 16.9728 | 0 | 6.3246 | 0.2711 |
| No log | 1 | 437 | 16.2460 | 0.0078 | 6.9613 | 0.3604 |
| No log | 2 | 874 | 15.0040 | 0.0156 | 7.7570 | 0.4340 |
| No log | 3 | 1311 | 13.1679 | 0.0312 | 9.0722 | 0.4471 |
| No log | 4 | 1748 | 10.7075 | 0.0625 | 10.9963 | 0.5078 |
| 11.9865 | 5 | 2185 | 7.3649 | 0.125 | 15.0259 | 0.6189 |
| 8.482 | 6 | 2622 | 4.9792 | 0.25 | 22.9080 | 1.0091 |
| 5.5888 | 7 | 3059 | 3.8803 | 0.5 | 37.6092 | 3.9858 |
| 4.4134 | 8.0 | 3496 | 3.1619 | 1.0 | 69.0978 | 3.0325 |
| 3.9851 | 9.0 | 3933 | 2.8942 | 1.0 | 69.1926 | 3.8939 |
| 3.701 | 10.0 | 4370 | 2.7730 | 1.0 | 68.8371 | 4.2735 |
| 3.5848 | 11.0 | 4807 | 2.6992 | 1.0 | 68.6402 | 4.6192 |
| 3.4352 | 12.0 | 5244 | 2.6427 | 1.0 | 69.9948 | 4.8753 |
| 3.3044 | 13.0 | 5681 | 2.5969 | 1.0 | 69.2732 | 5.0178 |
| 3.2305 | 14.0 | 6118 | 2.5579 | 1.0 | 70.2969 | 5.1887 |
| 3.1587 | 15.0 | 6555 | 2.5228 | 1.0 | 69.0757 | 5.3367 |
| 3.0715 | 16.0 | 6992 | 2.4845 | 1.0 | 70.2242 | 5.5283 |
| 3.0601 | 17.0 | 7429 | 2.4683 | 1.0 | 69.8976 | 5.6228 |
| 2.9926 | 18.0 | 7866 | 2.4341 | 1.0 | 69.7541 | 5.8032 |
| 2.9095 | 19.0 | 8303 | 2.4180 | 1.0 | 69.9274 | 5.9151 |
| 2.9047 | 20.0 | 8740 | 2.4011 | 1.0 | 70.2386 | 5.9841 |
| 2.8405 | 21.0 | 9177 | 2.3805 | 1.0 | 69.6005 | 6.0747 |
| 2.7902 | 22.0 | 9614 | 2.3596 | 1.0 | 70.0786 | 6.1834 |
| 2.7556 | 23.0 | 10051 | 2.3500 | 1.0 | 70.0480 | 6.2457 |
| 2.7298 | 24.0 | 10488 | 2.3324 | 1.0 | 69.6931 | 6.3249 |
| 2.7325 | 25.0 | 10925 | 2.3186 | 1.0 | 72.4412 | 6.4368 |
| 2.689 | 26.0 | 11362 | 2.3087 | 1.0 | 69.4702 | 6.4923 |
| 2.6729 | 27.0 | 11799 | 2.3005 | 1.0 | 70.1461 | 6.5329 |
| 2.5953 | 28.0 | 12236 | 2.2843 | 1.0 | 70.2789 | 6.6074 |
| 2.5617 | 29.0 | 12673 | 2.2735 | 1.0 | 70.2161 | 6.6736 |
| 2.531 | 30.0 | 13110 | 2.2674 | 1.0 | 69.6573 | 6.6912 |
| 2.5226 | 31.0 | 13547 | 2.2657 | 1.0 | 70.6520 | 6.7476 |
| 2.5702 | 32.0 | 13984 | 2.2580 | 1.0 | 70.1485 | 6.8362 |
| 2.5027 | 33.0 | 14421 | 2.2420 | 1.0 | 70.7487 | 6.8910 |
| 2.483 | 34.0 | 14858 | 2.2354 | 1.0 | 70.1335 | 6.8995 |
| 2.4384 | 35.0 | 15295 | 2.2254 | 1.0 | 70.7653 | 7.0147 |
| 2.416 | 36.0 | 15732 | 2.2219 | 1.0 | 70.0677 | 7.0343 |
| 2.3831 | 37.0 | 16169 | 2.2098 | 1.0 | 70.6963 | 7.0994 |
| 2.3674 | 38.0 | 16606 | 2.2080 | 1.0 | 70.6031 | 7.0817 |
| 2.3575 | 39.0 | 17043 | 2.2006 | 1.0 | 70.4574 | 7.1965 |
| 2.3313 | 40.0 | 17480 | 2.1937 | 1.0 | 70.7618 | 7.1884 |
| 2.3405 | 41.0 | 17917 | 2.1935 | 1.0 | 69.9549 | 7.2435 |
| 2.2925 | 42.0 | 18354 | 2.1881 | 1.0 | 70.2700 | 7.2781 |
| 2.298 | 43.0 | 18791 | 2.1820 | 1.0 | 69.4775 | 7.3183 |
| 2.2471 | 44.0 | 19228 | 2.1749 | 1.0 | 70.4325 | 7.3587 |
| 2.2876 | 45.0 | 19665 | 2.1698 | 1.0 | 70.3586 | 7.4095 |
| 2.2149 | 46.0 | 20102 | 2.1743 | 1.0 | 70.6157 | 7.4523 |
| 2.2064 | 47.0 | 20539 | 2.1638 | 1.0 | 69.7505 | 7.4806 |
| 2.1987 | 48.0 | 20976 | 2.1674 | 1.0 | 70.9113 | 7.4668 |
| 2.1447 | 49.0 | 21413 | 2.1589 | 1.0 | 70.1095 | 7.5387 |
| 2.1704 | 50.0 | 21850 | 2.1574 | 1.0 | 71.1753 | 7.5301 |
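To compare checkpoints, the table rows can be parsed and ranked by metric. The following is a small sketch using only the last five rows, copied verbatim from the table; `parse_rows` is a hypothetical helper written for this illustration and is not part of any library.

```python
# Last five rows of the results table above, copied verbatim.
ROWS = """
| 2.2149 | 46.0 | 20102 | 2.1743 | 1.0 | 70.6157 | 7.4523 |
| 2.2064 | 47.0 | 20539 | 2.1638 | 1.0 | 69.7505 | 7.4806 |
| 2.1987 | 48.0 | 20976 | 2.1674 | 1.0 | 70.9113 | 7.4668 |
| 2.1447 | 49.0 | 21413 | 2.1589 | 1.0 | 70.1095 | 7.5387 |
| 2.1704 | 50.0 | 21850 | 2.1574 | 1.0 | 71.1753 | 7.5301 |
"""


def parse_rows(text):
    """Turn markdown table rows into dicts with epoch, validation loss, and BLEU."""
    records = []
    for line in text.strip().splitlines():
        # Drop the outer pipes, then split the remaining cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        records.append({
            "epoch": int(float(cells[1])),
            "val_loss": float(cells[3]),
            "bleu": float(cells[6]),
        })
    return records


records = parse_rows(ROWS)
best_bleu = max(records, key=lambda r: r["bleu"])
best_loss = min(records, key=lambda r: r["val_loss"])

print(best_bleu["epoch"], best_bleu["bleu"])      # 49 7.5387
print(best_loss["epoch"], best_loss["val_loss"])  # 50 2.1574
```

The headline metrics at the top of this card match the epoch-50 row, which has the lowest validation loss; BLEU was marginally higher at epoch 49 (7.5387 versus 7.5301).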
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
Model tree for contemmcm/c72031b135eef954fc0610e6d8bc219e
Base model
google/umt5-small