char-text-reversal

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set (a sketch of how these metrics can be computed follows the list):

  • Loss: 1.0816
  • Char Accuracy: 0.0065
  • Sequence Accuracy: 0.0
  • Edit Distance: 38.583
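
The card does not document the task or the metric implementations, so the following is only an illustrative sketch: it assumes (from the model name) that the task is character-level text reversal, and that Char Accuracy, Sequence Accuracy, and Edit Distance are the usual per-character match rate, exact-match rate, and Levenshtein distance between prediction and reference strings.

```python
# Illustrative sketch of the three evaluation metrics above (assumed definitions;
# the exact implementation used for this card is not published).

def char_accuracy(pred: str, ref: str) -> float:
    """Fraction of aligned positions where characters match, normalized by reference length."""
    if not ref:
        return 0.0
    return sum(p == r for p, r in zip(pred, ref)) / len(ref)

def sequence_accuracy(pred: str, ref: str) -> float:
    """1.0 only when the prediction matches the reference exactly."""
    return float(pred == ref)

def edit_distance(pred: str, ref: str) -> int:
    """Levenshtein distance computed with a single-row dynamic program."""
    m, n = len(pred), len(ref)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            cost = 0 if pred[i - 1] == ref[j - 1] else 1
            dp[j] = min(dp[j] + 1,      # deletion
                        dp[j - 1] + 1,  # insertion
                        prev + cost)    # substitution
            prev = cur
    return dp[n]

# Assumed task: map a string to its character-reversed form.
ref = "hello world"[::-1]               # "dlrow olleh"
pred = "dlrow olleh"
print(char_accuracy(pred, ref))         # 1.0
print(sequence_accuracy(pred, ref))     # 1.0
print(edit_distance(pred, ref))         # 0
```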

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after the list):

  • learning_rate: 0.0001
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 42
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 1000
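
These settings map directly onto the Hugging Face Trainer API. A minimal sketch, assuming single-device training (so the listed batch sizes are per-device values) and an assumed output_dir; the actual training script is not published:

```python
from transformers import TrainingArguments

# Hyperparameters from the list above, expressed as TrainingArguments
# (field names per Transformers 4.55; output_dir is an assumption).
training_args = TrainingArguments(
    output_dir="char-text-reversal",   # assumed name
    learning_rate=1e-4,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=1000,
)
```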

Training results

| Training Loss | Epoch | Step | Validation Loss | Char Accuracy | Sequence Accuracy | Edit Distance |
|:-------------:|:-----:|:----:|:---------------:|:-------------:|:-----------------:|:-------------:|
| 4.1717 | 1.0 | 79 | 3.7332 | 0.0319 | 0.0 | 131.1735 |
| 3.4932 | 2.0 | 158 | 3.2892 | 0.0011 | 0.0 | 129.146 |
| 3.1822 | 3.0 | 237 | 3.0756 | 0.0 | 0.0 | 125.971 |
| 3.0081 | 4.0 | 316 | 2.9370 | 0.0 | 0.0 | 122.952 |
| 2.8946 | 5.0 | 395 | 2.8457 | 0.0 | 0.0 | 122.085 |
| 2.8162 | 6.0 | 474 | 2.7805 | 0.0000 | 0.0 | 121.204 |
| 2.7578 | 7.0 | 553 | 2.7284 | 0.0000 | 0.0 | 120.8485 |
| 2.7107 | 8.0 | 632 | 2.6850 | 0.0 | 0.0 | 120.5575 |
| 2.6695 | 9.0 | 711 | 2.6455 | 0.0000 | 0.0 | 120.3835 |
| 2.632 | 10.0 | 790 | 2.6074 | 0.0000 | 0.0 | 120.0615 |
| 2.5971 | 11.0 | 869 | 2.5695 | 0.0001 | 0.0 | 117.5055 |
| 2.5649 | 12.0 | 948 | 2.5360 | 0.0002 | 0.0 | 108.9205 |
| 2.5353 | 13.0 | 1027 | 2.5029 | 0.0003 | 0.0 | 95.6955 |
| 2.506 | 14.0 | 1106 | 2.4734 | 0.0004 | 0.0 | 85.586 |
| 2.4807 | 15.0 | 1185 | 2.4449 | 0.0005 | 0.0 | 77.844 |
| 2.455 | 16.0 | 1264 | 2.4136 | 0.0011 | 0.0 | 73.5575 |
| 2.4185 | 17.0 | 1343 | 2.3630 | 0.0013 | 0.0 | 68.8565 |
| 2.371 | 18.0 | 1422 | 2.2994 | 0.0024 | 0.0 | 64.448 |
| 2.3213 | 19.0 | 1501 | 2.2370 | 0.0027 | 0.0 | 62.501 |
| 2.2707 | 20.0 | 1580 | 2.1751 | 0.0039 | 0.0 | 59.419 |
| 2.227 | 21.0 | 1659 | 2.1183 | 0.0037 | 0.0 | 58.6405 |
| 2.182 | 22.0 | 1738 | 2.0610 | 0.0043 | 0.0 | 56.788 |
| 2.1396 | 23.0 | 1817 | 2.0002 | 0.0044 | 0.0 | 55.4195 |
| 2.0969 | 24.0 | 1896 | 1.9433 | 0.0046 | 0.0 | 54.239 |
| 2.0581 | 25.0 | 1975 | 1.8935 | 0.0046 | 0.0 | 52.833 |
| 2.025 | 26.0 | 2054 | 1.8459 | 0.0037 | 0.0 | 51.9935 |
| 1.9885 | 27.0 | 2133 | 1.7941 | 0.0043 | 0.0 | 50.6845 |
| 1.9587 | 28.0 | 2212 | 1.7568 | 0.0042 | 0.0 | 49.62 |
| 1.93 | 29.0 | 2291 | 1.7101 | 0.0047 | 0.0 | 48.6285 |
| 1.8983 | 30.0 | 2370 | 1.6641 | 0.0050 | 0.0 | 47.612 |
| 1.8693 | 31.0 | 2449 | 1.6341 | 0.0054 | 0.0 | 46.9725 |
| 1.8421 | 32.0 | 2528 | 1.5895 | 0.0049 | 0.0 | 46.026 |
| 1.8157 | 33.0 | 2607 | 1.5549 | 0.0057 | 0.0 | 45.169 |
| 1.7952 | 34.0 | 2686 | 1.5340 | 0.0058 | 0.0 | 44.602 |
| 1.7736 | 35.0 | 2765 | 1.4917 | 0.0065 | 0.0 | 43.823 |
| 1.7483 | 36.0 | 2844 | 1.4561 | 0.0055 | 0.0 | 43.098 |
| 1.7218 | 37.0 | 2923 | 1.4206 | 0.0071 | 0.0 | 42.265 |
| 1.6995 | 38.0 | 3002 | 1.3885 | 0.0065 | 0.0 | 41.419 |
| 1.6819 | 39.0 | 3081 | 1.3714 | 0.0057 | 0.0 | 41.078 |
| 1.6641 | 40.0 | 3160 | 1.3450 | 0.0066 | 0.0 | 40.324 |
| 1.6437 | 41.0 | 3239 | 1.3164 | 0.0053 | 0.0 | 39.8805 |
| 1.6198 | 42.0 | 3318 | 1.2894 | 0.0050 | 0.0 | 39.559 |
| 1.6045 | 43.0 | 3397 | 1.2686 | 0.0060 | 0.0 | 39.1475 |
| 1.5891 | 44.0 | 3476 | 1.2373 | 0.0069 | 0.0 | 38.3675 |
| 1.5774 | 45.0 | 3555 | 1.2252 | 0.0058 | 0.0 | 38.3125 |
| 1.5608 | 46.0 | 3634 | 1.2069 | 0.0056 | 0.0 | 38.1185 |
| 1.5488 | 47.0 | 3713 | 1.1713 | 0.0062 | 0.0 | 37.791 |
| 1.5265 | 48.0 | 3792 | 1.1443 | 0.0073 | 0.0 | 38.148 |
| 1.5095 | 49.0 | 3871 | 1.1223 | 0.0060 | 0.0 | 38.268 |
| 1.4939 | 50.0 | 3950 | 1.0998 | 0.0066 | 0.0 | 38.621 |
| 1.4799 | 51.0 | 4029 | 1.0816 | 0.0065 | 0.0 | 38.583 |

Framework versions

  • Transformers 4.55.4
  • Pytorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4