chess-model-output

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.3561
  • Accuracy: 0.4705
  • Top5 Accuracy: 0.8328
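
For context, top-5 accuracy counts a prediction as correct when the true label is among the model's five highest-scoring outputs. A minimal sketch of how both metrics can be computed from raw logits (the function name and shapes here are illustrative, not taken from this repository):

```python
import numpy as np

def accuracy_metrics(logits: np.ndarray, labels: np.ndarray, k: int = 5):
    """Compute top-1 and top-k accuracy from raw logits.

    logits: (n_samples, n_classes); labels: (n_samples,)
    """
    # Highest-scoring class per sample (top-1 prediction).
    top1 = logits.argmax(axis=-1)
    # Indices of the k highest-scoring classes per sample.
    topk = np.argsort(logits, axis=-1)[:, -k:]
    acc = (top1 == labels).mean()
    # A sample counts as a top-k hit if its label appears anywhere in its top-k set.
    topk_acc = (topk == labels[:, None]).any(axis=-1).mean()
    return acc, topk_acc
```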

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 256
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 0.05
  • num_epochs: 8
  • mixed_precision_training: Native AMP
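
Note that the effective batch size is train_batch_size × gradient_accumulation_steps = 32 × 8 = 256, matching the reported total_train_batch_size. A hedged sketch of the equivalent `TrainingArguments` configuration (illustrative — the actual training script for this model is not published, and the `output_dir` and the warmup interpretation are assumptions):

```python
from transformers import TrainingArguments

# Illustrative reconstruction of the reported hyperparameters;
# not the actual training script for this model.
args = TrainingArguments(
    output_dir="chess-model-output",   # assumed
    learning_rate=2e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=8,     # effective batch size: 32 * 8 = 256
    seed=42,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,                 # the card lists 0.05 "warmup_steps"; a fractional value suggests a ratio
    num_train_epochs=8,
    fp16=True,                         # "Native AMP" mixed precision
)
```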

Training results

| Training Loss | Epoch  | Step   | Validation Loss | Accuracy | Top5 Accuracy |
|:-------------:|:------:|:------:|:---------------:|:--------:|:-------------:|
| 51.2238       | 0.0674 | 2000   | 6.3411          | 0.0442   | 0.1203        |
| 41.3822       | 0.1347 | 4000   | 4.9888          | 0.0790   | 0.2743        |
| 36.6208       | 0.2021 | 6000   | 4.3841          | 0.0978   | 0.3289        |
| 34.1621       | 0.2695 | 8000   | 4.0865          | 0.1300   | 0.3881        |
| 32.5538       | 0.3368 | 10000  | 3.9081          | 0.1516   | 0.4229        |
| 31.4295       | 0.4042 | 12000  | 3.7777          | 0.173    | 0.4589        |
| 30.6596       | 0.4716 | 14000  | 3.6560          | 0.1900   | 0.4899        |
| 29.9171       | 0.5389 | 16000  | 3.5772          | 0.205    | 0.5101        |
| 29.3951       | 0.6063 | 18000  | 3.5026          | 0.2210   | 0.5327        |
| 28.8304       | 0.6737 | 20000  | 3.4387          | 0.2341   | 0.5535        |
| 28.4445       | 0.7411 | 22000  | 3.3696          | 0.2465   | 0.5693        |
| 27.8682       | 0.8084 | 24000  | 3.3363          | 0.2544   | 0.5802        |
| 27.7150       | 0.8758 | 26000  | 3.2787          | 0.2633   | 0.5974        |
| 27.2758       | 0.9432 | 28000  | 3.2303          | 0.2722   | 0.6119        |
| 26.6353       | 1.0105 | 30000  | 3.2009          | 0.2793   | 0.6202        |
| 26.6178       | 1.0779 | 32000  | 3.1678          | 0.2846   | 0.6291        |
| 26.2986       | 1.1452 | 34000  | 3.1411          | 0.2933   | 0.6367        |
| 26.1541       | 1.2126 | 36000  | 3.1127          | 0.2990   | 0.638         |
| 25.9314       | 1.2800 | 38000  | 3.0818          | 0.3054   | 0.6509        |
| 25.7470       | 1.3474 | 40000  | 3.0629          | 0.3060   | 0.6562        |
| 25.5577       | 1.4147 | 42000  | 3.0334          | 0.3150   | 0.6666        |
| 25.3361       | 1.4821 | 44000  | 3.0070          | 0.3235   | 0.6729        |
| 25.2382       | 1.5495 | 46000  | 2.9805          | 0.332    | 0.6769        |
| 25.0611       | 1.6168 | 48000  | 2.9525          | 0.3352   | 0.6838        |
| 24.8286       | 1.6842 | 50000  | 2.9375          | 0.3399   | 0.6879        |
| 24.6659       | 1.7516 | 52000  | 2.9056          | 0.3457   | 0.6962        |
| 24.5298       | 1.8189 | 54000  | 2.8906          | 0.3489   | 0.7036        |
| 24.2796       | 1.8863 | 56000  | 2.8683          | 0.3487   | 0.7122        |
| 24.1826       | 1.9537 | 58000  | 2.8424          | 0.3579   | 0.7151        |
| 23.6416       | 2.0210 | 60000  | 2.8303          | 0.3582   | 0.7207        |
| 23.5601       | 2.0884 | 62000  | 2.8129          | 0.3681   | 0.7257        |
| 23.4484       | 2.1558 | 64000  | 2.7928          | 0.3678   | 0.7295        |
| 23.3427       | 2.2231 | 66000  | 2.7726          | 0.3758   | 0.7349        |
| 23.1908       | 2.2905 | 68000  | 2.7648          | 0.3707   | 0.739         |
| 23.2677       | 2.3579 | 70000  | 2.7461          | 0.3759   | 0.7433        |
| 23.0687       | 2.4252 | 72000  | 2.7335          | 0.381    | 0.7481        |
| 22.9007       | 2.4926 | 74000  | 2.7178          | 0.3874   | 0.7491        |
| 22.8318       | 2.5600 | 76000  | 2.6960          | 0.3880   | 0.7552        |
| 22.7372       | 2.6273 | 78000  | 2.6794          | 0.3906   | 0.7616        |
| 22.6547       | 2.6947 | 80000  | 2.6676          | 0.3887   | 0.7634        |
| 22.5251       | 2.7621 | 82000  | 2.6484          | 0.3987   | 0.7651        |
| 22.4387       | 2.8294 | 84000  | 2.6389          | 0.4015   | 0.7685        |
| 22.2422       | 2.8968 | 86000  | 2.6314          | 0.4033   | 0.7684        |
| 22.2428       | 2.9642 | 88000  | 2.6147          | 0.4073   | 0.7753        |
| 21.6002       | 3.0315 | 90000  | 2.6059          | 0.4108   | 0.7766        |
| 22.3475       | 3.0989 | 92000  | 2.6969          | 0.3895   | 0.7555        |
| 22.4535       | 3.1663 | 94000  | 2.6879          | 0.3946   | 0.7572        |
| 22.3754       | 3.2336 | 96000  | 2.6955          | 0.3931   | 0.7586        |
| 22.3138       | 3.3010 | 98000  | 2.6783          | 0.3934   | 0.756         |
| 22.3112       | 3.3684 | 100000 | 2.6752          | 0.3945   | 0.7605        |
| 22.2399       | 3.4357 | 102000 | 2.6553          | 0.3963   | 0.7692        |
| 22.1860       | 3.5031 | 104000 | 2.6482          | 0.4018   | 0.7688        |
| 22.1966       | 3.5705 | 106000 | 2.6313          | 0.3974   | 0.7714        |
| 22.1117       | 3.6378 | 108000 | 2.6443          | 0.3982   | 0.7692        |
| 22.0962       | 3.7052 | 110000 | 2.6165          | 0.406    | 0.7763        |
| 21.9232       | 3.7726 | 112000 | 2.6135          | 0.4081   | 0.776         |
| 21.9250       | 3.8399 | 114000 | 2.6073          | 0.4099   | 0.7832        |
| 21.8086       | 3.9073 | 116000 | 2.6034          | 0.4056   | 0.7791        |
| 21.8148       | 3.9747 | 118000 | 2.5902          | 0.4088   | 0.778         |
| 21.1899       | 4.0420 | 120000 | 2.5835          | 0.4101   | 0.7834        |
| 21.2872       | 4.1094 | 122000 | 2.5744          | 0.413    | 0.7842        |
| 21.2121       | 4.1768 | 124000 | 2.5634          | 0.4115   | 0.7883        |
| 21.1537       | 4.2441 | 126000 | 2.5567          | 0.4207   | 0.7898        |
| 21.1143       | 4.3115 | 128000 | 2.5476          | 0.4237   | 0.7928        |
| 21.1197       | 4.3789 | 130000 | 2.5450          | 0.4199   | 0.7907        |
| 21.0866       | 4.4462 | 132000 | 2.5373          | 0.4245   | 0.7927        |
| 21.0555       | 4.5136 | 134000 | 2.5237          | 0.4268   | 0.7954        |
| 20.9838       | 4.5810 | 136000 | 2.5196          | 0.4269   | 0.7989        |
| 20.9613       | 4.6484 | 138000 | 2.5114          | 0.4286   | 0.8029        |
| 20.8328       | 4.7157 | 140000 | 2.5009          | 0.4317   | 0.7982        |
| 20.8385       | 4.7831 | 142000 | 2.4929          | 0.4336   | 0.8071        |
| 20.7476       | 4.8505 | 144000 | 2.4884          | 0.4335   | 0.8049        |
| 20.7141       | 4.9178 | 146000 | 2.4841          | 0.4333   | 0.8073        |
| 20.6954       | 4.9852 | 148000 | 2.4720          | 0.4348   | 0.8089        |
| 20.1964       | 5.0525 | 150000 | 2.4757          | 0.4417   | 0.8084        |
| 20.1919       | 5.1199 | 152000 | 2.4668          | 0.4421   | 0.8105        |
| 20.2029       | 5.1873 | 154000 | 2.4591          | 0.4435   | 0.8104        |
| 20.1102       | 5.2547 | 156000 | 2.4589          | 0.4404   | 0.8128        |
| 20.0693       | 5.3220 | 158000 | 2.4492          | 0.4432   | 0.8167        |
| 20.1529       | 5.3894 | 160000 | 2.4448          | 0.4465   | 0.8172        |
| 20.0076       | 5.4568 | 162000 | 2.4388          | 0.4483   | 0.8172        |
| 20.0621       | 5.5241 | 164000 | 2.4324          | 0.4468   | 0.8187        |
| 20.0232       | 5.5915 | 166000 | 2.4288          | 0.4476   | 0.822         |
| 19.9641       | 5.6589 | 168000 | 2.4208          | 0.4516   | 0.8217        |
| 20.0099       | 5.7262 | 170000 | 2.4136          | 0.4518   | 0.8183        |
| 19.9398       | 5.7936 | 172000 | 2.4098          | 0.4502   | 0.8216        |
| 19.9230       | 5.8610 | 174000 | 2.4090          | 0.4525   | 0.8206        |
| 19.8980       | 5.9283 | 176000 | 2.4040          | 0.4551   | 0.8255        |
| 19.8680       | 5.9957 | 178000 | 2.4050          | 0.4561   | 0.8256        |
| 19.3353       | 6.0631 | 180000 | 2.3990          | 0.4552   | 0.8251        |
| 19.4134       | 6.1304 | 182000 | 2.3977          | 0.4575   | 0.8259        |
| 19.3965       | 6.1978 | 184000 | 2.3921          | 0.4562   | 0.8238        |
| 19.4687       | 6.2652 | 186000 | 2.3889          | 0.4543   | 0.8305        |
| 19.3202       | 6.3325 | 188000 | 2.3877          | 0.4579   | 0.8285        |
| 19.3963       | 6.3999 | 190000 | 2.3822          | 0.4603   | 0.828         |
| 19.2950       | 6.4673 | 192000 | 2.3776          | 0.4597   | 0.8312        |
| 19.2901       | 6.5346 | 194000 | 2.3750          | 0.4644   | 0.8288        |
| 19.2375       | 6.6020 | 196000 | 2.3715          | 0.4649   | 0.8306        |
| 19.3044       | 6.6694 | 198000 | 2.3713          | 0.4652   | 0.8321        |
| 19.3410       | 6.7367 | 200000 | 2.3715          | 0.4638   | 0.8314        |
| 19.3258       | 6.8041 | 202000 | 2.3674          | 0.4674   | 0.8314        |
| 19.2042       | 6.8715 | 204000 | 2.3629          | 0.4656   | 0.8326        |
| 19.2207       | 6.9388 | 206000 | 2.3630          | 0.4671   | 0.8334        |
| 18.9808       | 7.0062 | 208000 | 2.3619          | 0.4644   | 0.8322        |
| 19.0327       | 7.0736 | 210000 | 2.3632          | 0.4662   | 0.8336        |
| 18.9676       | 7.1409 | 212000 | 2.3636          | 0.4685   | 0.8312        |
| 18.9669       | 7.2083 | 214000 | 2.3614          | 0.4678   | 0.8323        |
| 19.0012       | 7.2757 | 216000 | 2.3578          | 0.4671   | 0.8338        |
| 18.9087       | 7.3430 | 218000 | 2.3599          | 0.469    | 0.8337        |
| 18.9079       | 7.4104 | 220000 | 2.3582          | 0.4691   | 0.8348        |
| 18.8884       | 7.4778 | 222000 | 2.3571          | 0.4692   | 0.832         |
| 18.9994       | 7.5451 | 224000 | 2.3570          | 0.4697   | 0.8333        |
| 19.0156       | 7.6125 | 226000 | 2.3567          | 0.4696   | 0.8336        |
| 18.9452       | 7.6799 | 228000 | 2.3561          | 0.4705   | 0.8328        |
| 18.9182       | 7.7473 | 230000 | 2.3554          | 0.4701   | 0.8338        |
| 19.0061       | 7.8146 | 232000 | 2.3554          | 0.4697   | 0.8336        |
| 18.9271       | 7.8820 | 234000 | 2.3553          | 0.4691   | 0.8338        |
| 18.9573       | 7.9494 | 236000 | 2.3553          | 0.4691   | 0.8338        |
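
Assuming the reported loss is a mean cross-entropy in nats (the usual convention for Transformers training), the final validation loss can also be read as a perplexity:

```python
import math

# Perplexity corresponding to the final validation cross-entropy loss of 2.3561,
# assuming the loss is a mean per-token cross-entropy in nats.
perplexity = math.exp(2.3561)
print(round(perplexity, 2))  # → 10.55
```

That is, on average the model is about as uncertain as a uniform choice among ~10.5 candidate moves, which is consistent with a top-1 accuracy of 0.47 over a much larger move vocabulary.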

Framework versions

  • Transformers 5.2.0
  • Pytorch 2.7.1+cu118
  • Datasets 4.4.1
  • Tokenizers 0.22.1
Model details

  • Format: Safetensors
  • Model size: 11.3M params
  • Tensor type: F32