# chess-model-output

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.3561
- Accuracy: 0.4705
- Top5 Accuracy: 0.8328
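Top-5 accuracy counts a prediction as correct when the true label (here, the played move) appears among the model's five highest-scoring outputs. A minimal sketch of the metric in plain Python; `top_k_accuracy` is an illustrative helper, not part of this model's evaluation code:

```python
def top_k_accuracy(scores, labels, k=5):
    """Fraction of examples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores in this row.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)

# Two toy examples with three classes each, evaluated at k=2.
scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]]
labels = [1, 2]
print(top_k_accuracy(scores, labels, k=2))  # 0.5
```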
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 256
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 0.05 (a fractional value, so most likely a warmup ratio rather than a step count)
- num_epochs: 8
- mixed_precision_training: Native AMP
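The effective batch size follows from the per-device batch size times the gradient accumulation steps, and the learning rate follows a linear warmup then cosine decay. A minimal sketch in plain Python, assuming the fractional warmup value is a ratio of total steps; `cosine_lr` is an illustrative helper, not the Trainer's actual scheduler:

```python
import math

# Effective batch size: per-device batch size times gradient accumulation steps.
train_batch_size = 32
gradient_accumulation_steps = 8
total_train_batch_size = train_batch_size * gradient_accumulation_steps  # 256

def cosine_lr(step, total_steps, base_lr=2e-4, warmup=0.05):
    """Linear warmup followed by cosine decay to zero.

    A fractional `warmup` is treated as a ratio of total_steps (assumption).
    """
    warmup_steps = warmup * total_steps if warmup < 1 else warmup
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```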
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Top5 Accuracy |
|---|---|---|---|---|---|
| 51.2238 | 0.0674 | 2000 | 6.3411 | 0.0442 | 0.1203 |
| 41.3822 | 0.1347 | 4000 | 4.9888 | 0.0790 | 0.2743 |
| 36.6208 | 0.2021 | 6000 | 4.3841 | 0.0978 | 0.3289 |
| 34.1621 | 0.2695 | 8000 | 4.0865 | 0.1300 | 0.3881 |
| 32.5538 | 0.3368 | 10000 | 3.9081 | 0.1516 | 0.4229 |
| 31.4295 | 0.4042 | 12000 | 3.7777 | 0.173 | 0.4589 |
| 30.6596 | 0.4716 | 14000 | 3.6560 | 0.1900 | 0.4899 |
| 29.9171 | 0.5389 | 16000 | 3.5772 | 0.205 | 0.5101 |
| 29.3951 | 0.6063 | 18000 | 3.5026 | 0.2210 | 0.5327 |
| 28.8304 | 0.6737 | 20000 | 3.4387 | 0.2341 | 0.5535 |
| 28.4445 | 0.7411 | 22000 | 3.3696 | 0.2465 | 0.5693 |
| 27.8682 | 0.8084 | 24000 | 3.3363 | 0.2544 | 0.5802 |
| 27.7150 | 0.8758 | 26000 | 3.2787 | 0.2633 | 0.5974 |
| 27.2758 | 0.9432 | 28000 | 3.2303 | 0.2722 | 0.6119 |
| 26.6353 | 1.0105 | 30000 | 3.2009 | 0.2793 | 0.6202 |
| 26.6178 | 1.0779 | 32000 | 3.1678 | 0.2846 | 0.6291 |
| 26.2986 | 1.1452 | 34000 | 3.1411 | 0.2933 | 0.6367 |
| 26.1541 | 1.2126 | 36000 | 3.1127 | 0.2990 | 0.638 |
| 25.9314 | 1.2800 | 38000 | 3.0818 | 0.3054 | 0.6509 |
| 25.7470 | 1.3474 | 40000 | 3.0629 | 0.3060 | 0.6562 |
| 25.5577 | 1.4147 | 42000 | 3.0334 | 0.3150 | 0.6666 |
| 25.3361 | 1.4821 | 44000 | 3.0070 | 0.3235 | 0.6729 |
| 25.2382 | 1.5495 | 46000 | 2.9805 | 0.332 | 0.6769 |
| 25.0611 | 1.6168 | 48000 | 2.9525 | 0.3352 | 0.6838 |
| 24.8286 | 1.6842 | 50000 | 2.9375 | 0.3399 | 0.6879 |
| 24.6659 | 1.7516 | 52000 | 2.9056 | 0.3457 | 0.6962 |
| 24.5298 | 1.8189 | 54000 | 2.8906 | 0.3489 | 0.7036 |
| 24.2796 | 1.8863 | 56000 | 2.8683 | 0.3487 | 0.7122 |
| 24.1826 | 1.9537 | 58000 | 2.8424 | 0.3579 | 0.7151 |
| 23.6416 | 2.0210 | 60000 | 2.8303 | 0.3582 | 0.7207 |
| 23.5601 | 2.0884 | 62000 | 2.8129 | 0.3681 | 0.7257 |
| 23.4484 | 2.1558 | 64000 | 2.7928 | 0.3678 | 0.7295 |
| 23.3427 | 2.2231 | 66000 | 2.7726 | 0.3758 | 0.7349 |
| 23.1908 | 2.2905 | 68000 | 2.7648 | 0.3707 | 0.739 |
| 23.2677 | 2.3579 | 70000 | 2.7461 | 0.3759 | 0.7433 |
| 23.0687 | 2.4252 | 72000 | 2.7335 | 0.381 | 0.7481 |
| 22.9007 | 2.4926 | 74000 | 2.7178 | 0.3874 | 0.7491 |
| 22.8318 | 2.5600 | 76000 | 2.6960 | 0.3880 | 0.7552 |
| 22.7372 | 2.6273 | 78000 | 2.6794 | 0.3906 | 0.7616 |
| 22.6547 | 2.6947 | 80000 | 2.6676 | 0.3887 | 0.7634 |
| 22.5251 | 2.7621 | 82000 | 2.6484 | 0.3987 | 0.7651 |
| 22.4387 | 2.8294 | 84000 | 2.6389 | 0.4015 | 0.7685 |
| 22.2422 | 2.8968 | 86000 | 2.6314 | 0.4033 | 0.7684 |
| 22.2428 | 2.9642 | 88000 | 2.6147 | 0.4073 | 0.7753 |
| 21.6002 | 3.0315 | 90000 | 2.6059 | 0.4108 | 0.7766 |
| 22.3475 | 3.0989 | 92000 | 2.6969 | 0.3895 | 0.7555 |
| 22.4535 | 3.1663 | 94000 | 2.6879 | 0.3946 | 0.7572 |
| 22.3754 | 3.2336 | 96000 | 2.6955 | 0.3931 | 0.7586 |
| 22.3138 | 3.3010 | 98000 | 2.6783 | 0.3934 | 0.756 |
| 22.3112 | 3.3684 | 100000 | 2.6752 | 0.3945 | 0.7605 |
| 22.2399 | 3.4357 | 102000 | 2.6553 | 0.3963 | 0.7692 |
| 22.1860 | 3.5031 | 104000 | 2.6482 | 0.4018 | 0.7688 |
| 22.1966 | 3.5705 | 106000 | 2.6313 | 0.3974 | 0.7714 |
| 22.1117 | 3.6378 | 108000 | 2.6443 | 0.3982 | 0.7692 |
| 22.0962 | 3.7052 | 110000 | 2.6165 | 0.406 | 0.7763 |
| 21.9232 | 3.7726 | 112000 | 2.6135 | 0.4081 | 0.776 |
| 21.9250 | 3.8399 | 114000 | 2.6073 | 0.4099 | 0.7832 |
| 21.8086 | 3.9073 | 116000 | 2.6034 | 0.4056 | 0.7791 |
| 21.8148 | 3.9747 | 118000 | 2.5902 | 0.4088 | 0.778 |
| 21.1899 | 4.0420 | 120000 | 2.5835 | 0.4101 | 0.7834 |
| 21.2872 | 4.1094 | 122000 | 2.5744 | 0.413 | 0.7842 |
| 21.2121 | 4.1768 | 124000 | 2.5634 | 0.4115 | 0.7883 |
| 21.1537 | 4.2441 | 126000 | 2.5567 | 0.4207 | 0.7898 |
| 21.1143 | 4.3115 | 128000 | 2.5476 | 0.4237 | 0.7928 |
| 21.1197 | 4.3789 | 130000 | 2.5450 | 0.4199 | 0.7907 |
| 21.0866 | 4.4462 | 132000 | 2.5373 | 0.4245 | 0.7927 |
| 21.0555 | 4.5136 | 134000 | 2.5237 | 0.4268 | 0.7954 |
| 20.9838 | 4.5810 | 136000 | 2.5196 | 0.4269 | 0.7989 |
| 20.9613 | 4.6484 | 138000 | 2.5114 | 0.4286 | 0.8029 |
| 20.8328 | 4.7157 | 140000 | 2.5009 | 0.4317 | 0.7982 |
| 20.8385 | 4.7831 | 142000 | 2.4929 | 0.4336 | 0.8071 |
| 20.7476 | 4.8505 | 144000 | 2.4884 | 0.4335 | 0.8049 |
| 20.7141 | 4.9178 | 146000 | 2.4841 | 0.4333 | 0.8073 |
| 20.6954 | 4.9852 | 148000 | 2.4720 | 0.4348 | 0.8089 |
| 20.1964 | 5.0525 | 150000 | 2.4757 | 0.4417 | 0.8084 |
| 20.1919 | 5.1199 | 152000 | 2.4668 | 0.4421 | 0.8105 |
| 20.2029 | 5.1873 | 154000 | 2.4591 | 0.4435 | 0.8104 |
| 20.1102 | 5.2547 | 156000 | 2.4589 | 0.4404 | 0.8128 |
| 20.0693 | 5.3220 | 158000 | 2.4492 | 0.4432 | 0.8167 |
| 20.1529 | 5.3894 | 160000 | 2.4448 | 0.4465 | 0.8172 |
| 20.0076 | 5.4568 | 162000 | 2.4388 | 0.4483 | 0.8172 |
| 20.0621 | 5.5241 | 164000 | 2.4324 | 0.4468 | 0.8187 |
| 20.0232 | 5.5915 | 166000 | 2.4288 | 0.4476 | 0.822 |
| 19.9641 | 5.6589 | 168000 | 2.4208 | 0.4516 | 0.8217 |
| 20.0099 | 5.7262 | 170000 | 2.4136 | 0.4518 | 0.8183 |
| 19.9398 | 5.7936 | 172000 | 2.4098 | 0.4502 | 0.8216 |
| 19.9230 | 5.8610 | 174000 | 2.4090 | 0.4525 | 0.8206 |
| 19.8980 | 5.9283 | 176000 | 2.4040 | 0.4551 | 0.8255 |
| 19.8680 | 5.9957 | 178000 | 2.4050 | 0.4561 | 0.8256 |
| 19.3353 | 6.0631 | 180000 | 2.3990 | 0.4552 | 0.8251 |
| 19.4134 | 6.1304 | 182000 | 2.3977 | 0.4575 | 0.8259 |
| 19.3965 | 6.1978 | 184000 | 2.3921 | 0.4562 | 0.8238 |
| 19.4687 | 6.2652 | 186000 | 2.3889 | 0.4543 | 0.8305 |
| 19.3202 | 6.3325 | 188000 | 2.3877 | 0.4579 | 0.8285 |
| 19.3963 | 6.3999 | 190000 | 2.3822 | 0.4603 | 0.828 |
| 19.2950 | 6.4673 | 192000 | 2.3776 | 0.4597 | 0.8312 |
| 19.2901 | 6.5346 | 194000 | 2.3750 | 0.4644 | 0.8288 |
| 19.2375 | 6.6020 | 196000 | 2.3715 | 0.4649 | 0.8306 |
| 19.3044 | 6.6694 | 198000 | 2.3713 | 0.4652 | 0.8321 |
| 19.3410 | 6.7367 | 200000 | 2.3715 | 0.4638 | 0.8314 |
| 19.3258 | 6.8041 | 202000 | 2.3674 | 0.4674 | 0.8314 |
| 19.2042 | 6.8715 | 204000 | 2.3629 | 0.4656 | 0.8326 |
| 19.2207 | 6.9388 | 206000 | 2.3630 | 0.4671 | 0.8334 |
| 18.9808 | 7.0062 | 208000 | 2.3619 | 0.4644 | 0.8322 |
| 19.0327 | 7.0736 | 210000 | 2.3632 | 0.4662 | 0.8336 |
| 18.9676 | 7.1409 | 212000 | 2.3636 | 0.4685 | 0.8312 |
| 18.9669 | 7.2083 | 214000 | 2.3614 | 0.4678 | 0.8323 |
| 19.0012 | 7.2757 | 216000 | 2.3578 | 0.4671 | 0.8338 |
| 18.9087 | 7.3430 | 218000 | 2.3599 | 0.469 | 0.8337 |
| 18.9079 | 7.4104 | 220000 | 2.3582 | 0.4691 | 0.8348 |
| 18.8884 | 7.4778 | 222000 | 2.3571 | 0.4692 | 0.832 |
| 18.9994 | 7.5451 | 224000 | 2.3570 | 0.4697 | 0.8333 |
| 19.0156 | 7.6125 | 226000 | 2.3567 | 0.4696 | 0.8336 |
| 18.9452 | 7.6799 | 228000 | 2.3561 | 0.4705 | 0.8328 |
| 18.9182 | 7.7473 | 230000 | 2.3554 | 0.4701 | 0.8338 |
| 19.0061 | 7.8146 | 232000 | 2.3554 | 0.4697 | 0.8336 |
| 18.9271 | 7.8820 | 234000 | 2.3553 | 0.4691 | 0.8338 |
| 18.9573 | 7.9494 | 236000 | 2.3553 | 0.4691 | 0.8338 |
### Framework versions
- Transformers 5.2.0
- Pytorch 2.7.1+cu118
- Datasets 4.4.1
- Tokenizers 0.22.1