chess_gpt

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 2.0016
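
The reported loss can be turned into a perplexity, which is often easier to compare across runs. The sketch below assumes the loss is the mean per-token cross-entropy measured in nats. The card does not state this, although it is the usual convention for causal language models trained with Transformers.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 2.0016

# Assumption: the loss is mean cross-entropy per token, in nats,
# so perplexity is simply its exponential.
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```

Under that assumption the perplexity comes out at roughly 7.4, meaning the model is about as uncertain as a uniform choice among seven or eight tokens at each position.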

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 128
eval_batch_size: 128
seed: 42
optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.9), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 200
num_epochs: 3
mixed_precision_training: Native AMP
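
To illustrate what the learning_rate, lr_scheduler_type, and lr_scheduler_warmup_steps settings imply together, here is a minimal pure-Python sketch of a cosine schedule with linear warmup. It is an approximation of the schedule's shape, not the training code itself. The total step count (32649) is taken from the last row of the training results, and the decay to zero over a single half cosine cycle is an assumption based on the common default. The function name `lr_at` is hypothetical.

```python
import math

BASE_LR = 3e-4        # learning_rate from the card
WARMUP_STEPS = 200    # lr_scheduler_warmup_steps from the card
TOTAL_STEPS = 32649   # assumption: final step listed in the training results


def lr_at(step: int) -> float:
    """Approximate learning rate at a given optimizer step."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 up to BASE_LR.
        return BASE_LR * step / WARMUP_STEPS
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    # Half cosine cycle: BASE_LR at progress 0, zero at progress 1.
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 100, 200, 16000, 32649):
    print(step, f"{lr_at(step):.6f}")
```

The rate peaks at 0.0003 once warmup ends at step 200 and then falls smoothly, which is consistent with the validation loss flattening out in the final epoch.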

Training results

Training Loss	Epoch	Step	Validation Loss
4.3945	0.0459	500	4.1863
3.5602	0.0919	1000	3.3568
3.2260	0.1378	1500	3.0412
3.0341	0.1838	2000	2.8599
2.9079	0.2297	2500	2.7319
2.8044	0.2757	3000	2.6385
2.7210	0.3216	3500	2.5691
2.6579	0.3675	4000	2.5138
2.6114	0.4135	4500	2.4692
2.5699	0.4594	5000	2.4287
2.5313	0.5054	5500	2.3933
2.4941	0.5513	6000	2.3640
2.4663	0.5973	6500	2.3397
2.4445	0.6432	7000	2.3147
2.4173	0.6891	7500	2.2942
2.3975	0.7351	8000	2.2726
2.3775	0.7810	8500	2.2570
2.3605	0.8270	9000	2.2396
2.3393	0.8729	9500	2.2246
2.3215	0.9189	10000	2.2117
2.3150	0.9648	10500	2.1971
2.2930	1.0108	11000	2.1871
2.2822	1.0567	11500	2.1762
2.2728	1.1026	12000	2.1648
2.2647	1.1486	12500	2.1549
2.2539	1.1945	13000	2.1468
2.2445	1.2405	13500	2.1370
2.2340	1.2864	14000	2.1298
2.2306	1.3324	14500	2.1213
2.2224	1.3783	15000	2.1132
2.2101	1.4242	15500	2.1075
2.2032	1.4702	16000	2.0996
2.1982	1.5161	16500	2.0931
2.1906	1.5621	17000	2.0864
2.1826	1.6080	17500	2.0818
2.1788	1.6540	18000	2.0751
2.1762	1.6999	18500	2.0690
2.1688	1.7458	19000	2.0656
2.1603	1.7918	19500	2.0588
2.1595	1.8377	20000	2.0548
2.1525	1.8837	20500	2.0499
2.1479	1.9296	21000	2.0457
2.1408	1.9756	21500	2.0417
2.1307	2.0215	22000	2.0373
2.1290	2.0674	22500	2.0350
2.1263	2.1134	23000	2.0307
2.1201	2.1593	23500	2.0276
2.1220	2.2053	24000	2.0246
2.1177	2.2512	24500	2.0216
2.1139	2.2972	25000	2.0191
2.1100	2.3431	25500	2.0163
2.1077	2.3890	26000	2.0144
2.1069	2.4350	26500	2.0127
2.1081	2.4809	27000	2.0100
2.1040	2.5269	27500	2.0084
2.1011	2.5728	28000	2.0070
2.1002	2.6188	28500	2.0059
2.0984	2.6647	29000	2.0049
2.0979	2.7106	29500	2.0037
2.0999	2.7566	30000	2.0029
2.0991	2.8025	30500	2.0024
2.0912	2.8485	31000	2.0020
2.0968	2.8944	31500	2.0018
2.0909	2.9404	32000	2.0016
2.0971	2.9863	32500	2.0016
2.0955	3.0	32649	2.0016
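
The step and epoch columns above allow a rough estimate of the training set size, which the card does not state. The sketch below assumes a single device and no gradient accumulation, so that each optimizer step consumes one batch of train_batch_size samples. If either assumption is wrong, the sample count scales accordingly. The helper `epoch_at` is a hypothetical name.

```python
TOTAL_STEPS = 32649      # last step in the training results
NUM_EPOCHS = 3           # num_epochs
TRAIN_BATCH_SIZE = 128   # train_batch_size

# 32649 divides evenly by 3.
steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS  # 10883

# Assumption: one device, no gradient accumulation.
# This is an upper bound, since the last batch of an epoch may be partial.
approx_samples = steps_per_epoch * TRAIN_BATCH_SIZE


def epoch_at(step: int) -> float:
    """Fractional epoch reached after a given number of steps."""
    return step / steps_per_epoch


print(f"steps per epoch: {steps_per_epoch}")
print(f"approx. training samples: {approx_samples:,}")
print(f"epoch at step 500: {epoch_at(500):.4f}")
```

The computed epoch at step 500 matches the 0.0459 logged in the first row, which supports the figure of 10883 steps per epoch and puts the training set at about 1.39 million samples under the stated assumptions.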

Framework versions

Transformers 5.12.1
Pytorch 2.10.0+cu128
Datasets 5.0.0
Tokenizers 0.22.2

Model size: 11.9M params
Tensor type: F32
Weights format: Safetensors