calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset (the card names neither). Its final validation loss, reached at epoch 40 (step 240), is 0.7381.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 40
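
As a rough illustration of what the linear scheduler implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists none) and takes the total of 240 steps from the training log; the function name `linear_lr` is hypothetical and is not part of any library.

```python
# Minimal sketch of a linear learning-rate decay, assuming no warmup.
# BASE_LR comes from the hyperparameter list; TOTAL_STEPS (40 epochs x 6 steps)
# comes from the training log. The helper name is illustrative only.
BASE_LR = 1e-3
TOTAL_STEPS = 240


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Return the learning rate at `step` for a linear warmup/decay schedule."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup (unused when warmup_steps=0).
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 60, 120, 180, 240):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 0.001, halves by step 120 (epoch 20), and reaches zero at step 240.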

Training Loss	Epoch	Step	Validation Loss
3.4221	1.0	6	2.7768
2.4653	2.0	12	2.0199
1.8737	3.0	18	1.7620
1.6824	4.0	24	1.6451
1.5913	5.0	30	1.5850
1.6037	6.0	36	1.5291
1.5423	7.0	42	1.5260
1.4843	8.0	48	1.4928
1.4615	9.0	54	1.5258
1.5032	10.0	60	1.4441
1.4306	11.0	66	1.4209
1.3806	12.0	72	1.3690
1.3774	13.0	78	1.3466
1.3600	14.0	84	1.3476
1.3149	15.0	90	1.2975
1.2855	16.0	96	1.2750
1.2583	17.0	102	1.2219
1.2044	18.0	108	1.1706
1.1580	19.0	114	1.1378
1.1164	20.0	120	1.0707
1.0685	21.0	126	1.0427
1.0650	22.0	132	1.1815
1.1092	23.0	138	1.1009
1.0326	24.0	144	0.9949
0.9945	25.0	150	0.9492
0.9681	26.0	156	0.9273
0.9450	27.0	162	0.9034
0.9205	28.0	168	0.9252
0.9230	29.0	174	0.8584
0.8786	30.0	180	0.8492
0.8562	31.0	186	0.8264
0.8503	32.0	192	0.8022
0.8221	33.0	198	0.7821
0.8062	34.0	204	0.7827
0.8090	35.0	210	0.7752
0.7897	36.0	216	0.7761
0.7863	37.0	222	0.7480
0.7916	38.0	228	0.7541
0.7776	39.0	234	0.7468
0.7688	40.0	240	0.7381
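
The validation loss in the log above falls overall but not monotonically. The following sketch copies the Validation Loss column into a list and reports the best epoch and the epochs where the loss rose relative to the previous one. The names `VAL_LOSS` and `regressions` are hypothetical, chosen only for this example.

```python
# Validation losses copied from the training log, one value per epoch (1..40).
VAL_LOSS = [
    2.7768, 2.0199, 1.7620, 1.6451, 1.5850, 1.5291, 1.5260, 1.4928,
    1.5258, 1.4441, 1.4209, 1.3690, 1.3466, 1.3476, 1.2975, 1.2750,
    1.2219, 1.1706, 1.1378, 1.0707, 1.0427, 1.1815, 1.1009, 0.9949,
    0.9492, 0.9273, 0.9034, 0.9252, 0.8584, 0.8492, 0.8264, 0.8022,
    0.7821, 0.7827, 0.7752, 0.7761, 0.7480, 0.7541, 0.7468, 0.7381,
]


def regressions(losses):
    """Return the 1-based epochs whose loss is higher than the previous epoch's."""
    return [
        epoch
        for epoch, (prev, cur) in enumerate(zip(losses, losses[1:]), start=2)
        if cur > prev
    ]


# 1-based epoch with the lowest validation loss.
best_epoch = min(range(len(VAL_LOSS)), key=VAL_LOSS.__getitem__) + 1

if __name__ == "__main__":
    print("best epoch:", best_epoch, "loss:", VAL_LOSS[best_epoch - 1])
    print("epochs with a rise in validation loss:", regressions(VAL_LOSS))
```

The best value falls on the final epoch, so the log alone does not show the loss having plateaued by epoch 40.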

Model size: 7.8M params (Safetensors)
Tensor type: F32