calculator_model_test

This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 2.5764

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch, fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
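For reference, the hyperparameters above can be collected into a plain dictionary (field names below mirror Hugging Face `TrainingArguments` options, but this is an illustrative sketch, not the actual training script). The results table further down shows 6 optimizer steps per epoch, so with a train batch size of 512 and assuming no gradient accumulation, the training set holds at most 6 × 512 = 3072 examples:

```python
# Hyperparameters from this card, as a plain dict (illustrative only).
hparams = {
    "learning_rate": 1e-3,
    "train_batch_size": 512,
    "eval_batch_size": 512,
    "seed": 42,
    "optimizer": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_epochs": 40,
}

steps_per_epoch = 6  # from the results table: step 6 at epoch 1.0
total_steps = steps_per_epoch * hparams["num_epochs"]
# Upper bound on training-set size (the last batch per epoch may be partial):
max_train_examples = steps_per_epoch * hparams["train_batch_size"]
print(total_steps, max_train_examples)  # 240 3072
```

The 240 total steps match the final row of the results table below.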

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.4238        | 1.0   | 6    | 2.7523          |
| 2.3982        | 2.0   | 12   | 2.1138          |
| 1.8535        | 3.0   | 18   | 1.8031          |
| 1.6989        | 4.0   | 24   | 1.9360          |
| 1.5938        | 5.0   | 30   | 1.6749          |
| 1.5256        | 6.0   | 36   | 1.6441          |
| 1.4844        | 7.0   | 42   | 1.7053          |
| 1.4458        | 8.0   | 48   | 1.6526          |
| 1.4297        | 9.0   | 54   | 1.8445          |
| 1.3608        | 10.0  | 60   | 1.8894          |
| 1.2888        | 11.0  | 66   | 2.1697          |
| 1.2656        | 12.0  | 72   | 2.0470          |
| 1.2222        | 13.0  | 78   | 2.0777          |
| 1.1678        | 14.0  | 84   | 2.1063          |
| 1.1038        | 15.0  | 90   | 2.0946          |
| 1.0625        | 16.0  | 96   | 2.0211          |
| 1.0258        | 17.0  | 102  | 2.1410          |
| 1.0500        | 18.0  | 108  | 2.3152          |
| 1.0602        | 19.0  | 114  | 2.3428          |
| 1.0098        | 20.0  | 120  | 2.2889          |
| 1.0509        | 21.0  | 126  | 2.2918          |
| 0.9735        | 22.0  | 132  | 2.3569          |
| 0.8998        | 23.0  | 138  | 2.4122          |
| 0.8866        | 24.0  | 144  | 2.5343          |
| 0.8842        | 25.0  | 150  | 2.5096          |
| 0.8412        | 26.0  | 156  | 2.5483          |
| 0.8250        | 27.0  | 162  | 2.4582          |
| 0.8417        | 28.0  | 168  | 2.5274          |
| 0.7937        | 29.0  | 174  | 2.5061          |
| 0.7735        | 30.0  | 180  | 2.6067          |
| 0.7456        | 31.0  | 186  | 2.5956          |
| 0.7543        | 32.0  | 192  | 2.5598          |
| 0.7232        | 33.0  | 198  | 2.5147          |
| 0.7093        | 34.0  | 204  | 2.5534          |
| 0.7012        | 35.0  | 210  | 2.6147          |
| 0.6985        | 36.0  | 216  | 2.5596          |
| 0.6898        | 37.0  | 222  | 2.6099          |
| 0.6722        | 38.0  | 228  | 2.5622          |
| 0.6785        | 39.0  | 234  | 2.5618          |
| 0.6685        | 40.0  | 240  | 2.5764          |
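Note that validation loss bottoms out early while training loss keeps falling, which suggests overfitting after roughly epoch 6. A small sketch that scans (epoch, validation loss) pairs from the table above to find the best checkpoint (abbreviated to the first ten epochs plus the final one; the omitted epochs all have higher validation loss):

```python
# (epoch, validation_loss) pairs taken from the results table above.
val_loss = [
    (1, 2.7523), (2, 2.1138), (3, 1.8031), (4, 1.9360), (5, 1.6749),
    (6, 1.6441), (7, 1.7053), (8, 1.6526), (9, 1.8445), (10, 1.8894),
    (40, 2.5764),
]

# Pick the epoch with the lowest validation loss.
best_epoch, best_loss = min(val_loss, key=lambda pair: pair[1])
print(best_epoch, best_loss)  # 6 1.6441
```

By epoch 40 the validation loss (2.5764, the figure reported at the top of this card) is well above that minimum, so an earlier checkpoint may generalize better.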

Framework versions

  • Transformers 5.2.0
  • Pytorch 2.10.0
  • Datasets 4.5.0
  • Tokenizers 0.22.2
Model size: 7.8M params (F32, stored as Safetensors)