calculator_model_test

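This model is a fine-tuned version of a base model that the card does not name, trained on a dataset recorded only as None. The card lists no summary evaluation metrics; the training log below ends at a validation loss of 0.7290 after 40 epochs.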

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 40
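
As a rough illustration of what lr_scheduler_type: linear implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists none) and a total of 240 steps, taken from the last row of the training log. `linear_lr` is a hypothetical helper written for this example, not a library function.

```python
# Sketch of a linear learning-rate decay, assuming no warmup.
BASE_LR = 0.001      # learning_rate from the card
TOTAL_STEPS = 240    # final step in the training log (40 epochs x 6 steps)


def linear_lr(step: int, base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS) -> float:
    """Return the learning rate at `step`, decaying linearly from base_lr to 0."""
    remaining = max(0, total_steps - step)  # clamp so the rate never goes negative
    return base_lr * remaining / total_steps


if __name__ == "__main__":
    for step in (0, 60, 120, 180, 240):
        print(f"step {step:3d}: lr = {linear_lr(step):.6f}")
```

Under these assumptions the rate is halved by step 120 (epoch 20) and reaches zero at step 240, so the late epochs train with very small updates.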

Training Loss	Epoch	Step	Validation Loss
3.0133	1.0	6	2.2590
2.008	2.0	12	1.7059
1.548	3.0	18	1.4082
1.2527	4.0	24	1.0746
1.0272	5.0	30	0.9517
0.8724	6.0	36	0.8435
0.7908	7.0	42	0.7824
0.7088	8.0	48	0.7619
0.6489	9.0	54	0.7522
0.596	10.0	60	0.7539
0.5572	11.0	66	0.7119
0.5232	12.0	72	0.7161
0.4798	13.0	78	0.7047
0.4341	14.0	84	0.7528
0.4263	15.0	90	0.6681
0.3977	16.0	96	0.7356
0.3748	17.0	102	0.7377
0.3609	18.0	108	0.7455
0.3287	19.0	114	0.7199
0.3114	20.0	120	0.7666
0.2855	21.0	126	0.6954
0.2664	22.0	132	0.7641
0.258	23.0	138	0.7126
0.2414	24.0	144	0.7327
0.2281	25.0	150	0.6886
0.2108	26.0	156	0.6910
0.1933	27.0	162	0.7081
0.1907	28.0	168	0.7257
0.1827	29.0	174	0.7252
0.167	30.0	180	0.7151
0.1562	31.0	186	0.7102
0.1473	32.0	192	0.7296
0.149	33.0	198	0.6922
0.1413	34.0	204	0.7064
0.1265	35.0	210	0.7110
0.1235	36.0	216	0.7212
0.1156	37.0	222	0.7290
0.1141	38.0	228	0.7200
0.1099	39.0	234	0.7263
0.1071	40.0	240	0.7290
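
One way to read the log above is to locate the epoch with the lowest validation loss and compare it with the final epoch. The sketch below hard-codes the validation-loss column from the table; `VAL_LOSS` and `best_epoch` are illustrative names, not part of the training code.

```python
# Validation loss per epoch, copied from the table (epochs 1..40).
VAL_LOSS = [
    2.2590, 1.7059, 1.4082, 1.0746, 0.9517, 0.8435, 0.7824, 0.7619,
    0.7522, 0.7539, 0.7119, 0.7161, 0.7047, 0.7528, 0.6681, 0.7356,
    0.7377, 0.7455, 0.7199, 0.7666, 0.6954, 0.7641, 0.7126, 0.7327,
    0.6886, 0.6910, 0.7081, 0.7257, 0.7252, 0.7151, 0.7102, 0.7296,
    0.6922, 0.7064, 0.7110, 0.7212, 0.7290, 0.7200, 0.7263, 0.7290,
]


def best_epoch(losses):
    """Return (epoch, loss) for the lowest loss; epochs are 1-indexed."""
    idx = min(range(len(losses)), key=losses.__getitem__)
    return idx + 1, losses[idx]


epoch, loss = best_epoch(VAL_LOSS)
final_loss = VAL_LOSS[-1]

if __name__ == "__main__":
    print(f"best validation loss {loss:.4f} at epoch {epoch}")
    print(f"final validation loss {final_loss:.4f} at epoch {len(VAL_LOSS)}")
```

For this log the minimum is 0.6681 at epoch 15, while the training loss keeps falling to 0.1071 by epoch 40. That widening gap may indicate overfitting after roughly epoch 15, so an earlier checkpoint could be worth evaluating if one was saved.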

Safetensors

Model size: 7.8M params
Tensor type: F32