calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

Loss: 0.6976 (final validation loss, epoch 40)

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 40
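
As an illustration of the linear scheduler listed above, the decay can be sketched in plain Python. This is a minimal sketch under stated assumptions: no warmup (the card lists none) and 240 total optimizer steps, taken from the last row of the training log. `linear_lr` is a hypothetical helper written for this example, not a library function.

```python
def linear_lr(step, base_lr=1e-3, total_steps=240, warmup_steps=0):
    """Learning rate at a given optimizer step under linear decay.

    Assumptions (not confirmed by the card): warmup_steps=0 and
    total_steps=240 (40 epochs x 6 steps per epoch).
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the learning rate at the start, midpoint, and end of training.
for step in (0, 60, 120, 180, 240):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 0.001, is halved by step 120 (epoch 20), and reaches zero at step 240.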

Training Loss	Epoch	Step	Validation Loss
3.4115	1.0	6	2.7525
2.3637	2.0	12	1.9883
1.8111	3.0	18	1.6460
1.5989	4.0	24	1.5795
1.5684	5.0	30	1.5667
1.5388	6.0	36	1.5580
1.5288	7.0	42	1.5458
1.5137	8.0	48	1.5235
1.5727	9.0	54	1.5418
1.5004	10.0	60	1.4956
1.4608	11.0	66	1.4476
1.3891	12.0	72	1.3548
1.3281	13.0	78	1.2845
1.2695	14.0	84	1.2130
1.2005	15.0	90	1.1400
1.1784	16.0	96	1.2321
1.1891	17.0	102	1.1240
1.1077	18.0	108	1.0586
1.0551	19.0	114	1.0325
1.0389	20.0	120	0.9592
0.9772	21.0	126	0.9272
0.9666	22.0	132	0.9543
0.9524	23.0	138	0.9173
0.9492	24.0	144	0.9572
0.9401	25.0	150	0.8581
0.8986	26.0	156	0.8526
0.8990	27.0	162	0.8333
0.8747	28.0	168	0.8174
0.8513	29.0	174	0.8259
0.8490	30.0	180	0.7877
0.8298	31.0	186	0.7663
0.8181	32.0	192	0.7542
0.8129	33.0	198	0.7607
0.8087	34.0	204	0.7328
0.7846	35.0	210	0.7273
0.7815	36.0	216	0.7166
0.7618	37.0	222	0.7100
0.7553	38.0	228	0.7040
0.7494	39.0	234	0.6990
0.7631	40.0	240	0.6976
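
The step counts in the table allow a rough bound on the training set size, which the card does not report. The sketch below is only an estimate and assumes a single device, no gradient accumulation, and that the last partial batch of each epoch is kept; none of these is stated in the card.

```python
import math

total_steps = 240   # last "Step" value in the table
num_epochs = 40     # num_epochs hyperparameter
batch_size = 512    # train_batch_size hyperparameter

# Optimizer steps taken in each epoch.
steps_per_epoch = total_steps // num_epochs

# With the last partial batch kept, steps_per_epoch = ceil(n / batch_size),
# so n lies in ((steps_per_epoch - 1) * batch_size, steps_per_epoch * batch_size].
min_examples = (steps_per_epoch - 1) * batch_size + 1
max_examples = steps_per_epoch * batch_size

# Sanity check: both bounds give the observed number of steps per epoch.
assert math.ceil(min_examples / batch_size) == steps_per_epoch
assert math.ceil(max_examples / batch_size) == steps_per_epoch

print(steps_per_epoch, min_examples, max_examples)  # prints: 6 2561 3072
```

If those assumptions hold, the training set contains between 2,561 and 3,072 examples.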

Model size: 7.8M params (Safetensors)
Tensor type: F32
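
The parameter count and tensor type above give a rough size for the stored weights. The arithmetic below is a back-of-the-envelope estimate: 7.8M is a rounded figure, and file metadata is ignored.

```python
num_params = 7.8e6     # rounded parameter count from the card
bytes_per_param = 4    # F32 stores each parameter in 4 bytes

# Approximate weight size in megabytes (1 MB = 10**6 bytes).
size_mb = num_params * bytes_per_param / 1e6

print(f"{size_mb:.1f} MB")  # prints: 31.2 MB
```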
