calculator_model_test

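This model is a fine-tuned version of a base model that this card does not name; the training dataset is not recorded either (it is listed as None). On the evaluation set it reaches a validation loss of 0.1344 after 40 epochs; the per-epoch results are in the table below.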

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 40
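
As a rough illustration of what the linear scheduler does with these settings, the sketch below computes the learning rate at a few steps. It is a minimal reimplementation, not the training code itself. It assumes 200 optimizer steps in total (40 epochs at 5 steps per epoch, as in the results table) and no warmup, since the card lists none; the names `BASE_LR`, `TOTAL_STEPS`, `WARMUP_STEPS` and `linear_lr` are hypothetical.

```python
BASE_LR = 1e-3       # learning_rate from the list above
TOTAL_STEPS = 200    # assumption: 40 epochs x 5 optimizer steps per epoch
WARMUP_STEPS = 0     # assumption: the card lists no warmup


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 50, 100, 150, 200):
        print(f"step {step:3d}: lr = {linear_lr(step):.6f}")
```

Under these assumptions the rate starts at the configured 0.001, is halved by the midpoint of training, and reaches zero at the final step.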

Training Loss	Epoch	Step	Validation Loss
3.1638	1.0	5	2.4618
2.2285	2.0	10	1.8851
1.7471	3.0	15	1.5272
1.4118	4.0	20	1.2371
1.1468	5.0	25	1.0431
1.0124	6.0	30	0.9316
0.8781	7.0	35	0.8220
0.8087	8.0	40	0.7330
0.7384	9.0	45	0.6584
0.6706	10.0	50	0.6123
0.6128	11.0	55	0.5564
0.5746	12.0	60	0.5312
0.5371	13.0	65	0.5127
0.5085	14.0	70	0.4835
0.4708	15.0	75	0.4266
0.4362	16.0	80	0.4025
0.4124	17.0	85	0.3646
0.3872	18.0	90	0.3594
0.3697	19.0	95	0.3283
0.3443	20.0	100	0.3075
0.3247	21.0	105	0.2869
0.3094	22.0	110	0.2708
0.2902	23.0	115	0.2512
0.2797	24.0	120	0.2337
0.2680	25.0	125	0.2293
0.2516	26.0	130	0.2064
0.2402	27.0	135	0.1935
0.2260	28.0	140	0.1881
0.2199	29.0	145	0.1767
0.2101	30.0	150	0.1734
0.2012	31.0	155	0.1633
0.1947	32.0	160	0.1587
0.1866	33.0	165	0.1537
0.1812	34.0	170	0.1502
0.1776	35.0	175	0.1444
0.1723	36.0	180	0.1400
0.1696	37.0	185	0.1371
0.1665	38.0	190	0.1366
0.1636	39.0	195	0.1355
0.1650	40.0	200	0.1344
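
To read the tail of this log more concretely, the sketch below takes the last five rows of the table and computes the relative drop in validation loss from one epoch to the next, along with the number of optimizer steps per epoch. The values are copied from the table; the helper `relative_drops` and the tuple layout are illustrative choices, not part of the training code.

```python
# Last five rows of the table above, reordered as
# (epoch, step, training loss, validation loss).
rows = [
    (36, 180, 0.1723, 0.1400),
    (37, 185, 0.1696, 0.1371),
    (38, 190, 0.1665, 0.1366),
    (39, 195, 0.1636, 0.1355),
    (40, 200, 0.1650, 0.1344),
]


def relative_drops(values):
    """Fractional decrease between consecutive values (positive = improvement)."""
    return [(prev - cur) / prev for prev, cur in zip(values, values[1:])]


val_drops = relative_drops([r[3] for r in rows])
train_drops = relative_drops([r[2] for r in rows])

# Steps per epoch follow from the final row: 200 steps over 40 epochs.
steps_per_epoch = rows[-1][1] // rows[-1][0]

if __name__ == "__main__":
    print(f"steps per epoch: {steps_per_epoch}")
    for (epoch, _, _, _), v, t in zip(rows[1:], val_drops, train_drops):
        print(f"epoch {epoch}: val loss change {-v:+.2%}, train loss change {-t:+.2%}")
```

Over these epochs the validation loss still falls every epoch, but by less than 3% each time, and the training loss rises slightly in the final epoch (0.1636 to 0.1650). That pattern suggests the run was close to a plateau at this learning-rate schedule.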

Safetensors

Model size: 7.8M params

Tensor type: F32