calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0854

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
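With lr_scheduler_type set to linear and no warmup (the Trainer default when no warmup steps are given; an assumption here, since the card does not list any), the learning rate decays linearly from the peak to zero over the run's 240 optimizer steps (40 epochs × 6 steps per epoch, per the table below). A minimal sketch of that schedule:

```python
# Linear LR schedule sketch for this run, assuming zero warmup steps.
PEAK_LR = 0.001     # learning_rate from the hyperparameters above
TOTAL_STEPS = 240   # 40 epochs x 6 optimizer steps per epoch

def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer steps under linear decay to zero."""
    return PEAK_LR * max(0.0, (TOTAL_STEPS - step) / TOTAL_STEPS)

print(lr_at(0))    # peak LR at the start: 0.001
print(lr_at(120))  # half the peak at the halfway point: 0.0005
print(lr_at(240))  # fully decayed at the end: 0.0
```

This matches the shape of `transformers`' linear schedule with zero warmup; if warmup was actually used, the first steps would ramp up from zero instead.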

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 2.9372        | 1.0   | 6    | 2.2670          |
| 2.0312        | 2.0   | 12   | 1.7752          |
| 1.5891        | 3.0   | 18   | 1.3654          |
| 1.2485        | 4.0   | 24   | 1.1107          |
| 1.0668        | 5.0   | 30   | 1.0103          |
| 0.9308        | 6.0   | 36   | 0.8701          |
| 0.8106        | 7.0   | 42   | 0.7530          |
| 0.7096        | 8.0   | 48   | 0.7189          |
| 0.6817        | 9.0   | 54   | 0.6854          |
| 0.6489        | 10.0  | 60   | 0.6492          |
| 0.6130        | 11.0  | 66   | 0.5677          |
| 0.5748        | 12.0  | 72   | 0.5398          |
| 0.5358        | 13.0  | 78   | 0.5457          |
| 0.5137        | 14.0  | 84   | 0.5096          |
| 0.4751        | 15.0  | 90   | 0.4752          |
| 0.4416        | 16.0  | 96   | 0.4274          |
| 0.4043        | 17.0  | 102  | 0.3834          |
| 0.3843        | 18.0  | 108  | 0.3572          |
| 0.3639        | 19.0  | 114  | 0.3411          |
| 0.3399        | 20.0  | 120  | 0.3274          |
| 0.3107        | 21.0  | 126  | 0.2839          |
| 0.2773        | 22.0  | 132  | 0.2631          |
| 0.2583        | 23.0  | 138  | 0.2441          |
| 0.2466        | 24.0  | 144  | 0.2110          |
| 0.2229        | 25.0  | 150  | 0.1905          |
| 0.2093        | 26.0  | 156  | 0.1869          |
| 0.2033        | 27.0  | 162  | 0.1731          |
| 0.1819        | 28.0  | 168  | 0.1634          |
| 0.1751        | 29.0  | 174  | 0.1435          |
| 0.1590        | 30.0  | 180  | 0.1256          |
| 0.1545        | 31.0  | 186  | 0.1279          |
| 0.1409        | 32.0  | 192  | 0.1184          |
| 0.1319        | 33.0  | 198  | 0.1073          |
| 0.1222        | 34.0  | 204  | 0.1036          |
| 0.1270        | 35.0  | 210  | 0.0979          |
| 0.1177        | 36.0  | 216  | 0.0943          |
| 0.1176        | 37.0  | 222  | 0.0900          |
| 0.1087        | 38.0  | 228  | 0.0880          |
| 0.1039        | 39.0  | 234  | 0.0854          |
| 0.1047        | 40.0  | 240  | 0.0854          |
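The step column implies 6 optimizer steps per epoch (step 240 at epoch 40). With train_batch_size=512 and assuming no gradient accumulation (not stated on the card), that bounds the training set at roughly 6 × 512 = 3072 examples:

```python
# Back-of-the-envelope check on the training set size from the table above.
# Assumes one optimizer step per batch (no gradient accumulation).
total_steps = 240
num_epochs = 40
train_batch_size = 512

steps_per_epoch = total_steps // num_epochs   # 6 steps per epoch
examples_upper_bound = steps_per_epoch * train_batch_size

print(steps_per_epoch)       # 6
print(examples_upper_bound)  # 3072
```

The last batch of each epoch may be smaller, so the actual dataset size is at most this bound.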

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model size

  • 7.8M parameters (Safetensors, F32 tensors)