calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset (neither is recorded in this card). It achieves the following results on the evaluation set:

  • Loss: 1.0688

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
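As a minimal sketch, the hyperparameters above can be collected in plain Python. The key names mirror `transformers.TrainingArguments` by convention, but this is an assumption since the exact training script is not recorded in the card; the dataset-size bound is derived from the 6 steps per epoch visible in the results table below.

```python
# Hyperparameters as listed in this card (key names assumed to
# follow transformers.TrainingArguments; the training script itself
# is not part of the card).
hparams = {
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 512,
    "per_device_eval_batch_size": 512,
    "seed": 42,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 40,
}

# The results table shows 6 optimizer steps per epoch; with a batch
# size of 512 this bounds the training set at 2561-3072 examples
# (assuming no gradient accumulation).
steps_per_epoch = 6
max_train_examples = steps_per_epoch * hparams["per_device_train_batch_size"]
print(max_train_examples)  # 3072
```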

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 3.3893        | 1.0   | 6    | 2.7878          |
| 2.3793        | 2.0   | 12   | 2.0120          |
| 1.8105        | 3.0   | 18   | 1.7359          |
| 1.7065        | 4.0   | 24   | 1.6122          |
| 1.5999        | 5.0   | 30   | 1.5879          |
| 1.5255        | 6.0   | 36   | 1.5673          |
| 1.5594        | 7.0   | 42   | 1.5521          |
| 1.6001        | 8.0   | 48   | 1.5521          |
| 1.5258        | 9.0   | 54   | 1.5455          |
| 1.5409        | 10.0  | 60   | 1.5365          |
| 1.4994        | 11.0  | 66   | 1.5200          |
| 1.6964        | 12.0  | 72   | 1.5414          |
| 1.5533        | 13.0  | 78   | 1.5358          |
| 1.5126        | 14.0  | 84   | 1.5451          |
| 1.5059        | 15.0  | 90   | 1.5400          |
| 1.5275        | 16.0  | 96   | 1.5327          |
| 1.5096        | 17.0  | 102  | 1.5364          |
| 1.4895        | 18.0  | 108  | 1.5174          |
| 1.5105        | 19.0  | 114  | 1.5081          |
| 1.6200        | 20.0  | 120  | 1.4852          |
| 1.4444        | 21.0  | 126  | 1.4887          |
| 1.4563        | 22.0  | 132  | 1.4678          |
| 1.4463        | 23.0  | 138  | 1.5232          |
| 1.4361        | 24.0  | 144  | 1.4566          |
| 1.3858        | 25.0  | 150  | 1.4188          |
| 1.4205        | 26.0  | 156  | 1.3848          |
| 1.5212        | 27.0  | 162  | 1.3525          |
| 1.3287        | 28.0  | 168  | 1.3048          |
| 1.3173        | 29.0  | 174  | 1.2737          |
| 1.2489        | 30.0  | 180  | 1.2401          |
| 1.2379        | 31.0  | 186  | 1.2125          |
| 1.1931        | 32.0  | 192  | 1.1883          |
| 1.1832        | 33.0  | 198  | 1.1493          |
| 1.1847        | 34.0  | 204  | 1.1439          |
| 1.1620        | 35.0  | 210  | 1.1168          |
| 1.1313        | 36.0  | 216  | 1.1089          |
| 1.1060        | 37.0  | 222  | 1.0931          |
| 1.1073        | 38.0  | 228  | 1.0830          |
| 1.0929        | 39.0  | 234  | 1.0711          |
| 1.0735        | 40.0  | 240  | 1.0688          |
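Reading the table end-to-end, validation loss plateaus around 1.5 through epochs 6-23 before improving steadily from epoch 24 onward. A quick sanity check of the overall improvement, with the first- and last-epoch values copied from the table above:

```python
# Validation losses from the results table in this card.
first_val_loss = 2.7878  # after epoch 1
final_val_loss = 1.0688  # after epoch 40

# Relative reduction in validation loss over the 40 epochs.
reduction = (first_val_loss - final_val_loss) / first_val_loss
print(f"validation loss reduced by {reduction:.1%}")  # about 61.7%
```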

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cpu
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model size: 7.8M parameters (tensor type F32, stored as Safetensors)