calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset (the auto-generated card left both fields empty). It achieves the following results on the evaluation set:

  • Loss: 0.1078

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (adamw_torch_fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
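With lr_scheduler_type set to linear and no warmup steps listed, the learning rate presumably decays linearly from 0.001 to 0 over the full run of 240 optimizer steps (6 steps per epoch × 40 epochs, per the results table below). A minimal sketch of that schedule, assuming zero warmup (the helper name linear_lr is illustrative, not part of any library):

```python
def linear_lr(step, total_steps, base_lr=0.001):
    """Linearly decay the learning rate from base_lr to 0 over total_steps.

    Mirrors a linear scheduler with no warmup: the rate at a given
    optimizer step is proportional to the steps remaining.
    """
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps

total = 240  # 6 steps/epoch * 40 epochs
print(linear_lr(0, total))    # -> 0.001  (start of training)
print(linear_lr(120, total))  # -> 0.0005 (halfway)
print(linear_lr(240, total))  # -> 0.0    (end of training)
```
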

Training results

Training Loss | Epoch | Step | Validation Loss
3.0109 | 1.0 | 6 | 2.2280
2.0109 | 2.0 | 12 | 1.7234
1.5557 | 3.0 | 18 | 1.3273
1.2094 | 4.0 | 24 | 1.0575
1.0379 | 5.0 | 30 | 0.9553
0.9086 | 6.0 | 36 | 0.7990
0.7845 | 7.0 | 42 | 0.7418
0.7194 | 8.0 | 48 | 0.6974
0.6941 | 9.0 | 54 | 0.6425
0.6287 | 10.0 | 60 | 0.5873
0.5727 | 11.0 | 66 | 0.5557
0.5554 | 12.0 | 72 | 0.5182
0.5188 | 13.0 | 78 | 0.4901
0.4958 | 14.0 | 84 | 0.4795
0.4803 | 15.0 | 90 | 0.4511
0.4574 | 16.0 | 96 | 0.4144
0.4339 | 17.0 | 102 | 0.4013
0.4121 | 18.0 | 108 | 0.3858
0.3928 | 19.0 | 114 | 0.3654
0.3788 | 20.0 | 120 | 0.3881
0.3848 | 21.0 | 126 | 0.3479
0.3481 | 22.0 | 132 | 0.3113
0.3370 | 23.0 | 138 | 0.2940
0.3078 | 24.0 | 144 | 0.2849
0.3043 | 25.0 | 150 | 0.2721
0.2817 | 26.0 | 156 | 0.2507
0.2699 | 27.0 | 162 | 0.2288
0.2426 | 28.0 | 168 | 0.2090
0.2266 | 29.0 | 174 | 0.1984
0.2146 | 30.0 | 180 | 0.1885
0.2025 | 31.0 | 186 | 0.1787
0.1990 | 32.0 | 192 | 0.1630
0.1891 | 33.0 | 198 | 0.1490
0.1778 | 34.0 | 204 | 0.1396
0.1613 | 35.0 | 210 | 0.1301
0.1577 | 36.0 | 216 | 0.1250
0.1538 | 37.0 | 222 | 0.1182
0.1483 | 38.0 | 228 | 0.1128
0.1423 | 39.0 | 234 | 0.1095
0.1467 | 40.0 | 240 | 0.1078

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model size

  • 7.8M params (safetensors, F32)