calculator_model_test

This model is a fine-tuned version of an unspecified base model (the card does not name one) on an unspecified dataset (recorded as None). It achieves the following results on the evaluation set:

Loss: 0.0045

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 200
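
The hyperparameters above, together with the step column of the results table (6 steps per epoch), imply 1200 optimizer steps in total. The sketch below works through that arithmetic and the shape of a linear schedule. It is an illustration under stated assumptions, not the training code: zero warmup steps, a single device, no gradient accumulation, and a final partial batch that is kept are all assumptions the card does not confirm, and the helper names are hypothetical.

```python
LEARNING_RATE = 0.001    # learning_rate from the card
NUM_EPOCHS = 200         # num_epochs from the card
TRAIN_BATCH_SIZE = 512   # train_batch_size from the card
STEPS_PER_EPOCH = 6      # read off the results table (step 6 at epoch 1.0)
WARMUP_STEPS = 0         # assumption: the card lists no warmup setting

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step: int) -> float:
    """Linear warmup (if any), then linear decay to zero at TOTAL_STEPS."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


def train_set_size_bounds() -> tuple[int, int]:
    """Range of training-set sizes consistent with 6 batches of 512 per epoch.

    Assumes one device, no gradient accumulation, and that the last
    (possibly partial) batch is not dropped.
    """
    smallest = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1
    largest = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE
    return smallest, largest


if __name__ == "__main__":
    print("total steps:", TOTAL_STEPS)
    for step in (0, 300, 600, 900, 1200):
        print(f"step {step:4d}: lr = {linear_lr(step):.6f}")
    print("training-set size bounds:", train_set_size_bounds())
```

Under these assumptions the learning rate falls from 0.001 at step 0 to 0.0005 at step 600 and reaches zero at step 1200, and the training set would hold between 2561 and 3072 examples.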

Training results

Training Loss	Epoch	Step	Validation Loss
2.3208	1.0	6	1.7801
1.6167	2.0	12	1.4190
1.3790	3.0	18	1.3787
1.2999	4.0	24	1.2406
1.1367	5.0	30	1.1432
1.0477	6.0	36	0.9924
0.9550	7.0	42	0.9145
0.8568	8.0	48	0.8587
0.8161	9.0	54	0.8325
0.8154	10.0	60	0.8123
0.7740	11.0	66	0.8280
0.7357	12.0	72	0.7068
0.6664	13.0	78	0.6546
0.6176	14.0	84	0.6074
0.5990	15.0	90	0.6042
0.5777	16.0	96	0.6093
0.5371	17.0	102	0.5074
0.4953	18.0	108	0.5070
0.4999	19.0	114	0.5536
0.5230	20.0	120	0.5276
0.4750	21.0	126	0.5726
0.5112	22.0	132	0.4879
0.4367	23.0	138	0.4797
0.4526	24.0	144	0.4519
0.4212	25.0	150	0.3937
0.4345	26.0	156	0.4556
0.4463	27.0	162	0.4319
0.4206	28.0	168	0.4294
0.4098	29.0	174	0.4353
0.4184	30.0	180	0.3689
0.3558	31.0	186	0.3611
0.3784	32.0	192	0.3562
0.3416	33.0	198	0.3633
0.3587	34.0	204	0.2998
0.3020	35.0	210	0.2746
0.2803	36.0	216	0.2586
0.2968	37.0	222	0.2734
0.2725	38.0	228	0.3669
0.3261	39.0	234	0.2672
0.2693	40.0	240	0.2603
0.3001	41.0	246	0.2625
0.2979	42.0	252	0.2724
0.2688	43.0	258	0.2563
0.2705	44.0	264	0.2068
0.2271	45.0	270	0.1919
0.2181	46.0	276	0.2369
0.2450	47.0	282	0.2518
0.2451	48.0	288	0.2630
0.3311	49.0	294	0.1948
0.2112	50.0	300	0.2220
0.2408	51.0	306	0.2290
0.2484	52.0	312	0.2001
0.2117	53.0	318	0.2169
0.2254	54.0	324	0.1979
0.2088	55.0	330	0.1925
0.2027	56.0	336	0.1754
0.1910	57.0	342	0.1389
0.1745	58.0	348	0.1300
0.1657	59.0	354	0.1269
0.1711	60.0	360	0.1430
0.1848	61.0	366	0.1232
0.1501	62.0	372	0.1050
0.1295	63.0	378	0.0914
0.1303	64.0	384	0.0986
0.1140	65.0	390	0.0799
0.1123	66.0	396	0.0995
0.1164	67.0	402	0.0903
0.1110	68.0	408	0.0899
0.1117	69.0	414	0.0973
0.1011	70.0	420	0.1054
0.1133	71.0	426	0.0858
0.0953	72.0	432	0.1040
0.1117	73.0	438	0.1025
0.1328	74.0	444	0.0957
0.1041	75.0	450	0.0897
0.1003	76.0	456	0.0712
0.0894	77.0	462	0.0774
0.0899	78.0	468	0.0680
0.0873	79.0	474	0.0749
0.1007	80.0	480	0.0679
0.0873	81.0	486	0.0692
0.0876	82.0	492	0.0897
0.1011	83.0	498	0.0743
0.0887	84.0	504	0.0714
0.0938	85.0	510	0.0623
0.0806	86.0	516	0.0704
0.0914	87.0	522	0.0540
0.0665	88.0	528	0.0421
0.0664	89.0	534	0.0429
0.0671	90.0	540	0.0366
0.0520	91.0	546	0.0301
0.0499	92.0	552	0.0278
0.0473	93.0	558	0.0305
0.0395	94.0	564	0.0244
0.0394	95.0	570	0.0589
0.0686	96.0	576	0.0294
0.0399	97.0	582	0.0388
0.0385	98.0	588	0.0144
0.0362	99.0	594	0.0128
0.0339	100.0	600	0.0140
0.0280	101.0	606	0.0172
0.0227	102.0	612	0.0222
0.0478	103.0	618	0.0100
0.0211	104.0	624	0.0095
0.0184	105.0	630	0.0078
0.0156	106.0	636	0.0088
0.0149	107.0	642	0.0067
0.0113	108.0	648	0.0059
0.0098	109.0	654	0.0052
0.0098	110.0	660	0.0076
0.0098	111.0	666	0.0065
0.0075	112.0	672	0.0072
0.0075	113.0	678	0.0054
0.0065	114.0	684	0.0069
0.0112	115.0	690	0.0062
0.0144	116.0	696	0.0057
0.0298	117.0	702	0.0093
0.0179	118.0	708	0.0088
0.0140	119.0	714	0.0071
0.0086	120.0	720	0.0060
0.0076	121.0	726	0.0035
0.0066	122.0	732	0.0040
0.0057	123.0	738	0.0036
0.0052	124.0	744	0.0028
0.0050	125.0	750	0.0034
0.0055	126.0	756	0.0076
0.0060	127.0	762	0.0029
0.0074	128.0	768	0.0040
0.0117	129.0	774	0.0065
0.0073	130.0	780	0.0056
0.0061	131.0	786	0.0048
0.0048	132.0	792	0.0050
0.0039	133.0	798	0.0043
0.0039	134.0	804	0.0042
0.0040	135.0	810	0.0049
0.0042	136.0	816	0.0047
0.0034	137.0	822	0.0040
0.0033	138.0	828	0.0038
0.0042	139.0	834	0.0037
0.0105	140.0	840	0.0043
0.0118	141.0	846	0.0042
0.0101	142.0	852	0.0026
0.0102	143.0	858	0.0049
0.0087	144.0	864	0.0048
0.0112	145.0	870	0.0039
0.0081	146.0	876	0.0039
0.0109	147.0	882	0.0033
0.0071	148.0	888	0.0029
0.0035	149.0	894	0.0039
0.0039	150.0	900	0.0035
0.0034	151.0	906	0.0033
0.0027	152.0	912	0.0035
0.0030	153.0	918	0.0050
0.0032	154.0	924	0.0073
0.0033	155.0	930	0.0067
0.0023	156.0	936	0.0051
0.0023	157.0	942	0.0038
0.0025	158.0	948	0.0027
0.0022	159.0	954	0.0031
0.0025	160.0	960	0.0037
0.0045	161.0	966	0.0035
0.0049	162.0	972	0.0053
0.0045	163.0	978	0.0046
0.0041	164.0	984	0.0054
0.0032	165.0	990	0.0055
0.0026	166.0	996	0.0049
0.0031	167.0	1002	0.0044
0.0043	168.0	1008	0.0039
0.0048	169.0	1014	0.0042
0.0051	170.0	1020	0.0030
0.0045	171.0	1026	0.0072
0.0080	172.0	1032	0.0047
0.0033	173.0	1038	0.0039
0.0034	174.0	1044	0.0043
0.0026	175.0	1050	0.0047
0.0026	176.0	1056	0.0049
0.0027	177.0	1062	0.0047
0.0021	178.0	1068	0.0044
0.0018	179.0	1074	0.0044
0.0019	180.0	1080	0.0042
0.0021	181.0	1086	0.0047
0.0020	182.0	1092	0.0054
0.0017	183.0	1098	0.0056
0.0018	184.0	1104	0.0053
0.0019	185.0	1110	0.0049
0.0016	186.0	1116	0.0048
0.0019	187.0	1122	0.0048
0.0020	188.0	1128	0.0047
0.0015	189.0	1134	0.0045
0.0024	190.0	1140	0.0045
0.0013	191.0	1146	0.0045
0.0017	192.0	1152	0.0046
0.0018	193.0	1158	0.0047
0.0013	194.0	1164	0.0047
0.0014	195.0	1170	0.0047
0.0014	196.0	1176	0.0047
0.0016	197.0	1182	0.0046
0.0013	198.0	1188	0.0046
0.0016	199.0	1194	0.0045
0.0016	200.0	1200	0.0045
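
The lowest validation loss in the table is 0.0026 at epoch 142 (step 852), while the reported evaluation loss of 0.0045 matches the final epoch. The card does not say which checkpoint was saved. The sketch below shows one way to pull the best row out of a tab-separated log like the one above; the `parse_results` helper and the `Row` type are hypothetical, and only a four-row excerpt of the table is embedded to keep the example short.

```python
from typing import NamedTuple


class Row(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float


# Excerpt copied from the table above (tab-separated).
EXCERPT = (
    "Training Loss\tEpoch\tStep\tValidation Loss\n"
    "0.0052\t124.0\t744\t0.0028\n"
    "0.0101\t142.0\t852\t0.0026\n"
    "0.0025\t158.0\t948\t0.0027\n"
    "0.0016\t200.0\t1200\t0.0045\n"
)


def parse_results(text: str) -> list[Row]:
    """Parse whitespace-separated result rows, skipping headers and blanks."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # header line or blank line
        try:
            rows.append(
                Row(float(parts[0]), float(parts[1]), int(parts[2]), float(parts[3]))
            )
        except ValueError:
            continue  # non-numeric line
    return rows


rows = parse_results(EXCERPT)
best = min(rows, key=lambda r: r.val_loss)
final = rows[-1]

if __name__ == "__main__":
    print("best :", best)
    print("final:", final)
```

Pasting the full table into `EXCERPT` gives the same best row, since no other epoch in the log reaches a validation loss below 0.0026.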

Framework versions

Transformers 5.0.0
Pytorch 2.10.0+cu128
Datasets 4.0.0
Tokenizers 0.22.2

Model size: 7.78M params (Safetensors)
Tensor type: F32
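
As a rough size check, 7.78M parameters stored as F32 (4 bytes each) come to about 31 MB of weights. The snippet below is a back-of-the-envelope sketch only: the parameter count is the rounded figure from the card, the helper name is hypothetical, and file metadata overhead is ignored.

```python
PARAM_COUNT = 7_780_000  # "7.78M params" from the card (rounded)
BYTES_PER_PARAM = 4      # F32 is 32 bits, i.e. 4 bytes


def weights_size_bytes(param_count: int = PARAM_COUNT,
                       bytes_per_param: int = BYTES_PER_PARAM) -> int:
    """Approximate raw weight size, ignoring any file-format overhead."""
    return param_count * bytes_per_param


if __name__ == "__main__":
    size = weights_size_bytes()
    print(f"{size / 1e6:.2f} MB")  # decimal megabytes
```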
