stldec_formulae

This model was trained from scratch on an unknown dataset. On the evaluation set, it reaches a validation loss of 0.9541 at the final logged step (step 6200, epoch 40.0).

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-06
train_batch_size: 32
eval_batch_size: 32
seed: 42
gradient_accumulation_steps: 16
total_train_batch_size: 512
optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 100
num_epochs: 40
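
As a rough check on how these settings fit together, the sketch below recomputes the effective batch size and evaluates a cosine schedule with linear warmup at a few steps. It rests on assumptions the card does not state: a single training device, 6200 total optimizer steps (the last step in the training results), and a schedule that decays to zero over half a cosine period, which is a common definition of this schedule. The names `effective_batch_size` and `cosine_lr` are hypothetical and not taken from the training code.

```python
import math

# Values copied from the hyperparameter list.
LEARNING_RATE = 5e-06
TRAIN_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 16
WARMUP_STEPS = 100

# Assumptions (not stated in the card).
TOTAL_STEPS = 6200  # final step reported in the training results
NUM_DEVICES = 1     # single device


def effective_batch_size(per_device, accum_steps, devices=NUM_DEVICES):
    """Number of samples contributing to one optimizer update."""
    return per_device * accum_steps * devices


def cosine_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup to base_lr, then half-cosine decay to zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 50, 100, 3150, 6200):
        print(step, cosine_lr(step))
```

Under these assumptions, 32 samples per step times 16 accumulation steps reproduces the listed total_train_batch_size of 512, and the learning rate peaks at 5e-06 at step 100 before decaying.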

Training Loss	Epoch	Step	Validation Loss
43.5204	0.6488	100	2.7061
39.2881	1.2920	200	2.2866
30.8042	1.9408	300	1.8044
25.1581	2.5839	400	1.4701
19.2347	3.2271	500	1.1261
15.416	3.8759	600	1.0358
14.1219	4.5191	700	0.9937
13.4197	5.1622	800	0.9650
12.9133	5.8110	900	0.9876
12.6179	6.4542	1000	0.9909
12.4532	7.0973	1100	0.9817
12.2832	7.7461	1200	0.9774
12.1959	8.3893	1300	0.9642
11.1151	9.0324	1400	0.9743
12.0355	9.6813	1500	0.9801
11.9798	10.3244	1600	0.9909
11.8785	10.9732	1700	0.9747
11.7593	11.6164	1800	0.9661
11.6373	12.2595	1900	0.9631
11.6234	12.9084	2000	0.9584
11.5039	13.5515	2100	0.9671
11.4137	14.1946	2200	0.9616
11.4176	14.8435	2300	0.9560
11.3459	15.4866	2400	0.9540
11.2998	16.1298	2500	0.9549
11.3421	16.7786	2600	0.9612
11.3012	17.4217	2700	0.9637
11.1974	18.0649	2800	0.9554
11.1949	18.7137	2900	0.9553
11.1927	19.3569	3000	0.9613
10.1945	20.0	3100	0.9594
11.2759	20.6488	3200	0.9606
11.2474	21.2920	3300	0.9599
11.2784	21.9408	3400	0.9553
11.1868	22.5839	3500	0.9520
11.1618	23.2271	3600	0.9541
11.131	23.8759	3700	0.9606
11.1007	24.5191	3800	0.9579
11.0605	25.1622	3900	0.9547
11.0824	25.8110	4000	0.9607
10.9615	26.4542	4100	0.9636
10.9831	27.0973	4200	0.9557
10.9606	27.7461	4300	0.9583
10.9256	28.3893	4400	0.9587
9.9608	29.0324	4500	0.9533
10.914	29.6813	4600	0.9461
10.9037	30.3244	4700	0.9550
10.8779	30.9732	4800	0.9478
10.8868	31.6164	4900	0.9626
10.8479	32.2595	5000	0.9578
10.8657	32.9084	5100	0.9577
10.8429	33.5515	5200	0.9620
10.7578	34.1946	5300	0.9580
10.7732	34.8435	5400	0.9553
10.8445	35.4866	5500	0.9528
10.7886	36.1298	5600	0.9541
10.8318	36.7786	5700	0.9555
10.7826	37.4217	5800	0.9536
10.7925	38.0649	5900	0.9559
10.7787	38.7137	6000	0.9534
10.7822	39.3569	6100	0.9543
9.8043	40.0	6200	0.9541
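
The validation loss does not decrease monotonically, so the last checkpoint is not necessarily the best one. The sketch below shows one way to pick the row with the lowest validation loss and to estimate the number of optimizer steps per epoch. It uses a handful of rows excerpted from the table above rather than the full table, and the helper names are hypothetical.

```python
# Rows excerpted from the table above:
# (training_loss, epoch, step, validation_loss)
ROWS = [
    (43.5204, 0.6488, 100, 2.7061),
    (13.4197, 5.1622, 800, 0.9650),
    (11.3459, 15.4866, 2400, 0.9540),
    (11.1868, 22.5839, 3500, 0.9520),
    (10.914, 29.6813, 4600, 0.9461),
    (10.8779, 30.9732, 4800, 0.9478),
    (9.8043, 40.0, 6200, 0.9541),
]


def best_by_validation_loss(rows):
    """Return the row with the smallest validation loss."""
    return min(rows, key=lambda row: row[3])


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the last logged row."""
    _, epoch, step, _ = rows[-1]
    return step / epoch


if __name__ == "__main__":
    best = best_by_validation_loss(ROWS)
    print(f"best step={best[2]} epoch={best[1]} val_loss={best[3]}")
    print("steps per epoch:", steps_per_epoch(ROWS))
```

In the table, the lowest validation loss is 0.9461 at step 4600 (epoch 29.6813), slightly below the final value of 0.9541 at step 6200.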