stldec_formulae

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9541

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 512
  • optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 40
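
The batch-size and learning-rate entries above can be checked with a few lines of arithmetic. The sketch below is not taken from the training code. It assumes a single training device (consistent with 32 × 16 = 512), takes the schedule length to be 6200 steps (the last step in the training results table), and uses the usual half-cosine decay after a linear warmup. The function names are illustrative only.

```python
import math

# Values copied from the hyperparameter list above.
PEAK_LR = 5e-06
WARMUP_STEPS = 100
TRAIN_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 16
# Assumption: the schedule spans 6200 optimizer steps, the last step
# reported in the training results table.
TOTAL_STEPS = 6200


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Samples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def cosine_lr(step, peak_lr=PEAK_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup to peak_lr, then half-cosine decay towards zero."""
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))  # 512
    for step in (0, 50, 100, 3150, 6200):
        print(step, cosine_lr(step))
```

Under these assumptions the learning rate reaches 5e-06 at step 100, falls to about half of that near step 3150, and approaches zero at step 6200.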

Training results

Training Loss   Epoch     Step   Validation Loss
43.5204         0.6488    100    2.7061
39.2881         1.2920    200    2.2866
30.8042         1.9408    300    1.8044
25.1581         2.5839    400    1.4701
19.2347         3.2271    500    1.1261
15.416          3.8759    600    1.0358
14.1219         4.5191    700    0.9937
13.4197         5.1622    800    0.9650
12.9133         5.8110    900    0.9876
12.6179         6.4542    1000   0.9909
12.4532         7.0973    1100   0.9817
12.2832         7.7461    1200   0.9774
12.1959         8.3893    1300   0.9642
11.1151         9.0324    1400   0.9743
12.0355         9.6813    1500   0.9801
11.9798         10.3244   1600   0.9909
11.8785         10.9732   1700   0.9747
11.7593         11.6164   1800   0.9661
11.6373         12.2595   1900   0.9631
11.6234         12.9084   2000   0.9584
11.5039         13.5515   2100   0.9671
11.4137         14.1946   2200   0.9616
11.4176         14.8435   2300   0.9560
11.3459         15.4866   2400   0.9540
11.2998         16.1298   2500   0.9549
11.3421         16.7786   2600   0.9612
11.3012         17.4217   2700   0.9637
11.1974         18.0649   2800   0.9554
11.1949         18.7137   2900   0.9553
11.1927         19.3569   3000   0.9613
10.1945         20.0      3100   0.9594
11.2759         20.6488   3200   0.9606
11.2474         21.2920   3300   0.9599
11.2784         21.9408   3400   0.9553
11.1868         22.5839   3500   0.9520
11.1618         23.2271   3600   0.9541
11.131          23.8759   3700   0.9606
11.1007         24.5191   3800   0.9579
11.0605         25.1622   3900   0.9547
11.0824         25.8110   4000   0.9607
10.9615         26.4542   4100   0.9636
10.9831         27.0973   4200   0.9557
10.9606         27.7461   4300   0.9583
10.9256         28.3893   4400   0.9587
9.9608          29.0324   4500   0.9533
10.914          29.6813   4600   0.9461
10.9037         30.3244   4700   0.9550
10.8779         30.9732   4800   0.9478
10.8868         31.6164   4900   0.9626
10.8479         32.2595   5000   0.9578
10.8657         32.9084   5100   0.9577
10.8429         33.5515   5200   0.9620
10.7578         34.1946   5300   0.9580
10.7732         34.8435   5400   0.9553
10.8445         35.4866   5500   0.9528
10.7886         36.1298   5600   0.9541
10.8318         36.7786   5700   0.9555
10.7826         37.4217   5800   0.9536
10.7925         38.0649   5900   0.9559
10.7787         38.7137   6000   0.9534
10.7822         39.3569   6100   0.9543
9.8043          40.0      6200   0.9541
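
The validation losses above can be scanned programmatically to see where the minimum falls relative to the final checkpoint. The following is a minimal sketch, not part of the original training pipeline. The list is copied from the Validation Loss column, evaluation is assumed to happen every 100 steps (as the Step column indicates), and `best_checkpoint` is a hypothetical helper name.

```python
# Validation losses copied from the table above, in step order (100, 200, ..., 6200).
VAL_LOSS = [
    2.7061, 2.2866, 1.8044, 1.4701, 1.1261, 1.0358, 0.9937, 0.9650, 0.9876, 0.9909,
    0.9817, 0.9774, 0.9642, 0.9743, 0.9801, 0.9909, 0.9747, 0.9661, 0.9631, 0.9584,
    0.9671, 0.9616, 0.9560, 0.9540, 0.9549, 0.9612, 0.9637, 0.9554, 0.9553, 0.9613,
    0.9594, 0.9606, 0.9599, 0.9553, 0.9520, 0.9541, 0.9606, 0.9579, 0.9547, 0.9607,
    0.9636, 0.9557, 0.9583, 0.9587, 0.9533, 0.9461, 0.9550, 0.9478, 0.9626, 0.9578,
    0.9577, 0.9620, 0.9580, 0.9553, 0.9528, 0.9541, 0.9555, 0.9536, 0.9559, 0.9534,
    0.9543, 0.9541,
]
EVAL_EVERY = 100   # steps between evaluations, read off the Step column
NUM_EPOCHS = 40    # from the hyperparameter list


def best_checkpoint(losses, eval_every=EVAL_EVERY):
    """Return (step, loss) for the lowest validation loss in the list."""
    idx = min(range(len(losses)), key=losses.__getitem__)
    return (idx + 1) * eval_every, losses[idx]


best_step, best_loss = best_checkpoint(VAL_LOSS)
final_step, final_loss = len(VAL_LOSS) * EVAL_EVERY, VAL_LOSS[-1]
steps_per_epoch = final_step / NUM_EPOCHS

print(f"best:  step {best_step}, loss {best_loss}")    # best:  step 4600, loss 0.9461
print(f"final: step {final_step}, loss {final_loss}")  # final: step 6200, loss 0.9541
print(f"optimizer steps per epoch: {steps_per_epoch}")
```

The lowest validation loss in the table (0.9461 at step 4600) is slightly below the final value of 0.9541 reported at the top of the card. The card does not say whether intermediate checkpoints were kept.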

Framework versions

  • Transformers 4.57.3
  • Pytorch 2.9.1+cu128
  • Datasets 4.4.2
  • Tokenizers 0.22.1
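
To compare a local environment against the versions listed above, a small check along these lines may help. It is a sketch using only the standard library. The distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are the usual PyPI names and are assumed here rather than stated in the card, and `compare_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above; the PyPI distribution names are assumed.
PINNED = {
    "transformers": "4.57.3",
    "torch": "2.9.1+cu128",
    "datasets": "4.4.2",
    "tokenizers": "0.22.1",
}


def compare_versions(pinned=PINNED):
    """Map each package to (wanted, installed or None, exact_match)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in compare_versions().items():
        status = "ok" if ok else "differs"
        print(f"{name}: wanted {wanted}, installed {installed} ({status})")
```

An exact match is not necessarily required to load the model; the check only reports where the environment differs from the one used for training.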