calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2737

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
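For reference, a linear scheduler with no warmup decays the learning rate from its initial value to zero over the total number of optimizer steps; from the training results below that is 240 steps (6 steps per epoch × 40 epochs). A minimal sketch of the decay (plain Python, names are illustrative; the actual schedule is handled by the Trainer):

```python
# Sketch of the linear LR decay used during training (assumes zero warmup).
LEARNING_RATE = 0.001
STEPS_PER_EPOCH = 6          # from the training results table (step 6 at epoch 1.0)
NUM_EPOCHS = 40
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 240

def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear decay to zero."""
    remaining = max(0.0, 1.0 - step / TOTAL_STEPS)
    return LEARNING_RATE * remaining

print(linear_lr(0))    # 0.001 at the start of training
print(linear_lr(120))  # 0.0005 at the halfway point
print(linear_lr(240))  # 0.0 at the final step
```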

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 1.3165        | 1.0   | 6    | 0.6014          |
| 0.5027        | 2.0   | 12   | 0.3749          |
| 0.3429        | 3.0   | 18   | 0.3721          |
| 0.3684        | 4.0   | 24   | 0.3056          |
| 0.3082        | 5.0   | 30   | 0.2904          |
| 0.2893        | 6.0   | 36   | 0.2771          |
| 0.2804        | 7.0   | 42   | 0.3063          |
| 0.2879        | 8.0   | 48   | 0.2761          |
| 0.2749        | 9.0   | 54   | 0.2786          |
| 0.2709        | 10.0  | 60   | 0.2812          |
| 0.2775        | 11.0  | 66   | 0.2772          |
| 0.2723        | 12.0  | 72   | 0.2790          |
| 0.2699        | 13.0  | 78   | 0.2743          |
| 0.2642        | 14.0  | 84   | 0.2763          |
| 0.2704        | 15.0  | 90   | 0.2740          |
| 0.2698        | 16.0  | 96   | 0.2747          |
| 0.2695        | 17.0  | 102  | 0.2741          |
| 0.2759        | 18.0  | 108  | 0.2738          |
| 0.2682        | 19.0  | 114  | 0.2756          |
| 0.2700        | 20.0  | 120  | 0.2749          |
| 0.2663        | 21.0  | 126  | 0.2737          |
| 0.2692        | 22.0  | 132  | 0.2750          |
| 0.2682        | 23.0  | 138  | 0.2738          |
| 0.2686        | 24.0  | 144  | 0.2752          |
| 0.2711        | 25.0  | 150  | 0.2734          |
| 0.2673        | 26.0  | 156  | 0.2743          |
| 0.2625        | 27.0  | 162  | 0.2761          |
| 0.2760        | 28.0  | 168  | 0.2734          |
| 0.2738        | 29.0  | 174  | 0.2738          |
| 0.2609        | 30.0  | 180  | 0.2754          |
| 0.2699        | 31.0  | 186  | 0.2758          |
| 0.2677        | 32.0  | 192  | 0.2733          |
| 0.2728        | 33.0  | 198  | 0.2735          |
| 0.2709        | 34.0  | 204  | 0.2734          |
| 0.2727        | 35.0  | 210  | 0.2733          |
| 0.2726        | 36.0  | 216  | 0.2733          |
| 0.2706        | 37.0  | 222  | 0.2733          |
| 0.2668        | 38.0  | 228  | 0.2734          |
| 0.2728        | 39.0  | 234  | 0.2737          |
| 0.2684        | 40.0  | 240  | 0.2737          |
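As a quick sanity check on the numbers above: the validation loss plateaus near 0.273 from roughly epoch 13 onward, first reaches its minimum of 0.2733 at epoch 32, and ends at 0.2737, matching the summary loss reported at the top of this card. A minimal sketch (values copied from the table):

```python
# Validation losses per epoch, copied from the training results table above.
val_loss = [
    0.6014, 0.3749, 0.3721, 0.3056, 0.2904, 0.2771, 0.3063, 0.2761,
    0.2786, 0.2812, 0.2772, 0.2790, 0.2743, 0.2763, 0.2740, 0.2747,
    0.2741, 0.2738, 0.2756, 0.2749, 0.2737, 0.2750, 0.2738, 0.2752,
    0.2734, 0.2743, 0.2761, 0.2734, 0.2738, 0.2754, 0.2758, 0.2733,
    0.2735, 0.2734, 0.2733, 0.2733, 0.2733, 0.2734, 0.2737, 0.2737,
]

# Epochs are 1-based; find the first epoch that reaches the minimum loss.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1

print(best_epoch, val_loss[best_epoch - 1])  # epoch 32, loss 0.2733
print(val_loss[-1])                          # 0.2737, the reported final loss
```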

Framework versions

  • Transformers 5.0.0
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model details

  • Model size: 7.78M params
  • Tensor type: F32 (safetensors)