
exceptions_exp2_resemble_to_push_frequency_1032

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.5762
  • Accuracy: 0.3671
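
Assuming the loss is the mean per-token cross-entropy in nats (the usual convention for Hugging Face Trainer language-modeling runs; an assumption, since the card does not say), the evaluation loss corresponds to a perplexity of about exp(3.5762) ≈ 35.7:

```python
import math

# Perplexity from cross-entropy loss, assuming the reported loss is
# mean per-token cross-entropy in nats (not stated in the card).
eval_loss = 3.5762
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.1f}")  # ≈ 35.7
```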

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after this list):

  • learning_rate: 0.0006
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 1032
  • gradient_accumulation_steps: 5
  • total_train_batch_size: 80
  • optimizer: adamw_torch_fused with betas=(0.9, 0.98) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 50.0
  • mixed_precision_training: Native AMP
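
A minimal sketch of how these values map onto transformers.TrainingArguments. The output_dir is an assumption, and "Native AMP" is read here as fp16=True (bf16=True is the other common reading on recent hardware):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="exceptions_exp2_resemble_to_push_frequency_1032",  # assumed
    learning_rate=6e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=1032,
    gradient_accumulation_steps=5,  # 16 * 5 = 80 effective train batch size
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=50.0,
    fp16=True,  # "Native AMP"; an assumption, could be bf16 instead
)
```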

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------|:------|:-----|:----------------|:---------|
| 4.8077 | 0.2912 | 1000 | 4.7368 | 0.2569 |
| 4.3355 | 0.5825 | 2000 | 4.2787 | 0.3002 |
| 4.1543 | 0.8737 | 3000 | 4.0961 | 0.3152 |
| 3.9969 | 1.1648 | 4000 | 3.9885 | 0.3255 |
| 3.9235 | 1.4561 | 5000 | 3.9142 | 0.3318 |
| 3.8897 | 1.7473 | 6000 | 3.8568 | 0.3369 |
| 3.7427 | 2.0384 | 7000 | 3.8139 | 0.3415 |
| 3.7463 | 2.3297 | 8000 | 3.7830 | 0.3447 |
| 3.7415 | 2.6209 | 9000 | 3.7528 | 0.3472 |
| 3.7289 | 2.9122 | 10000 | 3.7289 | 0.3495 |
| 3.6325 | 3.2033 | 11000 | 3.7154 | 0.3517 |
| 3.6268 | 3.4945 | 12000 | 3.6962 | 0.3535 |
| 3.6435 | 3.7858 | 13000 | 3.6783 | 0.3549 |
| 3.5409 | 4.0769 | 14000 | 3.6700 | 0.3560 |
| 3.5812 | 4.3681 | 15000 | 3.6613 | 0.3573 |
| 3.5711 | 4.6594 | 16000 | 3.6464 | 0.3587 |
| 3.5748 | 4.9506 | 17000 | 3.6347 | 0.3596 |
| 3.5007 | 5.2417 | 18000 | 3.6364 | 0.3600 |
| 3.5337 | 5.5330 | 19000 | 3.6263 | 0.3612 |
| 3.5421 | 5.8242 | 20000 | 3.6151 | 0.3621 |
| 3.4317 | 6.1153 | 21000 | 3.6184 | 0.3624 |
| 3.4691 | 6.4066 | 22000 | 3.6099 | 0.3631 |
| 3.4836 | 6.6978 | 23000 | 3.5993 | 0.3637 |
| 3.4874 | 6.9890 | 24000 | 3.5905 | 0.3647 |
| 3.4321 | 7.2802 | 25000 | 3.5986 | 0.3646 |
| 3.4369 | 7.5714 | 26000 | 3.5912 | 0.3653 |
| 3.4537 | 7.8627 | 27000 | 3.5794 | 0.3662 |
| 3.3668 | 8.1538 | 28000 | 3.5914 | 0.3661 |
| 3.4093 | 8.4450 | 29000 | 3.5846 | 0.3664 |
| 3.4310 | 8.7363 | 30000 | 3.5762 | 0.3671 |
| 3.3184 | 9.0274 | 31000 | 3.5798 | 0.3674 |
| 3.3770 | 9.3186 | 32000 | 3.5809 | 0.3672 |
| 3.3939 | 9.6099 | 33000 | 3.5724 | 0.3678 |
| 3.4109 | 9.9011 | 34000 | 3.5611 | 0.3685 |
| 3.3373 | 10.1922 | 35000 | 3.5772 | 0.3682 |
| 3.3609 | 10.4835 | 36000 | 3.5684 | 0.3685 |
| 3.4038 | 10.7747 | 37000 | 3.5599 | 0.3691 |
| 3.2944 | 11.0658 | 38000 | 3.5725 | 0.3690 |
| 3.3442 | 11.3571 | 39000 | 3.5660 | 0.3691 |
| 3.3584 | 11.6483 | 40000 | 3.5603 | 0.3698 |
| 3.3727 | 11.9395 | 41000 | 3.5518 | 0.3700 |
| 3.2891 | 12.2307 | 42000 | 3.5657 | 0.3697 |
| 3.3365 | 12.5219 | 43000 | 3.5583 | 0.3700 |
| 3.3517 | 12.8131 | 44000 | 3.5518 | 0.3705 |
| 3.2635 | 13.1043 | 45000 | 3.5615 | 0.3704 |
| 3.3246 | 13.3955 | 46000 | 3.5549 | 0.3705 |
| 3.3260 | 13.6867 | 47000 | 3.5490 | 0.3709 |
| 3.3341 | 13.9780 | 48000 | 3.5432 | 0.3714 |
| 3.2913 | 14.2691 | 49000 | 3.5589 | 0.3709 |
| 3.2971 | 14.5603 | 50000 | 3.5523 | 0.3709 |
| 3.3271 | 14.8516 | 51000 | 3.5445 | 0.3714 |
| 3.2435 | 15.1427 | 52000 | 3.5589 | 0.3713 |
| 3.2810 | 15.4339 | 53000 | 3.5551 | 0.3716 |
| 3.3023 | 15.7252 | 54000 | 3.5442 | 0.3719 |
| 3.2045 | 16.0163 | 55000 | 3.5526 | 0.3717 |
| 3.2608 | 16.3075 | 56000 | 3.5529 | 0.3716 |
| 3.2856 | 16.5988 | 57000 | 3.5463 | 0.3720 |
| 3.2891 | 16.8900 | 58000 | 3.5398 | 0.3726 |
| 3.2247 | 17.1812 | 59000 | 3.5556 | 0.3718 |
| 3.2549 | 17.4724 | 60000 | 3.5466 | 0.3721 |
| 3.2807 | 17.7636 | 61000 | 3.5407 | 0.3727 |
| 3.1888 | 18.0548 | 62000 | 3.5530 | 0.3723 |
| 3.2371 | 18.3460 | 63000 | 3.5507 | 0.3722 |
| 3.2615 | 18.6372 | 64000 | 3.5429 | 0.3731 |
| 3.2662 | 18.9285 | 65000 | 3.5351 | 0.3733 |
| 3.2062 | 19.2196 | 66000 | 3.5497 | 0.3728 |
| 3.2420 | 19.5108 | 67000 | 3.5484 | 0.3726 |
| 3.2734 | 19.8021 | 68000 | 3.5384 | 0.3733 |
| 3.1755 | 20.0932 | 69000 | 3.5544 | 0.3727 |
| 3.2174 | 20.3844 | 70000 | 3.5512 | 0.3728 |
| 3.2381 | 20.6757 | 71000 | 3.5423 | 0.3736 |
| 3.2437 | 20.9669 | 72000 | 3.5343 | 0.3738 |
| 3.2040 | 21.2580 | 73000 | 3.5536 | 0.3726 |
| 3.2134 | 21.5493 | 74000 | 3.5462 | 0.3731 |
| 3.2241 | 21.8405 | 75000 | 3.5392 | 0.3737 |
| 3.1705 | 22.1316 | 76000 | 3.5515 | 0.3733 |
| 3.2019 | 22.4229 | 77000 | 3.5471 | 0.3737 |
| 3.2065 | 22.7141 | 78000 | 3.5403 | 0.3739 |
| 3.1940 | 23.0052 | 79000 | 3.5451 | 0.3738 |
| 3.1815 | 23.2965 | 80000 | 3.5515 | 0.3735 |
| 3.2180 | 23.5877 | 81000 | 3.5433 | 0.3741 |
| 3.2422 | 23.8790 | 82000 | 3.5338 | 0.3743 |
| 3.1459 | 24.1701 | 83000 | 3.5529 | 0.3735 |
| 3.1993 | 24.4613 | 84000 | 3.5506 | 0.3736 |
| 3.2121 | 24.7526 | 85000 | 3.5380 | 0.3745 |
| 3.1188 | 25.0437 | 86000 | 3.5527 | 0.3739 |
| 3.1541 | 25.3349 | 87000 | 3.5510 | 0.3739 |
| 3.1820 | 25.6262 | 88000 | 3.5436 | 0.3743 |
| 3.2141 | 25.9174 | 89000 | 3.5343 | 0.3747 |
| 3.1395 | 26.2085 | 90000 | 3.5566 | 0.3738 |
| 3.1696 | 26.4998 | 91000 | 3.5447 | 0.3743 |
| 3.1961 | 26.7910 | 92000 | 3.5415 | 0.3744 |
| 3.1084 | 27.0821 | 93000 | 3.5532 | 0.3741 |
| 3.1561 | 27.3734 | 94000 | 3.5487 | 0.3745 |
| 3.1751 | 27.6646 | 95000 | 3.5432 | 0.3745 |
| 3.1853 | 27.9558 | 96000 | 3.5360 | 0.3749 |
| 3.1390 | 28.2470 | 97000 | 3.5545 | 0.3739 |
| 3.1499 | 28.5382 | 98000 | 3.5472 | 0.3746 |
| 3.1747 | 28.8295 | 99000 | 3.5374 | 0.3749 |
| 3.1073 | 29.1206 | 100000 | 3.5580 | 0.3740 |
| 3.1268 | 29.4118 | 101000 | 3.5532 | 0.3743 |
| 3.1554 | 29.7031 | 102000 | 3.5421 | 0.3749 |
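
To eyeball trends in this table, one can load it into pandas and plot the loss and accuracy columns against steps. This is a convenience sketch; the file name results.csv is an assumption (export the table above to CSV with the same column names first):

```python
import pandas as pd
import matplotlib.pyplot as plt

# Assumes the table above was saved as results.csv (hypothetical file name)
# with the same column headers.
df = pd.read_csv("results.csv")

fig, ax1 = plt.subplots()
ax1.plot(df["Step"], df["Training Loss"], label="training loss")
ax1.plot(df["Step"], df["Validation Loss"], label="validation loss")
ax1.set_xlabel("step")
ax1.set_ylabel("loss")
ax1.legend(loc="upper right")

# Accuracy on a secondary axis, since it lives on a different scale.
ax2 = ax1.twinx()
ax2.plot(df["Step"], df["Accuracy"], color="gray", linestyle="--")
ax2.set_ylabel("accuracy")

plt.tight_layout()
plt.savefig("training_curves.png")
```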

Framework versions

  • Transformers 4.55.2
  • PyTorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
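
For reproduction, a quick sanity check that a local environment matches these pins (a convenience sketch, not part of the original card):

```python
import datasets
import tokenizers
import torch
import transformers

# Compare installed versions against the pins listed above.
expected = {
    "transformers": "4.55.2",
    "torch": "2.8.0+cu128",
    "datasets": "4.0.0",
    "tokenizers": "0.21.4",
}
actual = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    status = "OK" if actual[name] == want else f"mismatch (got {actual[name]})"
    print(f"{name} {want}: {status}")
```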
Model size

  • 0.1B parameters (Safetensors, F32 tensors)
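
Since neither the base architecture nor the task is stated, loading is a guess: the sketch below assumes a causal language model (consistent with the perplexity-style loss and the 0.1B parameter count), and the repo id is hypothetical because the card does not show the namespaced hub path:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id: substitute the actual namespace/model path.
# AutoModelForCausalLM is an assumption; swap in the Auto class that
# matches the real architecture if it differs.
repo_id = "your-namespace/exceptions_exp2_resemble_to_push_frequency_1032"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

inputs = tokenizer("Hello, world", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```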