exceptions_exp2_last_to_carry_frequency_1032

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set (a rough perplexity conversion follows the list):

  • Loss: 3.5603
  • Accuracy: 0.3693
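Assuming the reported loss is a mean per-token cross-entropy in nats (the usual convention for Transformers language-model evaluation; the card itself does not say), it corresponds to a perplexity of roughly exp(3.5603) ≈ 35.2:

```python
import math

# Assumption: the eval loss is a mean per-token cross-entropy in nats,
# so perplexity is simply its exponential.
eval_loss = 3.5603
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.1f}")  # ~ 35.2
```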

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.0006
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 1032
  • gradient_accumulation_steps: 5
  • total_train_batch_size: 80
  • optimizer: adamw_torch_fused (fused PyTorch AdamW) with betas=(0.9, 0.98), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 50.0
  • mixed_precision_training: Native AMP
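As a point of reference, here is a minimal sketch of how these settings map onto transformers.TrainingArguments. The output_dir is a placeholder, and fp16=True is an assumption standing in for "Native AMP" (bf16 would be configured the same way):

```python
from transformers import TrainingArguments

# Sketch only: reconstructs the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="exceptions_exp2_last_to_carry_frequency_1032",  # placeholder
    learning_rate=6e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=1032,
    gradient_accumulation_steps=5,  # 16 * 5 = 80 total train batch size
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=50.0,
    fp16=True,  # assumption: "Native AMP" mixed-precision training
)
```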

Training results

| Training Loss | Epoch   | Step   | Validation Loss | Accuracy |
|:-------------:|:-------:|:------:|:---------------:|:--------:|
| 4.8127        | 0.2913  | 1000   | 4.7312          | 0.2579   |
| 4.3304        | 0.5826  | 2000   | 4.2763          | 0.3000   |
| 4.1529        | 0.8739  | 3000   | 4.0966          | 0.3153   |
| 3.9983        | 1.1652  | 4000   | 3.9904          | 0.3250   |
| 3.9249        | 1.4565  | 5000   | 3.9173          | 0.3314   |
| 3.8903        | 1.7477  | 6000   | 3.8589          | 0.3369   |
| 3.7635        | 2.0390  | 7000   | 3.8180          | 0.3408   |
| 3.7702        | 2.3303  | 8000   | 3.7853          | 0.3442   |
| 3.7354        | 2.6216  | 9000   | 3.7552          | 0.3471   |
| 3.7273        | 2.9129  | 10000  | 3.7297          | 0.3495   |
| 3.6258        | 3.2042  | 11000  | 3.7159          | 0.3515   |
| 3.6523        | 3.4955  | 12000  | 3.6990          | 0.3531   |
| 3.6454        | 3.7868  | 13000  | 3.6826          | 0.3547   |
| 3.5384        | 4.0781  | 14000  | 3.6727          | 0.3560   |
| 3.5642        | 4.3694  | 15000  | 3.6624          | 0.3569   |
| 3.5822        | 4.6606  | 16000  | 3.6478          | 0.3582   |
| 3.5878        | 4.9519  | 17000  | 3.6379          | 0.3592   |
| 3.5126        | 5.2432  | 18000  | 3.6379          | 0.3600   |
| 3.5281        | 5.5345  | 19000  | 3.6286          | 0.3609   |
| 3.5384        | 5.8258  | 20000  | 3.6171          | 0.3618   |
| 3.4365        | 6.1171  | 21000  | 3.6180          | 0.3622   |
| 3.477         | 6.4084  | 22000  | 3.6142          | 0.3626   |
| 3.4841        | 6.6997  | 23000  | 3.6056          | 0.3636   |
| 3.493         | 6.9910  | 24000  | 3.5955          | 0.3642   |
| 3.4211        | 7.2823  | 25000  | 3.6025          | 0.3645   |
| 3.4501        | 7.5736  | 26000  | 3.5919          | 0.3650   |
| 3.4646        | 7.8648  | 27000  | 3.5827          | 0.3657   |
| 3.382         | 8.1561  | 28000  | 3.5925          | 0.3657   |
| 3.4023        | 8.4474  | 29000  | 3.5831          | 0.3661   |
| 3.4206        | 8.7387  | 30000  | 3.5779          | 0.3670   |
| 3.3284        | 9.0300  | 31000  | 3.5839          | 0.3669   |
| 3.3706        | 9.3213  | 32000  | 3.5810          | 0.3672   |
| 3.4008        | 9.6126  | 33000  | 3.5719          | 0.3678   |
| 3.4016        | 9.9039  | 34000  | 3.5663          | 0.3684   |
| 3.344         | 10.1952 | 35000  | 3.5757          | 0.3680   |
| 3.3658        | 10.4865 | 36000  | 3.5698          | 0.3684   |
| 3.3864        | 10.7777 | 37000  | 3.5615          | 0.3688   |
| 3.292         | 11.0690 | 38000  | 3.5681          | 0.3690   |
| 3.3273        | 11.3603 | 39000  | 3.5691          | 0.3689   |
| 3.3604        | 11.6516 | 40000  | 3.5603          | 0.3693   |
| 3.3662        | 11.9429 | 41000  | 3.5509          | 0.3703   |
| 3.3145        | 12.2342 | 42000  | 3.5626          | 0.3693   |
| 3.3408        | 12.5255 | 43000  | 3.5566          | 0.3703   |
| 3.3589        | 12.8168 | 44000  | 3.5519          | 0.3702   |
| 3.2725        | 13.1081 | 45000  | 3.5610          | 0.3702   |
| 3.318         | 13.3994 | 46000  | 3.5582          | 0.3703   |
| 3.342         | 13.6906 | 47000  | 3.5490          | 0.3712   |
| 3.3467        | 13.9819 | 48000  | 3.5447          | 0.3715   |
| 3.2671        | 14.2732 | 49000  | 3.5590          | 0.3705   |
| 3.3074        | 14.5645 | 50000  | 3.5491          | 0.3712   |
| 3.3175        | 14.8558 | 51000  | 3.5418          | 0.3718   |
| 3.2397        | 15.1471 | 52000  | 3.5566          | 0.3714   |
| 3.2702        | 15.4384 | 53000  | 3.5540          | 0.3715   |
| 3.2823        | 15.7297 | 54000  | 3.5401          | 0.3721   |
| 3.2059        | 16.0210 | 55000  | 3.5513          | 0.3719   |
| 3.2616        | 16.3123 | 56000  | 3.5520          | 0.3720   |
| 3.2886        | 16.6036 | 57000  | 3.5430          | 0.3725   |
| 3.3           | 16.8948 | 58000  | 3.5367          | 0.3728   |
| 3.2212        | 17.1861 | 59000  | 3.5522          | 0.3720   |
| 3.2577        | 17.4774 | 60000  | 3.5459          | 0.3724   |
| 3.274         | 17.7687 | 61000  | 3.5362          | 0.3730   |
| 3.191         | 18.0600 | 62000  | 3.5555          | 0.3721   |
| 3.2522        | 18.3513 | 63000  | 3.5486          | 0.3727   |
| 3.2584        | 18.6426 | 64000  | 3.5407          | 0.3729   |
| 3.276         | 18.9339 | 65000  | 3.5341          | 0.3733   |
| 3.2164        | 19.2252 | 66000  | 3.5519          | 0.3724   |
| 3.2382        | 19.5165 | 67000  | 3.5460          | 0.3732   |
| 3.2541        | 19.8077 | 68000  | 3.5347          | 0.3737   |
| 3.1753        | 20.0990 | 69000  | 3.5532          | 0.3732   |
| 3.2181        | 20.3903 | 70000  | 3.5479          | 0.3732   |
| 3.2277        | 20.6816 | 71000  | 3.5395          | 0.3736   |
| 3.254         | 20.9729 | 72000  | 3.5303          | 0.3741   |
| 3.1843        | 21.2642 | 73000  | 3.5451          | 0.3734   |
| 3.2214        | 21.5555 | 74000  | 3.5389          | 0.3737   |
| 3.2321        | 21.8468 | 75000  | 3.5346          | 0.3741   |
| 3.1704        | 22.1381 | 76000  | 3.5458          | 0.3735   |
| 3.1986        | 22.4294 | 77000  | 3.5432          | 0.3738   |
| 3.2118        | 22.7207 | 78000  | 3.5374          | 0.3741   |
| 3.145         | 23.0119 | 79000  | 3.5437          | 0.3740   |
| 3.1741        | 23.3032 | 80000  | 3.5464          | 0.3739   |
| 3.1999        | 23.5945 | 81000  | 3.5405          | 0.3744   |
| 3.2158        | 23.8858 | 82000  | 3.5319          | 0.3746   |
| 3.1511        | 24.1771 | 83000  | 3.5495          | 0.3742   |
| 3.1905        | 24.4684 | 84000  | 3.5426          | 0.3743   |
| 3.1898        | 24.7597 | 85000  | 3.5386          | 0.3747   |
| 3.1296        | 25.0510 | 86000  | 3.5496          | 0.3739   |
| 3.1601        | 25.3423 | 87000  | 3.5479          | 0.3742   |
| 3.1941        | 25.6336 | 88000  | 3.5339          | 0.3748   |
| 3.2105        | 25.9248 | 89000  | 3.5294          | 0.3751   |
| 3.1418        | 26.2161 | 90000  | 3.5476          | 0.3742   |
| 3.1615        | 26.5074 | 91000  | 3.5427          | 0.3745   |
| 3.1996        | 26.7987 | 92000  | 3.5348          | 0.3749   |
| 3.1119        | 27.0900 | 93000  | 3.5491          | 0.3745   |
| 3.1546        | 27.3813 | 94000  | 3.5435          | 0.3746   |
| 3.1648        | 27.6726 | 95000  | 3.5378          | 0.3748   |
| 3.1949        | 27.9639 | 96000  | 3.5324          | 0.3755   |
| 3.1301        | 28.2552 | 97000  | 3.5506          | 0.3745   |
| 3.1508        | 28.5465 | 98000  | 3.5417          | 0.3750   |
| 3.1696        | 28.8378 | 99000  | 3.5343          | 0.3751   |
| 3.1182        | 29.1290 | 100000 | 3.5521          | 0.3746   |
| 3.1367        | 29.4203 | 101000 | 3.5465          | 0.3749   |
| 3.1549        | 29.7116 | 102000 | 3.5365          | 0.3752   |
| 3.1447        | 30.0029 | 103000 | 3.5441          | 0.3751   |
| 3.1155        | 30.2942 | 104000 | 3.5475          | 0.3748   |
| 3.136         | 30.5855 | 105000 | 3.5417          | 0.3753   |
| 3.1496        | 30.8768 | 106000 | 3.5334          | 0.3756   |
| 3.0895        | 31.1681 | 107000 | 3.5510          | 0.3749   |
| 3.1302        | 31.4594 | 108000 | 3.5446          | 0.3753   |
| 3.1378        | 31.7507 | 109000 | 3.5379          | 0.3755   |

Framework versions

  • Transformers 4.55.2
  • Pytorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
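The card does not identify the base architecture or task, so no official usage example can be given. As a purely hypothetical sketch, assuming the checkpoint is a causal language model saved with save_pretrained() (the path and prompt below are placeholders):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical: assumes a causal-LM checkpoint; path and prompt are placeholders.
path = "./exceptions_exp2_last_to_carry_frequency_1032"
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path)

inputs = tokenizer("example input", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```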