
exceptions_exp2_resemble_to_drop_frequency_1032

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set (a perplexity reading follows the list):

  • Loss: 3.5608
  • Accuracy: 0.3697
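
If the reported loss is mean token-level cross-entropy in nats (the usual Trainer convention for language modeling; the card itself does not say), it can be read as a perplexity. A minimal sketch:

```python
import math

# Assumption: the eval loss is mean cross-entropy per token, in nats.
eval_loss = 3.5608
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.1f}")  # ~ 35.2
```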

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0006
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 1032
  • gradient_accumulation_steps: 5
  • total_train_batch_size: 80
  • optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.98) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 50.0
  • mixed_precision_training: Native AMP
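
The list above maps directly onto the standard Hugging Face TrainingArguments. The sketch below is a reconstruction under stated assumptions: output_dir is a placeholder, the eval cadence (eval_steps=1000) is inferred from the results table, and fp16 is assumed as the flavor of "Native AMP".

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="exceptions_exp2_resemble_to_drop_frequency_1032",  # placeholder
    learning_rate=6e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=1032,
    gradient_accumulation_steps=5,  # 16 x 5 = 80 total train batch size
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=50.0,
    fp16=True,               # assumption: "Native AMP" run in fp16
    eval_strategy="steps",   # inferred from the 1000-step eval rows below
    eval_steps=1000,
)
```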

Training results

| Training Loss | Epoch   | Step   | Validation Loss | Accuracy |
|:-------------:|:-------:|:------:|:---------------:|:--------:|
| 4.8179        | 0.2912  | 1000   | 4.7388          | 0.2568   |
| 4.3391        | 0.5824  | 2000   | 4.2838          | 0.2992   |
| 4.1469        | 0.8737  | 3000   | 4.0994          | 0.3146   |
| 3.9916        | 1.1648  | 4000   | 3.9897          | 0.3247   |
| 3.9303        | 1.4561  | 5000   | 3.9140          | 0.3321   |
| 3.8779        | 1.7473  | 6000   | 3.8589          | 0.3370   |
| 3.7486        | 2.0384  | 7000   | 3.8123          | 0.3417   |
| 3.7408        | 2.3297  | 8000   | 3.7838          | 0.3445   |
| 3.7353        | 2.6209  | 9000   | 3.7540          | 0.3475   |
| 3.7216        | 2.9121  | 10000  | 3.7265          | 0.3500   |
| 3.627         | 3.2033  | 11000  | 3.7146          | 0.3516   |
| 3.6338        | 3.4945  | 12000  | 3.6967          | 0.3533   |
| 3.6322        | 3.7857  | 13000  | 3.6785          | 0.3548   |
| 3.5546        | 4.0769  | 14000  | 3.6719          | 0.3559   |
| 3.5751        | 4.3681  | 15000  | 3.6628          | 0.3569   |
| 3.5864        | 4.6593  | 16000  | 3.6476          | 0.3583   |
| 3.5727        | 4.9506  | 17000  | 3.6343          | 0.3598   |
| 3.5001        | 5.2417  | 18000  | 3.6371          | 0.3599   |
| 3.5292        | 5.5329  | 19000  | 3.6272          | 0.3609   |
| 3.5401        | 5.8242  | 20000  | 3.6149          | 0.3623   |
| 3.4454        | 6.1153  | 21000  | 3.6226          | 0.3623   |
| 3.4806        | 6.4065  | 22000  | 3.6113          | 0.3629   |
| 3.4971        | 6.6978  | 23000  | 3.6041          | 0.3635   |
| 3.4885        | 6.9890  | 24000  | 3.5931          | 0.3645   |
| 3.4323        | 7.2802  | 25000  | 3.6029          | 0.3644   |
| 3.4491        | 7.5714  | 26000  | 3.5926          | 0.3649   |
| 3.4585        | 7.8626  | 27000  | 3.5874          | 0.3659   |
| 3.3888        | 8.1538  | 28000  | 3.5957          | 0.3654   |
| 3.431         | 8.4450  | 29000  | 3.5858          | 0.3663   |
| 3.4303        | 8.7362  | 30000  | 3.5769          | 0.3668   |
| 3.3287        | 9.0274  | 31000  | 3.5821          | 0.3670   |
| 3.3702        | 9.3186  | 32000  | 3.5814          | 0.3673   |
| 3.3958        | 9.6098  | 33000  | 3.5711          | 0.3675   |
| 3.4182        | 9.9010  | 34000  | 3.5638          | 0.3683   |
| 3.3301        | 10.1922 | 35000  | 3.5756          | 0.3681   |
| 3.3746        | 10.4834 | 36000  | 3.5671          | 0.3685   |
| 3.3964        | 10.7747 | 37000  | 3.5621          | 0.3688   |
| 3.2846        | 11.0658 | 38000  | 3.5708          | 0.3690   |
| 3.3397        | 11.3570 | 39000  | 3.5662          | 0.3690   |
| 3.3675        | 11.6483 | 40000  | 3.5608          | 0.3697   |
| 3.3618        | 11.9395 | 41000  | 3.5535          | 0.3700   |
| 3.2947        | 12.2306 | 42000  | 3.5662          | 0.3694   |
| 3.3348        | 12.5219 | 43000  | 3.5599          | 0.3700   |
| 3.3386        | 12.8131 | 44000  | 3.5525          | 0.3705   |
| 3.2812        | 13.1043 | 45000  | 3.5638          | 0.3701   |
| 3.3061        | 13.3955 | 46000  | 3.5608          | 0.3699   |
| 3.3265        | 13.6867 | 47000  | 3.5513          | 0.3709   |
| 3.3405        | 13.9779 | 48000  | 3.5451          | 0.3712   |
| 3.2761        | 14.2691 | 49000  | 3.5578          | 0.3708   |
| 3.3001        | 14.5603 | 50000  | 3.5523          | 0.3713   |
| 3.3161        | 14.8515 | 51000  | 3.5440          | 0.3718   |
| 3.2441        | 15.1427 | 52000  | 3.5590          | 0.3707   |
| 3.2836        | 15.4339 | 53000  | 3.5512          | 0.3716   |
| 3.3039        | 15.7251 | 54000  | 3.5427          | 0.3721   |
| 3.1908        | 16.0163 | 55000  | 3.5534          | 0.3716   |
| 3.2603        | 16.3075 | 56000  | 3.5535          | 0.3717   |
| 3.2797        | 16.5988 | 57000  | 3.5465          | 0.3722   |
| 3.3015        | 16.8900 | 58000  | 3.5399          | 0.3728   |
| 3.214         | 17.1811 | 59000  | 3.5574          | 0.3718   |
| 3.2536        | 17.4724 | 60000  | 3.5466          | 0.3723   |
| 3.2649        | 17.7636 | 61000  | 3.5422          | 0.3726   |
| 3.1855        | 18.0547 | 62000  | 3.5523          | 0.3722   |
| 3.234         | 18.3460 | 63000  | 3.5505          | 0.3724   |
| 3.2567        | 18.6372 | 64000  | 3.5431          | 0.3728   |
| 3.2589        | 18.9284 | 65000  | 3.5341          | 0.3735   |
| 3.2112        | 19.2196 | 66000  | 3.5503          | 0.3726   |
| 3.2412        | 19.5108 | 67000  | 3.5466          | 0.3730   |
| 3.248         | 19.8020 | 68000  | 3.5371          | 0.3736   |
| 3.1563        | 20.0932 | 69000  | 3.5551          | 0.3727   |
| 3.217         | 20.3844 | 70000  | 3.5472          | 0.3732   |
| 3.2238        | 20.6756 | 71000  | 3.5425          | 0.3733   |
| 3.2515        | 20.9669 | 72000  | 3.5328          | 0.3739   |
| 3.195         | 21.2580 | 73000  | 3.5512          | 0.3728   |
| 3.2048        | 21.5492 | 74000  | 3.5449          | 0.3733   |
| 3.2347        | 21.8405 | 75000  | 3.5347          | 0.3742   |
| 3.1672        | 22.1316 | 76000  | 3.5518          | 0.3730   |
| 3.1982        | 22.4229 | 77000  | 3.5476          | 0.3735   |
| 3.2172        | 22.7141 | 78000  | 3.5398          | 0.3737   |
| 3.1973        | 23.0052 | 79000  | 3.5512          | 0.3734   |
| 3.167         | 23.2965 | 80000  | 3.5488          | 0.3736   |
| 3.2075        | 23.5877 | 81000  | 3.5396          | 0.3741   |
| 3.2149        | 23.8789 | 82000  | 3.5350          | 0.3744   |
| 3.1563        | 24.1701 | 83000  | 3.5525          | 0.3734   |
| 3.1832        | 24.4613 | 84000  | 3.5431          | 0.3740   |
| 3.1941        | 24.7525 | 85000  | 3.5368          | 0.3745   |
| 3.1068        | 25.0437 | 86000  | 3.5506          | 0.3738   |
| 3.1546        | 25.3349 | 87000  | 3.5476          | 0.3741   |
| 3.1857        | 25.6261 | 88000  | 3.5417          | 0.3743   |
| 3.21          | 25.9174 | 89000  | 3.5322          | 0.3751   |
| 3.1305        | 26.2085 | 90000  | 3.5505          | 0.3742   |
| 3.1673        | 26.4997 | 91000  | 3.5474          | 0.3740   |
| 3.1799        | 26.7910 | 92000  | 3.5412          | 0.3745   |
| 3.1123        | 27.0821 | 93000  | 3.5531          | 0.3741   |
| 3.1516        | 27.3733 | 94000  | 3.5482          | 0.3744   |
| 3.1699        | 27.6646 | 95000  | 3.5420          | 0.3744   |
| 3.193         | 27.9558 | 96000  | 3.5334          | 0.3751   |
| 3.1241        | 28.2470 | 97000  | 3.5512          | 0.3743   |
| 3.1584        | 28.5382 | 98000  | 3.5444          | 0.3747   |
| 3.1838        | 28.8294 | 99000  | 3.5358          | 0.3755   |
| 3.1031        | 29.1206 | 100000 | 3.5505          | 0.3745   |
| 3.1339        | 29.4118 | 101000 | 3.5469          | 0.3748   |
| 3.1536        | 29.7030 | 102000 | 3.5419          | 0.3749   |
| 3.1805        | 29.9942 | 103000 | 3.5314          | 0.3755   |
| 3.1183        | 30.2854 | 104000 | 3.5509          | 0.3745   |
| 3.129         | 30.5766 | 105000 | 3.5427          | 0.3751   |
| 3.146         | 30.8678 | 106000 | 3.5382          | 0.3752   |
| 3.1008        | 31.1590 | 107000 | 3.5534          | 0.3746   |
| 3.1194        | 31.4502 | 108000 | 3.5460          | 0.3749   |
| 3.1194        | 31.7415 | 109000 | 3.5423          | 0.3752   |
| 3.0636        | 32.0326 | 110000 | 3.5524          | 0.3747   |
| 3.1026        | 32.3238 | 111000 | 3.5523          | 0.3749   |
| 3.1189        | 32.6151 | 112000 | 3.5448          | 0.3755   |
| 3.1413        | 32.9063 | 113000 | 3.5360          | 0.3757   |
| 3.0919        | 33.1974 | 114000 | 3.5544          | 0.3749   |
| 3.0985        | 33.4887 | 115000 | 3.5457          | 0.3752   |
| 3.1181        | 33.7799 | 116000 | 3.5391          | 0.3757   |
| 3.0525        | 34.0711 | 117000 | 3.5525          | 0.3750   |
| 3.0902        | 34.3623 | 118000 | 3.5551          | 0.3749   |
| 3.1092        | 34.6535 | 119000 | 3.5476          | 0.3755   |
| 3.1216        | 34.9447 | 120000 | 3.5423          | 0.3756   |
| 3.0606        | 35.2359 | 121000 | 3.5529          | 0.3751   |
| 3.0835        | 35.5271 | 122000 | 3.5473          | 0.3756   |
| 3.1028        | 35.8183 | 123000 | 3.5453          | 0.3757   |

Framework versions

  • Transformers 4.55.2
  • Pytorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
Safetensors · Model size: 0.1B params · Tensor type: F32
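
How to use

The card lists roughly 0.1B parameters stored as F32 Safetensors, so the model loads comfortably even on CPU. A minimal usage sketch, assuming a causal language model (the base architecture is not stated above) and a placeholder repo id:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id; substitute the actual namespace/name on the Hub.
repo_id = "your-namespace/exceptions_exp2_resemble_to_drop_frequency_1032"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.float32)

inputs = tokenizer("Hello", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```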