add_grad_lr5e-4_batch128_train17_eval16

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0154
  • Accuracy: 0.9679

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 128
  • eval_batch_size: 512
  • seed: 23452399
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1
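
The learning-rate schedule implied by these settings (lr_scheduler_type=cosine with warmup_ratio=0.1) can be sketched as a standalone function. This is a minimal sketch, not the exact library implementation; the total step count of 15,625 is inferred from the training log below, where the Epoch column advances 0.0032 every 50 steps.

```python
import math

def cosine_warmup_lr(step, base_lr=5e-4, total_steps=15625, warmup_ratio=0.1):
    """Learning rate at a given optimizer step under linear warmup
    followed by cosine decay to zero (lr_scheduler_type=cosine,
    lr_scheduler_warmup_ratio=0.1)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # linear warmup from 0 up to base_lr
        return base_lr * step / warmup_steps
    # cosine decay from base_lr down to 0 over the remaining steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Under this schedule the peak rate of 5e-4 is reached at roughly step 1,562 and decays smoothly to zero by the end of the single epoch.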

Training results

Training Loss | Epoch | Step | Validation Loss | Accuracy
No log 0 0 2.7193 0.0
2.3496 0.0032 50 2.3562 0.0
2.3333 0.0064 100 2.3112 0.0
2.3107 0.0096 150 2.2973 0.0
2.5122 0.0128 200 2.3337 0.0
2.2414 0.016 250 2.2790 0.0
2.2631 0.0192 300 2.2312 0.0
2.193 0.0224 350 2.1866 0.0
2.1956 0.0256 400 2.2871 0.0
2.3296 0.0288 450 2.2533 0.0
2.1544 0.032 500 2.1678 0.0
2.1619 0.0352 550 2.1428 0.0
2.1578 0.0384 600 2.1160 0.0
2.0805 0.0416 650 2.1378 0.0
2.1379 0.0448 700 2.1923 0.0
2.1105 0.048 750 2.0977 0.0
2.0908 0.0512 800 2.1495 0.0
2.1163 0.0544 850 2.0832 0.0
2.1117 0.0576 900 2.2732 0.0
2.1783 0.0608 950 2.1508 0.0
2.0848 0.064 1000 2.0197 0.0
1.9972 0.0672 1050 1.9375 0.0
1.8876 0.0704 1100 2.0081 0.0
1.8901 0.0736 1150 1.8045 0.0
1.7229 0.0768 1200 1.7674 0.0
1.8196 0.08 1250 1.9117 0.0
1.7262 0.0832 1300 1.9402 0.0
1.8496 0.0864 1350 1.7986 0.0
1.6415 0.0896 1400 1.6836 0.0
1.8074 0.0928 1450 1.6811 0.0
2.015 0.096 1500 2.0060 0.0
1.3949 0.0992 1550 1.7472 0.0
1.3884 0.1024 1600 1.5806 0.0001
1.3993 0.1056 1650 1.5009 0.0002
1.3922 0.1088 1700 1.3913 0.0014
1.3647 0.112 1750 1.3974 0.0002
1.3262 0.1152 1800 1.3437 0.0013
1.5573 0.1184 1850 1.6870 0.0001
1.3919 0.1216 1900 1.4983 0.0
1.3264 0.1248 1950 1.4067 0.0025
1.5054 0.128 2000 1.4885 0.0001
1.4031 0.1312 2050 1.7518 0.0003
1.3538 0.1344 2100 1.3727 0.0021
1.408 0.1376 2150 1.4749 0.0005
1.2191 0.1408 2200 1.3767 0.0015
1.3448 0.144 2250 1.6277 0.0007
1.3086 0.1472 2300 1.3881 0.002
1.529 0.1504 2350 1.3379 0.001
1.433 0.1536 2400 1.3497 0.0015
1.4327 0.1568 2450 1.3468 0.0015
1.213 0.16 2500 1.3858 0.004
1.2029 0.1632 2550 1.2968 0.0018
1.1857 0.1664 2600 1.4007 0.0019
1.1877 0.1696 2650 1.2652 0.0038
1.2281 0.1728 2700 1.2081 0.0039
1.1379 0.176 2750 1.3022 0.0036
1.2458 0.1792 2800 1.2313 0.0036
1.114 0.1824 2850 1.3706 0.0044
1.2269 0.1856 2900 1.4575 0.0032
1.1221 0.1888 2950 1.2294 0.0018
1.2582 0.192 3000 1.1945 0.0044
1.1009 0.1952 3050 1.3201 0.0056
1.1572 0.1984 3100 1.3459 0.0045
1.2497 0.2016 3150 1.3372 0.0047
1.2188 0.2048 3200 1.4481 0.0039
1.2417 0.208 3250 1.2609 0.0034
1.31 0.2112 3300 1.3970 0.003
1.2816 0.2144 3350 1.4225 0.0034
1.1755 0.2176 3400 1.2073 0.004
1.2698 0.2208 3450 1.2303 0.005
1.1897 0.224 3500 1.7764 0.0004
1.1248 0.2272 3550 1.4761 0.0024
1.1501 0.2304 3600 1.3269 0.0039
1.2388 0.2336 3650 1.3460 0.0061
1.2204 0.2368 3700 1.6976 0.0006
1.1572 0.24 3750 1.3979 0.005
1.1617 0.2432 3800 1.3160 0.0053
1.1609 0.2464 3850 1.1778 0.0058
1.1811 0.2496 3900 2.0811 0.0003
1.1343 0.2528 3950 1.2704 0.0064
1.1775 0.256 4000 1.2977 0.005
1.1971 0.2592 4050 1.2857 0.0035
1.1541 0.2624 4100 1.2041 0.0071
1.2055 0.2656 4150 1.7311 0.0059
1.082 0.2688 4200 1.2451 0.0045
1.1024 0.272 4250 1.2491 0.0054
1.1704 0.2752 4300 1.4003 0.0079
1.1118 0.2784 4350 1.3309 0.0072
1.1553 0.2816 4400 1.2228 0.0102
1.0607 0.2848 4450 1.4611 0.0026
1.2239 0.288 4500 1.1127 0.0071
1.0648 0.2912 4550 1.4199 0.0089
1.1317 0.2944 4600 1.2533 0.0108
1.3889 0.2976 4650 1.7725 0.0
1.1914 0.3008 4700 1.3731 0.0092
1.1129 0.304 4750 1.1425 0.0128
1.1046 0.3072 4800 1.1032 0.0091
1.2364 0.3104 4850 1.1844 0.0088
1.013 0.3136 4900 1.2415 0.0083
0.8808 0.3168 4950 0.9291 0.0089
0.9546 0.32 5000 1.2228 0.0036
0.8682 0.3232 5050 1.2901 0.0092
0.9839 0.3264 5100 1.4013 0.0066
0.5929 0.3296 5150 0.5889 0.0295
0.7361 0.3328 5200 1.3741 0.0038
0.4741 0.336 5250 1.0211 0.003
0.4972 0.3392 5300 0.4394 0.0297
0.687 0.3424 5350 0.6737 0.009
0.3421 0.3456 5400 0.9852 0.0095
0.2695 0.3488 5450 0.8747 0.0306
0.3682 0.352 5500 1.0271 0.01
0.5233 0.3552 5550 0.3635 0.0536
0.2067 0.3584 5600 0.4997 0.0372
0.4716 0.3616 5650 1.0124 0.0192
0.2115 0.3648 5700 0.3417 0.068
0.1741 0.368 5750 0.5971 0.0381
0.5953 0.3712 5800 0.3584 0.0459
0.158 0.3744 5850 0.5634 0.024
0.2555 0.3776 5900 0.9047 0.0052
0.3028 0.3808 5950 0.4474 0.0349
0.4047 0.384 6000 0.7320 0.0332
0.0819 0.3872 6050 0.7320 0.1607
0.055 0.3904 6100 0.1029 0.6687
0.3385 0.3936 6150 0.4378 0.3088
0.0659 0.3968 6200 0.1799 0.5442
0.0279 0.4 6250 0.2499 0.3857
0.349 0.4032 6300 0.6052 0.0537
0.201 0.4064 6350 0.3670 0.3453
0.188 0.4096 6400 0.2504 0.4782
0.0179 0.4128 6450 0.1064 0.6577
0.0967 0.416 6500 0.3356 0.5355
0.1234 0.4192 6550 0.1511 0.5542
0.057 0.4224 6600 0.3994 0.3332
0.0906 0.4256 6650 0.0901 0.7709
0.0816 0.4288 6700 0.5050 0.3015
0.0847 0.432 6750 0.1828 0.6641
0.0977 0.4352 6800 0.3029 0.4298
0.0249 0.4384 6850 0.2638 0.6238
0.0274 0.4416 6900 0.0834 0.7076
0.0468 0.4448 6950 0.1984 0.5072
0.0402 0.448 7000 0.2242 0.5004
0.0339 0.4512 7050 0.1273 0.7642
0.0681 0.4544 7100 0.2063 0.6743
0.0159 0.4576 7150 0.0512 0.8216
0.0453 0.4608 7200 0.0305 0.873
0.0293 0.464 7250 0.1257 0.7176
0.0825 0.4672 7300 0.1057 0.7127
0.0413 0.4704 7350 0.2132 0.5256
0.0222 0.4736 7400 0.0819 0.7338
0.0446 0.4768 7450 0.1115 0.619
0.0472 0.48 7500 0.1122 0.7335
0.0168 0.4832 7550 0.0228 0.8987
0.0143 0.4864 7600 0.0174 0.9336
0.0926 0.4896 7650 0.2201 0.5565
0.0449 0.4928 7700 0.0199 0.9279
0.0122 0.496 7750 0.0204 0.9205
0.0291 0.4992 7800 0.0217 0.913
0.0249 0.5024 7850 0.0330 0.8872
0.0101 0.5056 7900 0.0563 0.7469
0.0029 0.5088 7950 0.0160 0.9347
0.0748 0.512 8000 0.0773 0.6639
0.048 0.5152 8050 0.0882 0.7123
0.0309 0.5184 8100 0.0436 0.8712
0.0223 0.5216 8150 0.0210 0.9245
0.0218 0.5248 8200 0.0531 0.8543
0.08 0.528 8250 0.2518 0.3675
0.0201 0.5312 8300 0.0381 0.8501
0.0052 0.5344 8350 0.0458 0.8214
0.0464 0.5376 8400 0.0550 0.8367
0.0098 0.5408 8450 0.2861 0.5415
0.0199 0.544 8500 0.0207 0.909
0.0076 0.5472 8550 0.0174 0.9512
0.023 0.5504 8600 0.0179 0.9215
0.0129 0.5536 8650 0.0401 0.9076
0.0096 0.5568 8700 0.0285 0.9029
0.002 0.56 8750 0.0055 0.9769
0.0112 0.5632 8800 0.0285 0.8932
0.0111 0.5664 8850 0.0094 0.9653
0.0096 0.5696 8900 0.0056 0.9804
0.008 0.5728 8950 0.0202 0.9102
0.0172 0.576 9000 0.0071 0.9773
0.0366 0.5792 9050 0.1993 0.485
0.008 0.5824 9100 0.0092 0.9691
0.0281 0.5856 9150 0.0980 0.8018
0.0093 0.5888 9200 0.1377 0.7855
0.0149 0.592 9250 0.0335 0.8934
0.019 0.5952 9300 0.0274 0.885
0.0148 0.5984 9350 0.1392 0.7484
0.0237 0.6016 9400 0.0190 0.928
0.0178 0.6048 9450 0.1124 0.8137
0.003 0.608 9500 0.0180 0.9243
0.0029 0.6112 9550 0.0642 0.8391
0.0031 0.6144 9600 0.0185 0.9172
0.0049 0.6176 9650 0.0528 0.8807
0.0191 0.6208 9700 0.0061 0.9773
0.0205 0.624 9750 0.0043 0.9807
0.0019 0.6272 9800 0.0045 0.9805
0.0031 0.6304 9850 0.0303 0.9045
0.0009 0.6336 9900 0.0133 0.9596
0.0024 0.6368 9950 0.0269 0.8941
0.001 0.64 10000 0.0045 0.9839
0.0131 0.6432 10050 0.0843 0.8034
0.0033 0.6464 10100 0.0224 0.941
0.0028 0.6496 10150 0.0009 0.9967
0.0115 0.6528 10200 0.0274 0.9282
0.0016 0.656 10250 0.0244 0.9264
0.0005 0.6592 10300 0.0145 0.9401
0.0027 0.6624 10350 0.0203 0.9173
0.0024 0.6656 10400 0.0362 0.8816
0.0012 0.6688 10450 0.0147 0.9644
0.0009 0.672 10500 0.0094 0.9601
0.0009 0.6752 10550 0.0408 0.8971
0.003 0.6784 10600 0.0695 0.8411
0.0062 0.6816 10650 0.0060 0.9744
0.0149 0.6848 10700 0.1436 0.7635
0.0013 0.688 10750 0.0308 0.9219
0.0001 0.6912 10800 0.0090 0.9726
0.0002 0.6944 10850 0.0019 0.9906
0.0005 0.6976 10900 0.0164 0.9537
0.0014 0.7008 10950 0.0027 0.9903
0.0003 0.704 11000 0.0625 0.8677
0.0001 0.7072 11050 0.0289 0.9497
0.0 0.7104 11100 0.0009 0.9953
0.0032 0.7136 11150 0.0029 0.9894
0.0015 0.7168 11200 0.1071 0.8827
0.0001 0.72 11250 0.0003 0.999
0.0001 0.7232 11300 0.0050 0.978
0.0003 0.7264 11350 0.0026 0.9904
0.0 0.7296 11400 0.0086 0.9699
0.0 0.7328 11450 0.0741 0.8955
0.0001 0.736 11500 0.0001 0.9996
0.0 0.7392 11550 0.0152 0.9596
0.0003 0.7424 11600 0.0089 0.9739
0.0 0.7456 11650 0.0253 0.9411
0.0028 0.7488 11700 0.0472 0.9112
0.0 0.752 11750 0.0061 0.9664
0.0 0.7552 11800 0.0025 0.988
0.0 0.7584 11850 0.0010 0.9954
0.0 0.7616 11900 0.0018 0.9911
0.0 0.7648 11950 0.0011 0.9951
0.0001 0.768 12000 0.0076 0.9757
0.0 0.7712 12050 0.0011 0.9958
0.0 0.7744 12100 0.0006 0.9972
0.0 0.7776 12150 0.0006 0.9974
0.0 0.7808 12200 0.0005 0.9974
0.0 0.784 12250 0.0004 0.998
0.0 0.7872 12300 0.0006 0.9973
0.0 0.7904 12350 0.0007 0.9967
0.0 0.7936 12400 0.0008 0.9964
0.0 0.7968 12450 0.0005 0.9979
0.0 0.8 12500 0.0005 0.998
0.0 0.8032 12550 0.0004 0.9981
0.0001 0.8064 12600 0.0153 0.9756
0.0 0.8096 12650 0.0005 0.9974
0.0004 0.8128 12700 0.0001 0.9993
0.0 0.816 12750 0.0061 0.977
0.0001 0.8192 12800 0.0378 0.918
0.0 0.8224 12850 0.0100 0.9737
0.0 0.8256 12900 0.0151 0.9626
0.0021 0.8288 12950 0.0203 0.9538
0.0 0.832 13000 0.0159 0.9614
0.0 0.8352 13050 0.0172 0.9586
0.0 0.8384 13100 0.0141 0.9635
0.0 0.8416 13150 0.0114 0.9677
0.0 0.8448 13200 0.0114 0.9695
0.0 0.848 13250 0.0071 0.9784
0.0 0.8512 13300 0.0052 0.9834
0.0 0.8544 13350 0.0065 0.9804
0.0 0.8576 13400 0.0071 0.9793
0.0 0.8608 13450 0.0073 0.9784
0.0 0.864 13500 0.0293 0.9458
0.0 0.8672 13550 0.0009 0.997
0.0 0.8704 13600 0.0009 0.997
0.0 0.8736 13650 0.0409 0.9368
0.0 0.8768 13700 0.0655 0.9229
0.0 0.88 13750 0.0569 0.93
0.0 0.8832 13800 0.0523 0.9331
0.0 0.8864 13850 0.0497 0.935
0.0 0.8896 13900 0.0520 0.9352
0.0 0.8928 13950 0.0516 0.9356
0.0 0.896 14000 0.0143 0.9656
0.0 0.8992 14050 0.0115 0.9702
0.0 0.9024 14100 0.0104 0.9721
0.0 0.9056 14150 0.0863 0.9043
0.0 0.9088 14200 0.0537 0.9266
0.0 0.912 14250 0.0464 0.932
0.0 0.9152 14300 0.0420 0.9348
0.0 0.9184 14350 0.0387 0.9382
0.0 0.9216 14400 0.0319 0.9442
0.0 0.9248 14450 0.0274 0.9489
0.0 0.928 14500 0.0263 0.9506
0.0 0.9312 14550 0.0255 0.9515
0.0 0.9344 14600 0.0247 0.9533
0.0 0.9376 14650 0.0230 0.9563
0.0 0.9408 14700 0.0217 0.9578
0.0 0.944 14750 0.0210 0.9587
0.0 0.9472 14800 0.0207 0.9588
0.0 0.9504 14850 0.0201 0.9597
0.0 0.9536 14900 0.0189 0.9619
0.0 0.9568 14950 0.0166 0.9657
0.0 0.96 15000 0.0161 0.967
0.0 0.9632 15050 0.0159 0.9673
0.0 0.9664 15100 0.0157 0.9679
0.0 0.9696 15150 0.0156 0.9679
0.0 0.9728 15200 0.0155 0.968
0.0 0.976 15250 0.0155 0.968
0.0 0.9792 15300 0.0155 0.968
0.0 0.9824 15350 0.0155 0.968
0.0 0.9856 15400 0.0155 0.9679
0.0 0.9888 15450 0.0154 0.9679
0.0 0.992 15500 0.0154 0.9679
0.0 0.9952 15550 0.0154 0.9679
0.0 0.9984 15600 0.0154 0.9679
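
The Epoch and Step columns above are mutually consistent: the epoch fraction advances 0.0032 every 50 steps, i.e. one epoch spans 15,625 optimizer steps. A quick sanity check (the implied dataset size of about 2,000,000 examples is an inference that assumes no gradient accumulation):

```python
# The Epoch column advances 0.0032 every 50 steps,
# so one epoch corresponds to 50 / 0.0032 = 15,625 optimizer steps.
steps_per_epoch = round(50 / 0.0032)   # 15625
train_batch_size = 128                 # from the hyperparameters above

# Implied training-set size, assuming no gradient accumulation.
approx_train_examples = steps_per_epoch * train_batch_size  # ~2,000,000

def epoch_at(step: int) -> float:
    """Epoch fraction reached after a given optimizer step."""
    return step / steps_per_epoch
```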

Framework versions

  • Transformers 4.46.0
  • Pytorch 2.5.1
  • Datasets 3.1.0
  • Tokenizers 0.20.1
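
To reproduce this environment, the listed versions can be pinned directly (assuming the standard PyPI distribution names for each package):

```shell
pip install transformers==4.46.0 torch==2.5.1 datasets==3.1.0 tokenizers==0.20.1
```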

Model details

  • Format: Safetensors
  • Model size: 10.7M params
  • Tensor type: F32