train_wic_1745950290

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the wic (Word-in-Context) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2121
  • Num Input Tokens Seen: 12716696

Model description

More information needed

Intended uses & limitations

More information needed
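
The card gives no usage guidance. As a minimal loading sketch, assuming this repo is the PEFT adapter implied by the Framework versions below (the repo id rbelanec/train_wic_1745950290 and base model come from this card; the dtype/device settings and the example prompt are illustrative assumptions, since the training prompt format is not documented):

```python
# Minimal sketch, assuming this repo is a PEFT adapter for the base model
# named above; dtype/device settings are illustrative, not from the card.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_wic_1745950290"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id).eval()

# Hypothetical WiC-style query; the actual fine-tuning prompt template
# is not documented in this card.
messages = [{
    "role": "user",
    "content": 'Does "bank" mean the same thing in "river bank" and '
               '"bank deposit"? Answer yes or no.',
}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=5)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```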

Training and evaluation data

More information needed
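
The data section is empty; wic is presumably the Word-in-Context task from SuperGLUE. A sketch for inspecting the raw fields (the canonical "super_glue" dataset id is an assumption and its availability depends on your datasets version; the prompt template used for training is not documented here):

```python
# Sketch: load WiC from SuperGLUE with the datasets library; the
# "super_glue" id is assumed and may differ across library versions.
from datasets import load_dataset

wic = load_dataset("super_glue", "wic")
ex = wic["train"][0]
# Each example asks whether the target word is used in the same sense in
# both sentences (label 1) or not (label 0).
print(ex["word"], "|", ex["sentence1"], "|", ex["sentence2"], "|", ex["label"])
```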

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged configuration sketch in code follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
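
Expressed as Transformers TrainingArguments, a hedged reconstruction of the list above (only the listed values come from the card; everything else is left at library defaults):

```python
# Hedged sketch: the listed hyperparameters mapped onto TrainingArguments;
# warmup, weight decay, and other settings are unknown and left at defaults.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_wic_1745950290",  # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    gradient_accumulation_steps=2,  # 2 * 2 = total train batch size of 4
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    max_steps=40_000,
)
```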

Training results
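
The best validation loss (0.2121) is reached at step 2400, epoch ~1.96, and matches the headline evaluation loss above. Beyond that point the training loss collapses toward zero while the validation loss climbs steadily to ~1.40 by step 40000, i.e. the model overfits for most of the run.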

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.3476 0.1637 200 0.2516 63344
0.4373 0.3275 400 0.2390 126720
0.2262 0.4912 600 0.2437 190304
0.2825 0.6549 800 0.2133 254384
0.1255 0.8187 1000 0.2226 318128
0.304 0.9824 1200 0.2473 381920
0.2219 1.1457 1400 0.2425 445096
0.3132 1.3095 1600 0.2414 508744
0.1823 1.4732 1800 0.2225 572408
0.1589 1.6369 2000 0.2585 635736
0.2242 1.8007 2200 0.2577 699464
0.2306 1.9644 2400 0.2121 763192
0.3907 2.1277 2600 0.2789 826784
0.2222 2.2914 2800 0.3554 890336
0.1247 2.4552 3000 0.2748 953840
0.1097 2.6189 3200 0.3896 1017600
0.1378 2.7826 3400 0.4007 1081104
0.1817 2.9464 3600 0.3245 1144576
0.0012 3.1097 3800 0.4846 1208440
0.065 3.2734 4000 0.4780 1272216
0.004 3.4372 4200 0.4518 1335496
0.0017 3.6009 4400 0.4055 1398984
0.1065 3.7646 4600 0.3807 1462856
0.1932 3.9284 4800 0.3905 1526280
0.0002 4.0917 5000 0.6511 1589584
0.0001 4.2554 5200 0.5897 1653024
0.0005 4.4192 5400 0.5866 1716432
0.0006 4.5829 5600 0.6255 1779984
0.0003 4.7466 5800 0.7438 1843936
0.124 4.9104 6000 0.4447 1907808
0.2039 5.0737 6200 0.6593 1971048
0.0 5.2374 6400 0.7280 2034808
0.1956 5.4011 6600 0.5445 2098088
0.0 5.5649 6800 0.8355 2161640
0.0552 5.7286 7000 0.4419 2225432
0.0571 5.8923 7200 0.5244 2289032
0.0676 6.0557 7400 0.5892 2352656
0.0 6.2194 7600 0.7670 2416160
0.0002 6.3831 7800 0.6973 2479728
0.0 6.5469 8000 1.0138 2543168
0.0 6.7106 8200 0.8165 2606560
0.0001 6.8743 8400 0.6571 2670208
0.0002 7.0377 8600 0.6532 2733584
0.0002 7.2014 8800 0.7033 2797008
0.0 7.3651 9000 0.7892 2860576
0.0207 7.5289 9200 0.7275 2924256
0.0005 7.6926 9400 0.5924 2988272
0.0002 7.8563 9600 0.8637 3051776
0.0001 8.0196 9800 0.6596 3114992
0.0001 8.1834 10000 0.6833 3179200
0.0001 8.3471 10200 0.7279 3242496
0.0 8.5108 10400 0.6826 3306112
0.0 8.6746 10600 0.6635 3369760
0.0002 8.8383 10800 0.6550 3433360
0.0004 9.0016 11000 0.5340 3496680
0.0853 9.1654 11200 0.7227 3560648
0.0003 9.3291 11400 0.5790 3624200
0.0 9.4928 11600 0.7232 3687560
0.0001 9.6566 11800 0.7750 3751288
0.0001 9.8203 12000 0.7379 3814952
0.0 9.9840 12200 0.8288 3878120
0.0001 10.1474 12400 0.8097 3941616
0.0001 10.3111 12600 0.7123 4005216
0.0 10.4748 12800 0.9077 4068912
0.0 10.6386 13000 0.7398 4132608
0.0 10.8023 13200 0.7638 4196096
0.0002 10.9660 13400 0.6866 4259680
0.2094 11.1293 13600 0.7035 4323128
0.0 11.2931 13800 0.8195 4386856
0.0001 11.4568 14000 0.8012 4450296
0.0001 11.6205 14200 0.7842 4513544
0.0 11.7843 14400 0.8652 4576984
0.1456 11.9480 14600 0.7277 4640904
0.0 12.1113 14800 0.8191 4704360
0.0 12.2751 15000 0.7924 4768152
0.0 12.4388 15200 0.8922 4832152
0.0 12.6025 15400 0.9474 4895192
0.0001 12.7663 15600 0.6687 4959112
0.0 12.9300 15800 0.7818 5022408
0.0 13.0933 16000 0.7858 5086016
0.0001 13.2571 16200 0.6874 5149920
0.0 13.4208 16400 0.8706 5213296
0.0 13.5845 16600 0.9227 5276672
0.0 13.7483 16800 0.8173 5340624
0.0 13.9120 17000 0.8825 5403792
0.0 14.0753 17200 0.7801 5466936
0.0 14.2391 17400 0.9103 5530392
0.1469 14.4028 17600 0.9201 5593576
0.0 14.5665 17800 0.8883 5657288
0.0 14.7302 18000 0.9377 5721496
0.0 14.8940 18200 1.0536 5785096
0.0 15.0573 18400 1.0416 5848736
0.2094 15.2210 18600 1.0437 5912176
0.0004 15.3848 18800 0.7817 5976400
0.2063 15.5485 19000 0.8517 6040272
0.0 15.7122 19200 0.8170 6103424
0.0 15.8760 19400 0.7808 6166912
0.0 16.0393 19600 0.9356 6230320
0.0 16.2030 19800 0.8314 6294224
0.0 16.3668 20000 0.8287 6357984
0.0 16.5305 20200 0.8734 6421344
0.0001 16.6942 20400 0.6544 6485152
0.0 16.8580 20600 0.7260 6548768
0.0002 17.0213 20800 0.7262 6611792
0.0 17.1850 21000 0.7560 6675216
0.0 17.3488 21200 0.8159 6739088
0.0001 17.5125 21400 0.7390 6802352
0.0 17.6762 21600 0.8322 6866160
0.0 17.8400 21800 0.8785 6929936
0.0 18.0033 22000 0.8250 6993168
0.0 18.1670 22200 0.8725 7057008
0.0 18.3307 22400 0.9009 7120624
0.0 18.4945 22600 0.9297 7183872
0.0 18.6582 22800 0.9585 7247952
0.0 18.8219 23000 0.9874 7311488
0.0 18.9857 23200 1.0058 7374848
0.0 19.1490 23400 1.0229 7438160
0.0 19.3127 23600 1.0449 7501872
0.0 19.4765 23800 1.0629 7565520
0.0 19.6402 24000 1.0834 7629488
0.0 19.8039 24200 1.0971 7692992
0.0 19.9677 24400 1.1133 7756512
0.0 20.1310 24600 1.1195 7819816
0.0 20.2947 24800 1.1258 7883800
0.0 20.4585 25000 1.1465 7947944
0.0 20.6222 25200 1.1553 8011336
0.0 20.7859 25400 1.1654 8075000
0.0 20.9497 25600 1.1781 8138568
0.0 21.1130 25800 1.1810 8201872
0.0 21.2767 26000 1.1954 8265168
0.0 21.4404 26200 1.2150 8328704
0.0 21.6042 26400 1.2241 8392144
0.0 21.7679 26600 1.2185 8456096
0.0 21.9316 26800 1.2335 8519872
0.0 22.0950 27000 1.2356 8583464
0.0 22.2587 27200 1.2526 8646840
0.0 22.4224 27400 1.2589 8710600
0.0 22.5862 27600 1.2740 8774344
0.0 22.7499 27800 1.2897 8838024
0.0 22.9136 28000 1.2926 8901832
0.0 23.0770 28200 1.3024 8965184
0.0 23.2407 28400 1.3161 9028576
0.0 23.4044 28600 1.3127 9092256
0.0 23.5682 28800 1.3288 9155872
0.0 23.7319 29000 1.3275 9219312
0.0 23.8956 29200 1.3366 9283264
0.0 24.0589 29400 1.3375 9346992
0.0 24.2227 29600 1.3357 9410880
0.0 24.3864 29800 1.3381 9474704
0.0 24.5501 30000 1.3346 9538160
0.0 24.7139 30200 1.3497 9601792
0.0 24.8776 30400 1.3458 9664976
0.0 25.0409 30600 1.3536 9728232
0.0 25.2047 30800 1.3514 9791848
0.0 25.3684 31000 1.3532 9855400
0.0 25.5321 31200 1.3611 9918984
0.0 25.6959 31400 1.3571 9982872
0.0 25.8596 31600 1.3662 10046056
0.0 26.0229 31800 1.3653 10109568
0.0 26.1867 32000 1.3696 10173072
0.0 26.3504 32200 1.3687 10236512
0.0 26.5141 32400 1.3599 10299920
0.0 26.6779 32600 1.3592 10363808
0.0 26.8416 32800 1.3636 10427744
0.0 27.0049 33000 1.3804 10491384
0.0 27.1686 33200 1.3716 10555192
0.0 27.3324 33400 1.3719 10619080
0.0 27.4961 33600 1.3744 10682424
0.0 27.6598 33800 1.3782 10746024
0.0 27.8236 34000 1.3901 10809736
0.0 27.9873 34200 1.3770 10873448
0.0 28.1506 34400 1.3859 10936704
0.0 28.3144 34600 1.3836 11000112
0.0 28.4781 34800 1.3876 11063936
0.0 28.6418 35000 1.3929 11128160
0.0 28.8056 35200 1.3806 11191600
0.0 28.9693 35400 1.4023 11255184
0.0 29.1326 35600 1.3910 11318640
0.0 29.2964 35800 1.3930 11382352
0.0 29.4601 36000 1.4022 11446048
0.0 29.6238 36200 1.3920 11509328
0.0 29.7876 36400 1.4017 11573312
0.0 29.9513 36600 1.4065 11636752
0.0 30.1146 36800 1.4076 11700056
0.0 30.2783 37000 1.4148 11763352
0.0 30.4421 37200 1.4033 11826952
0.0 30.6058 37400 1.4028 11890888
0.0 30.7695 37600 1.4313 11954296
0.0 30.9333 37800 1.4140 12017784
0.0 31.0966 38000 1.4216 12081304
0.0 31.2603 38200 1.4222 12145240
0.0 31.4241 38400 1.4194 12208888
0.0 31.5878 38600 1.4188 12272344
0.0 31.7515 38800 1.4094 12335960
0.0 31.9153 39000 1.4163 12399064
0.0 32.0786 39200 1.4245 12462200
0.0 32.2423 39400 1.4164 12526024
0.0 32.4061 39600 1.4199 12589496
0.0 32.5698 39800 1.4315 12653080
0.0 32.7335 40000 1.4047 12716696

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1