train_wsc_1745950302

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the wsc dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3497
  • Num Input Tokens Seen: 14002704

Model description

This checkpoint is a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct rather than a full set of fine-tuned weights (see the framework versions below); no further description has been provided.
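Because the repository ships an adapter, loading it presumably means attaching the adapter weights to the frozen base model. A minimal sketch, assuming the standard transformers and peft loading APIs and that the adapter is hosted at rbelanec/train_wsc_1745950302:

```python
# Minimal loading sketch (assumption: this repo contains a PEFT adapter
# for meta-llama/Meta-Llama-3-8B-Instruct, as the framework versions suggest).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_wsc_1745950302"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Attach the fine-tuned adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()
```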

Intended uses & limitations

More information needed

Training and evaluation data

More information needed. The dataset name wsc suggests the Winograd Schema Challenge coreference task from SuperGLUE, but the split and preprocessing used here are not documented.
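If wsc does refer to SuperGLUE's WSC, the raw task can be pulled with the datasets library; the dataset id and config name below are assumptions, not taken from this card:

```python
# Assumption: "wsc" is the SuperGLUE Winograd Schema Challenge task.
from datasets import load_dataset

wsc = load_dataset("super_glue", "wsc")
print(wsc)              # splits: train / validation / test
print(wsc["train"][0])  # fields include text, span1_text, span2_text, label
```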

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a code sketch mapping them onto the Trainer API follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4 (train_batch_size × gradient_accumulation_steps)
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
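
Expressed through the transformers Trainer API, the configuration above corresponds roughly to the sketch below. The mapping onto TrainingArguments fields, and the 200-step evaluation interval inferred from the results table, are assumptions; the actual training script is not provided.

```python
# Rough reconstruction of the listed hyperparameters as TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_wsc_1745950302",
    learning_rate=5e-05,
    per_device_train_batch_size=2,   # train_batch_size: 2
    per_device_eval_batch_size=2,    # eval_batch_size: 2
    seed=123,
    gradient_accumulation_steps=2,   # total train batch size: 2 * 2 = 4
    optim="adamw_torch",             # betas=(0.9, 0.999), eps=1e-08 are the defaults
    lr_scheduler_type="cosine",
    max_steps=40_000,                # training_steps: 40000
    eval_strategy="steps",           # assumed: the results table logs eval every 200 steps
    eval_steps=200,
    logging_steps=200,
)
```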

Training results

Validation loss bottoms out at 0.3497 (step 600), the value reported above, and then rises steadily once the training loss collapses to 0.0 around step 2400, a classic overfitting signature for a 320-epoch run on a small dataset.

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.3297 | 1.6024 | 200 | 0.3604 | 70144 |
| 0.3381 | 3.2008 | 400 | 0.3677 | 140304 |
| 0.3502 | 4.8032 | 600 | 0.3497 | 210240 |
| 0.334 | 6.4016 | 800 | 0.3577 | 279952 |
| 0.3381 | 8.0 | 1000 | 0.3531 | 350224 |
| 0.3064 | 9.6024 | 1200 | 0.3915 | 420256 |
| 0.3082 | 11.2008 | 1400 | 0.7262 | 490496 |
| 0.2669 | 12.8032 | 1600 | 0.4497 | 560224 |
| 0.1512 | 14.4016 | 1800 | 1.0749 | 630560 |
| 0.3798 | 16.0 | 2000 | 1.4402 | 699648 |
| 0.0976 | 17.6024 | 2200 | 1.4632 | 769232 |
| 0.0 | 19.2008 | 2400 | 1.9862 | 839344 |
| 0.0 | 20.8032 | 2600 | 2.5075 | 909744 |
| 0.0 | 22.4016 | 2800 | 2.5122 | 979312 |
| 0.0 | 24.0 | 3000 | 2.5175 | 1049184 |
| 0.0 | 25.6024 | 3200 | 2.5244 | 1119552 |
| 0.0 | 27.2008 | 3400 | 2.5554 | 1189008 |
| 0.0 | 28.8032 | 3600 | 2.5826 | 1259168 |
| 0.0 | 30.4016 | 3800 | 2.5902 | 1329056 |
| 0.0 | 32.0 | 4000 | 2.6561 | 1399280 |
| 0.0 | 33.6024 | 4200 | 2.6548 | 1469920 |
| 0.0 | 35.2008 | 4400 | 2.6885 | 1539184 |
| 0.0 | 36.8032 | 4600 | 2.6695 | 1609648 |
| 0.0 | 38.4016 | 4800 | 2.6871 | 1679792 |
| 0.0 | 40.0 | 5000 | 2.7425 | 1749008 |
| 0.0 | 41.6024 | 5200 | 2.7435 | 1818832 |
| 0.0 | 43.2008 | 5400 | 2.7672 | 1889136 |
| 0.0 | 44.8032 | 5600 | 2.7891 | 1959008 |
| 0.0 | 46.4016 | 5800 | 2.8196 | 2028320 |
| 0.0 | 48.0 | 6000 | 2.8463 | 2098928 |
| 0.0 | 49.6024 | 6200 | 2.8755 | 2168688 |
| 0.0 | 51.2008 | 6400 | 2.8775 | 2238752 |
| 0.0 | 52.8032 | 6600 | 2.9044 | 2308816 |
| 0.0 | 54.4016 | 6800 | 2.9534 | 2379328 |
| 0.0 | 56.0 | 7000 | 2.9391 | 2448704 |
| 0.0 | 57.6024 | 7200 | 2.9866 | 2519008 |
| 0.0 | 59.2008 | 7400 | 3.0014 | 2588608 |
| 0.0 | 60.8032 | 7600 | 3.0447 | 2659072 |
| 0.0 | 62.4016 | 7800 | 3.0738 | 2728480 |
| 0.0 | 64.0 | 8000 | 3.1007 | 2798720 |
| 0.0 | 65.6024 | 8200 | 3.1122 | 2868672 |
| 0.0 | 67.2008 | 8400 | 3.1442 | 2939312 |
| 0.0 | 68.8032 | 8600 | 3.1569 | 3009568 |
| 0.0 | 70.4016 | 8800 | 3.1845 | 3079584 |
| 0.0 | 72.0 | 9000 | 3.2468 | 3149680 |
| 0.0 | 73.6024 | 9200 | 3.2682 | 3219680 |
| 0.0 | 75.2008 | 9400 | 3.3159 | 3289472 |
| 0.0 | 76.8032 | 9600 | 3.3048 | 3359520 |
| 0.0 | 78.4016 | 9800 | 3.3449 | 3429568 |
| 0.0 | 80.0 | 10000 | 3.3382 | 3499648 |
| 0.0 | 81.6024 | 10200 | 3.3461 | 3569504 |
| 0.0 | 83.2008 | 10400 | 3.3662 | 3639920 |
| 0.0 | 84.8032 | 10600 | 3.3628 | 3709520 |
| 0.0 | 86.4016 | 10800 | 3.3339 | 3779456 |
| 0.0 | 88.0 | 11000 | 3.3944 | 3849744 |
| 0.0 | 89.6024 | 11200 | 3.3391 | 3919984 |
| 0.0 | 91.2008 | 11400 | 3.3753 | 3989872 |
| 0.0 | 92.8032 | 11600 | 3.3929 | 4059568 |
| 0.0 | 94.4016 | 11800 | 3.4175 | 4129664 |
| 0.0 | 96.0 | 12000 | 3.4585 | 4199936 |
| 0.0 | 97.6024 | 12200 | 3.4331 | 4269952 |
| 0.0 | 99.2008 | 12400 | 3.4396 | 4339040 |
| 0.0 | 100.8032 | 12600 | 3.5001 | 4409680 |
| 0.0 | 102.4016 | 12800 | 3.4956 | 4479120 |
| 0.0 | 104.0 | 13000 | 3.4545 | 4548896 |
| 0.0 | 105.6024 | 13200 | 3.4611 | 4619216 |
| 0.0 | 107.2008 | 13400 | 3.5092 | 4689424 |
| 0.0 | 108.8032 | 13600 | 3.5048 | 4759232 |
| 0.0 | 110.4016 | 13800 | 3.5447 | 4829120 |
| 0.0 | 112.0 | 14000 | 3.5090 | 4899024 |
| 0.0 | 113.6024 | 14200 | 3.6206 | 4968944 |
| 0.0 | 115.2008 | 14400 | 3.5058 | 5039152 |
| 0.0 | 116.8032 | 14600 | 3.6088 | 5109312 |
| 0.0 | 118.4016 | 14800 | 3.5957 | 5179296 |
| 0.0 | 120.0 | 15000 | 3.6161 | 5249504 |
| 0.0 | 121.6024 | 15200 | 3.6471 | 5319424 |
| 0.0 | 123.2008 | 15400 | 3.6457 | 5389488 |
| 0.0 | 124.8032 | 15600 | 3.6882 | 5459776 |
| 0.0 | 126.4016 | 15800 | 3.8253 | 5529760 |
| 0.0 | 128.0 | 16000 | 3.8008 | 5599968 |
| 0.0 | 129.6024 | 16200 | 3.8438 | 5671056 |
| 0.0 | 131.2008 | 16400 | 3.9185 | 5740000 |
| 0.0 | 132.8032 | 16600 | 3.8750 | 5810288 |
| 0.0 | 134.4016 | 16800 | 3.8577 | 5880176 |
| 0.0 | 136.0 | 17000 | 3.8226 | 5950048 |
| 0.0 | 137.6024 | 17200 | 3.8837 | 6020016 |
| 0.0 | 139.2008 | 17400 | 4.0212 | 6090672 |
| 0.0 | 140.8032 | 17600 | 3.8864 | 6160288 |
| 0.0 | 142.4016 | 17800 | 3.9835 | 6230656 |
| 0.0 | 144.0 | 18000 | 4.0675 | 6299968 |
| 0.0 | 145.6024 | 18200 | 4.0625 | 6370512 |
| 0.0 | 147.2008 | 18400 | 4.0065 | 6440784 |
| 0.0 | 148.8032 | 18600 | 4.0656 | 6510560 |
| 0.0 | 150.4016 | 18800 | 4.0723 | 6579872 |
| 0.0 | 152.0 | 19000 | 4.1437 | 6650112 |
| 0.0 | 153.6024 | 19200 | 4.0051 | 6720368 |
| 0.0 | 155.2008 | 19400 | 4.0639 | 6790512 |
| 0.0 | 156.8032 | 19600 | 4.0434 | 6860880 |
| 0.0 | 158.4016 | 19800 | 4.1406 | 6930576 |
| 0.0 | 160.0 | 20000 | 4.0924 | 7000640 |
| 0.0 | 161.6024 | 20200 | 4.0968 | 7070272 |
| 0.0 | 163.2008 | 20400 | 4.1348 | 7140336 |
| 0.0 | 164.8032 | 20600 | 4.0855 | 7210816 |
| 0.0 | 166.4016 | 20800 | 4.0982 | 7281392 |
| 0.0 | 168.0 | 21000 | 4.0375 | 7350960 |
| 0.0 | 169.6024 | 21200 | 4.0815 | 7421312 |
| 0.0 | 171.2008 | 21400 | 4.0162 | 7491200 |
| 0.0 | 172.8032 | 21600 | 4.0089 | 7560976 |
| 0.0 | 174.4016 | 21800 | 4.0873 | 7631024 |
| 0.0 | 176.0 | 22000 | 4.0684 | 7700784 |
| 0.0 | 177.6024 | 22200 | 4.1046 | 7770752 |
| 0.0 | 179.2008 | 22400 | 4.0771 | 7840832 |
| 0.0 | 180.8032 | 22600 | 4.0975 | 7911072 |
| 0.0 | 182.4016 | 22800 | 4.1018 | 7981312 |
| 0.0 | 184.0 | 23000 | 4.0669 | 8050976 |
| 0.0 | 185.6024 | 23200 | 4.1246 | 8121312 |
| 0.0 | 187.2008 | 23400 | 4.1067 | 8191520 |
| 0.0 | 188.8032 | 23600 | 4.1109 | 8261456 |
| 0.0 | 190.4016 | 23800 | 4.1176 | 8331664 |
| 0.0 | 192.0 | 24000 | 4.1492 | 8401328 |
| 0.0 | 193.6024 | 24200 | 4.1182 | 8471232 |
| 0.0 | 195.2008 | 24400 | 4.1009 | 8540976 |
| 0.0 | 196.8032 | 24600 | 4.1390 | 8611296 |
| 0.0 | 198.4016 | 24800 | 4.1585 | 8681264 |
| 0.0 | 200.0 | 25000 | 4.1405 | 8751280 |
| 0.0 | 201.6024 | 25200 | 4.1875 | 8822192 |
| 0.0 | 203.2008 | 25400 | 4.2158 | 8891648 |
| 0.0 | 204.8032 | 25600 | 4.2012 | 8961760 |
| 0.0 | 206.4016 | 25800 | 4.1339 | 9031568 |
| 0.0 | 208.0 | 26000 | 4.1432 | 9101088 |
| 0.0 | 209.6024 | 26200 | 4.1395 | 9171168 |
| 0.0 | 211.2008 | 26400 | 4.1802 | 9240752 |
| 0.0 | 212.8032 | 26600 | 4.1210 | 9310960 |
| 0.0 | 214.4016 | 26800 | 4.1769 | 9380560 |
| 0.0 | 216.0 | 27000 | 4.1874 | 9450912 |
| 0.0 | 217.6024 | 27200 | 4.1946 | 9520832 |
| 0.0 | 219.2008 | 27400 | 4.1746 | 9590800 |
| 0.0 | 220.8032 | 27600 | 4.2181 | 9661456 |
| 0.0 | 222.4016 | 27800 | 4.2397 | 9731376 |
| 0.0 | 224.0 | 28000 | 4.1595 | 9801040 |
| 0.0 | 225.6024 | 28200 | 4.2398 | 9870784 |
| 0.0 | 227.2008 | 28400 | 4.1429 | 9941408 |
| 0.0 | 228.8032 | 28600 | 4.1808 | 10011264 |
| 0.0 | 230.4016 | 28800 | 4.2379 | 10080704 |
| 0.0 | 232.0 | 29000 | 4.2004 | 10150880 |
| 0.0 | 233.6024 | 29200 | 4.2302 | 10221616 |
| 0.0 | 235.2008 | 29400 | 4.2024 | 10291664 |
| 0.0 | 236.8032 | 29600 | 4.2644 | 10361728 |
| 0.0 | 238.4016 | 29800 | 4.2288 | 10431088 |
| 0.0 | 240.0 | 30000 | 4.2083 | 10501088 |
| 0.0 | 241.6024 | 30200 | 4.2136 | 10571488 |
| 0.0 | 243.2008 | 30400 | 4.2404 | 10640848 |
| 0.0 | 244.8032 | 30600 | 4.2041 | 10711136 |
| 0.0 | 246.4016 | 30800 | 4.1744 | 10781136 |
| 0.0 | 248.0 | 31000 | 4.2338 | 10851312 |
| 0.0 | 249.6024 | 31200 | 4.1842 | 10921664 |
| 0.0 | 251.2008 | 31400 | 4.2001 | 10991936 |
| 0.0 | 252.8032 | 31600 | 4.1989 | 11061680 |
| 0.0 | 254.4016 | 31800 | 4.2115 | 11131872 |
| 0.0 | 256.0 | 32000 | 4.2529 | 11201520 |
| 0.0 | 257.6024 | 32200 | 4.1891 | 11271952 |
| 0.0 | 259.2008 | 32400 | 4.2128 | 11340976 |
| 0.0 | 260.8032 | 32600 | 4.2195 | 11411056 |
| 0.0 | 262.4016 | 32800 | 4.2266 | 11481152 |
| 0.0 | 264.0 | 33000 | 4.2614 | 11550752 |
| 0.0 | 265.6024 | 33200 | 4.2402 | 11620752 |
| 0.0 | 267.2008 | 33400 | 4.2317 | 11690464 |
| 0.0 | 268.8032 | 33600 | 4.2460 | 11761360 |
| 0.0 | 270.4016 | 33800 | 4.2559 | 11831152 |
| 0.0 | 272.0 | 34000 | 4.2626 | 11900768 |
| 0.0 | 273.6024 | 34200 | 4.2211 | 11971616 |
| 0.0 | 275.2008 | 34400 | 4.2521 | 12041104 |
| 0.0 | 276.8032 | 34600 | 4.2759 | 12111712 |
| 0.0 | 278.4016 | 34800 | 4.2666 | 12181328 |
| 0.0 | 280.0 | 35000 | 4.2126 | 12251088 |
| 0.0 | 281.6024 | 35200 | 4.2194 | 12321616 |
| 0.0 | 283.2008 | 35400 | 4.2496 | 12391184 |
| 0.0 | 284.8032 | 35600 | 4.1983 | 12461088 |
| 0.0 | 286.4016 | 35800 | 4.2645 | 12531520 |
| 0.0 | 288.0 | 36000 | 4.2393 | 12600944 |
| 0.0 | 289.6024 | 36200 | 4.1907 | 12670544 |
| 0.0 | 291.2008 | 36400 | 4.2683 | 12741216 |
| 0.0 | 292.8032 | 36600 | 4.2292 | 12811584 |
| 0.0 | 294.4016 | 36800 | 4.2504 | 12881104 |
| 0.0 | 296.0 | 37000 | 4.1981 | 12951648 |
| 0.0 | 297.6024 | 37200 | 4.2314 | 13021600 |
| 0.0 | 299.2008 | 37400 | 4.2492 | 13091888 |
| 0.0 | 300.8032 | 37600 | 4.2149 | 13162128 |
| 0.0 | 302.4016 | 37800 | 4.2522 | 13231552 |
| 0.0 | 304.0 | 38000 | 4.2195 | 13302080 |
| 0.0 | 305.6024 | 38200 | 4.2168 | 13371808 |
| 0.0 | 307.2008 | 38400 | 4.2201 | 13441936 |
| 0.0 | 308.8032 | 38600 | 4.2468 | 13512304 |
| 0.0 | 310.4016 | 38800 | 4.2359 | 13582192 |
| 0.0 | 312.0 | 39000 | 4.2536 | 13652384 |
| 0.0 | 313.6024 | 39200 | 4.2690 | 13722224 |
| 0.0 | 315.2008 | 39400 | 4.2821 | 13791728 |
| 0.0 | 316.8032 | 39600 | 4.2679 | 13862560 |
| 0.0 | 318.4016 | 39800 | 4.2649 | 13933264 |
| 0.0 | 320.0 | 40000 | 4.2347 | 14002704 |

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
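
A quick way to compare a local environment against these versions (a simple check, not part of the original card):

```python
# Print installed versions of the libraries listed above.
import datasets, peft, tokenizers, torch, transformers

for mod in (peft, transformers, torch, datasets, tokenizers):
    print(f"{mod.__name__}: {mod.__version__}")
```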