train_copa_1745950325

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the COPA dataset (a loading sketch follows the results below). It achieves the following results on the evaluation set:

  • Loss: 0.0824
  • Num Input Tokens Seen: 10717440
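
The PEFT framework version listed below suggests this checkpoint is a PEFT adapter on top of the base model rather than full model weights. A minimal loading sketch, assuming the adapter is published as rbelanec/train_copa_1745950325 and that bfloat16 inference is acceptable (neither is stated in the card):

```python
# Minimal loading sketch; dtype and device placement are assumptions,
# not documented in this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "rbelanec/train_copa_1745950325")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
```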

Model description

More information needed

Intended uses & limitations

More information needed
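
In the absence of documented usage, here is a hypothetical inference sketch reusing `model` and `tokenizer` from the loading sketch above. The prompt below is the canonical COPA example; the actual prompt format used during fine-tuning is not documented, so treat this as an illustrative guess:

```python
# Hypothetical COPA-style prompt; the verbalizer/template used in training
# is not documented in this card.
messages = [{
    "role": "user",
    "content": (
        "Premise: The man broke his toe. What was the CAUSE of this?\n"
        "Choice 1: He got a hole in his sock.\n"
        "Choice 2: He dropped a hammer on his foot.\n"
        "Answer with the more plausible choice."
    ),
}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(input_ids=inputs, max_new_tokens=16)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```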

Training and evaluation data

The card names only the copa dataset; this is presumably COPA (Choice of Plausible Alternatives), the SuperGLUE commonsense causal-reasoning task in which a premise is paired with two candidate causes or effects. The split and preprocessing used here are not documented; a loading sketch follows.
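
Assuming the standard super_glue/copa configuration on the Hub (an assumption; the card does not specify the source), the data can be loaded as follows:

```python
# COPA as distributed with SuperGLUE; each example has a premise, two
# choices, a question type ("cause" or "effect"), and a label (0 or 1).
# The dataset id is an assumption; the card only says "copa".
from datasets import load_dataset

copa = load_dataset("super_glue", "copa")
print(copa["train"][0])
```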

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • training_steps: 40000
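
For reproducibility, the list above maps onto transformers TrainingArguments roughly as follows. This is a sketch: output_dir is a placeholder, the evaluation cadence is inferred from the results table, and anything not listed in the card is left at its default.

```python
# Sketch of the listed hyperparameters as transformers TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_copa_1745950325",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    gradient_accumulation_steps=2,  # effective train batch size: 2 * 2 = 4
    optim="adamw_torch",            # betas=(0.9, 0.999) and eps=1e-08 are the defaults
    lr_scheduler_type="cosine",
    max_steps=40_000,
    eval_strategy="steps",
    eval_steps=200,                 # inferred from the results table below
)
```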

Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.1337 2.2222 200 0.1672 53616
0.0975 4.4444 400 0.1372 107088
0.0866 6.6667 600 0.1211 160704
0.0855 8.8889 800 0.1080 214352
0.1471 11.1111 1000 0.1004 267952
0.0901 13.3333 1200 0.0935 321488
0.0327 15.5556 1400 0.0905 374992
0.0138 17.7778 1600 0.0858 428624
0.0428 20.0 1800 0.0844 482064
0.0255 22.2222 2000 0.0824 535648
0.0165 24.4444 2200 0.0854 589072
0.0113 26.6667 2400 0.0851 642784
0.0044 28.8889 2600 0.0911 696288
0.0054 31.1111 2800 0.0979 749968
0.0015 33.3333 3000 0.1028 803504
0.0061 35.5556 3200 0.1121 857200
0.0069 37.7778 3400 0.1169 910768
0.0001 40.0 3600 0.1307 964400
0.0016 42.2222 3800 0.1314 1017840
0.0001 44.4444 4000 0.1435 1071552
0.0004 46.6667 4200 0.1421 1125296
0.0003 48.8889 4400 0.1445 1178960
0.0002 51.1111 4600 0.1505 1232640
0.0 53.3333 4800 0.1503 1286048
0.0001 55.5556 5000 0.1514 1339712
0.0 57.7778 5200 0.1601 1393248
0.0001 60.0 5400 0.1729 1446832
0.0 62.2222 5600 0.1631 1500496
0.0001 64.4444 5800 0.1721 1554112
0.0 66.6667 6000 0.1722 1607856
0.0 68.8889 6200 0.1667 1661408
0.0 71.1111 6400 0.1704 1714960
0.0 73.3333 6600 0.1807 1768352
0.0 75.5556 6800 0.1890 1821936
0.0 77.7778 7000 0.1759 1875424
0.0 80.0 7200 0.1917 1929008
0.0 82.2222 7400 0.1907 1982720
0.0 84.4444 7600 0.1963 2036336
0.0 86.6667 7800 0.1953 2089872
0.0 88.8889 8000 0.2050 2143520
0.0 91.1111 8200 0.1875 2197072
0.0 93.3333 8400 0.2040 2250672
0.0 95.5556 8600 0.1922 2304256
0.0 97.7778 8800 0.2129 2357840
0.0 100.0 9000 0.2154 2411392
0.0 102.2222 9200 0.2191 2464928
0.0 104.4444 9400 0.2123 2518544
0.0 106.6667 9600 0.2196 2572032
0.0 108.8889 9800 0.2102 2625568
0.0 111.1111 10000 0.2195 2679136
0.0 113.3333 10200 0.2241 2732608
0.0 115.5556 10400 0.2215 2786240
0.0 117.7778 10600 0.2178 2839920
0.0 120.0 10800 0.2362 2893488
0.0 122.2222 11000 0.2346 2947104
0.0 124.4444 11200 0.2243 3000560
0.0 126.6667 11400 0.2243 3054176
0.0 128.8889 11600 0.2318 3107744
0.0 131.1111 11800 0.2312 3161488
0.0 133.3333 12000 0.2331 3215088
0.0 135.5556 12200 0.2364 3268640
0.0 137.7778 12400 0.2402 3322144
0.0 140.0 12600 0.2436 3375792
0.0 142.2222 12800 0.2556 3429312
0.0 144.4444 13000 0.2603 3482800
0.0 146.6667 13200 0.2580 3536544
0.0 148.8889 13400 0.2616 3590208
0.0 151.1111 13600 0.2471 3643872
0.0 153.3333 13800 0.2646 3697456
0.0 155.5556 14000 0.2594 3751008
0.0 157.7778 14200 0.2656 3804608
0.0 160.0 14400 0.2697 3858240
0.0 162.2222 14600 0.2536 3911808
0.0 164.4444 14800 0.2809 3965376
0.0 166.6667 15000 0.2686 4018880
0.0 168.8889 15200 0.2652 4072432
0.0 171.1111 15400 0.2478 4125888
0.0 173.3333 15600 0.2732 4179552
0.0 175.5556 15800 0.2766 4233072
0.0 177.7778 16000 0.2752 4286672
0.0 180.0 16200 0.2860 4340240
0.0 182.2222 16400 0.2637 4393824
0.0 184.4444 16600 0.2694 4447408
0.0 186.6667 16800 0.2886 4500864
0.0 188.8889 17000 0.2796 4554512
0.0 191.1111 17200 0.2903 4608128
0.0 193.3333 17400 0.2787 4661856
0.0 195.5556 17600 0.2786 4715392
0.0 197.7778 17800 0.2808 4768912
0.0 200.0 18000 0.2824 4822464
0.0 202.2222 18200 0.2906 4876096
0.0 204.4444 18400 0.2834 4929776
0.0 206.6667 18600 0.2819 4983440
0.0 208.8889 18800 0.2900 5036880
0.0 211.1111 19000 0.2909 5090400
0.0 213.3333 19200 0.2962 5144016
0.0 215.5556 19400 0.2868 5197664
0.0 217.7778 19600 0.3036 5251232
0.0 220.0 19800 0.3029 5304880
0.0 222.2222 20000 0.2858 5358528
0.0 224.4444 20200 0.3009 5412064
0.0 226.6667 20400 0.3049 5465696
0.0 228.8889 20600 0.3086 5519328
0.0 231.1111 20800 0.3139 5572928
0.0 233.3333 21000 0.3247 5626480
0.0 235.5556 21200 0.3193 5680080
0.0 237.7778 21400 0.3144 5733584
0.0 240.0 21600 0.3176 5787248
0.0 242.2222 21800 0.3127 5840896
0.0 244.4444 22000 0.3292 5894480
0.0 246.6667 22200 0.3189 5948128
0.0 248.8889 22400 0.3260 6001664
0.0 251.1111 22600 0.3143 6055168
0.0 253.3333 22800 0.3331 6108640
0.0 255.5556 23000 0.3314 6162224
0.0 257.7778 23200 0.3060 6215760
0.0 260.0 23400 0.3246 6269472
0.0 262.2222 23600 0.3205 6323056
0.0 264.4444 23800 0.3191 6376544
0.0 266.6667 24000 0.3075 6430112
0.0 268.8889 24200 0.3452 6483760
0.0 271.1111 24400 0.3326 6537312
0.0 273.3333 24600 0.3257 6590736
0.0 275.5556 24800 0.3345 6644544
0.0 277.7778 25000 0.3235 6697952
0.0 280.0 25200 0.3314 6751696
0.0 282.2222 25400 0.3287 6805232
0.0 284.4444 25600 0.3304 6858992
0.0 286.6667 25800 0.3015 6912336
0.0 288.8889 26000 0.3161 6966000
0.0 291.1111 26200 0.3290 7019648
0.0 293.3333 26400 0.3013 7073328
0.0 295.5556 26600 0.3308 7126848
0.0 297.7778 26800 0.3054 7180368
0.0 300.0 27000 0.3248 7233952
0.0 302.2222 27200 0.3389 7287584
0.0 304.4444 27400 0.3211 7341280
0.0 306.6667 27600 0.3116 7394736
0.0 308.8889 27800 0.2985 7448256
0.0 311.1111 28000 0.3244 7501952
0.0 313.3333 28200 0.3313 7555536
0.0 315.5556 28400 0.3346 7608976
0.0 317.7778 28600 0.3129 7662624
0.0 320.0 28800 0.3398 7716176
0.0 322.2222 29000 0.3377 7769696
0.0 324.4444 29200 0.3275 7823248
0.0 326.6667 29400 0.3356 7876800
0.0 328.8889 29600 0.3324 7930352
0.0 331.1111 29800 0.3293 7984000
0.0 333.3333 30000 0.3017 8037664
0.0 335.5556 30200 0.3117 8091056
0.0 337.7778 30400 0.3345 8144624
0.0 340.0 30600 0.3273 8198256
0.0 342.2222 30800 0.3251 8251856
0.0 344.4444 31000 0.3138 8305456
0.0 346.6667 31200 0.3180 8359104
0.0 348.8889 31400 0.3191 8412784
0.0 351.1111 31600 0.2937 8466240
0.0 353.3333 31800 0.3253 8520000
0.0 355.5556 32000 0.3078 8573472
0.0 357.7778 32200 0.3109 8627184
0.0 360.0 32400 0.3303 8680880
0.0 362.2222 32600 0.3220 8734512
0.0 364.4444 32800 0.3162 8788064
0.0 366.6667 33000 0.3011 8841744
0.0 368.8889 33200 0.3381 8895200
0.0 371.1111 33400 0.3190 8948880
0.0 373.3333 33600 0.3231 9002400
0.0 375.5556 33800 0.3396 9056032
0.0 377.7778 34000 0.3361 9109600
0.0 380.0 34200 0.3345 9163168
0.0 382.2222 34400 0.3211 9216832
0.0 384.4444 34600 0.3231 9270352
0.0 386.6667 34800 0.3059 9324080
0.0 388.8889 35000 0.3365 9377712
0.0 391.1111 35200 0.3063 9431360
0.0 393.3333 35400 0.3130 9484880
0.0 395.5556 35600 0.3314 9538464
0.0 397.7778 35800 0.3232 9592208
0.0 400.0 36000 0.3262 9645776
0.0 402.2222 36200 0.3091 9699488
0.0 404.4444 36400 0.3318 9753088
0.0 406.6667 36600 0.3262 9806544
0.0 408.8889 36800 0.3052 9859984
0.0 411.1111 37000 0.2946 9913568
0.0 413.3333 37200 0.3138 9967168
0.0 415.5556 37400 0.3066 10020864
0.0 417.7778 37600 0.3149 10074384
0.0 420.0 37800 0.3040 10127968
0.0 422.2222 38000 0.3279 10181584
0.0 424.4444 38200 0.2994 10235168
0.0 426.6667 38400 0.2891 10288720
0.0 428.8889 38600 0.3334 10342320
0.0 431.1111 38800 0.3324 10395824
0.0 433.3333 39000 0.3376 10449408
0.0 435.5556 39200 0.3396 10503040
0.0 437.7778 39400 0.3407 10556640
0.0 440.0 39600 0.3407 10610256
0.0 442.2222 39800 0.3407 10663840
0.0 444.4444 40000 0.3407 10717440
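
The reported evaluation loss of 0.0824 corresponds to the step-2000 checkpoint: validation loss bottoms out there, then drifts upward for the remaining 38,000 steps while training loss collapses to zero. This suggests the best checkpoint by validation loss was kept, and that the model overfits long before the full 40,000 training steps.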

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1