train_copa_1745950327

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the copa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0725
  • Num Input Tokens Seen: 10717440

Model description

This model is a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct, fine-tuned on the copa (Choice of Plausible Alternatives) causal-reasoning dataset. No further description was provided.

Intended uses & limitations

More information needed. A minimal usage sketch is shown below.
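Since the card does not document intended usage, here is a minimal sketch of how a PEFT adapter is typically loaded on top of its base model and queried with a COPA-style prompt. The adapter repo id and base model name come from this card; the dtype, device placement, prompt format, and generation settings are illustrative assumptions, not the settings used during training.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_copa_1745950327"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,  # assumption; requires a GPU with bf16 support
    device_map="auto",           # requires the accelerate package
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the adapter weights

# COPA asks which of two alternatives is the more plausible cause/effect of a
# premise. The exact prompt template used in training is not documented, so
# this format is a guess.
prompt = (
    "Premise: The man broke his toe. What was the cause?\n"
    "Choice 1: He got a hole in his sock.\n"
    "Choice 2: He dropped a hammer on his foot.\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```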

Training and evaluation data

The model was trained and evaluated on the copa dataset; details about splits and preprocessing were not provided.

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • training_steps: 40000
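
For readers who want to reproduce the setup, here is a sketch of the list above expressed as Transformers TrainingArguments. The output directory, eval_strategy, and the evaluation/logging cadence (every 200 steps, as the results table below shows) are inferred or assumed; the actual training script is not included in this card.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_copa_1745950327",  # assumed; not stated in the card
    learning_rate=5e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    gradient_accumulation_steps=2,  # batch 2 x accumulation 2 = total batch 4
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    max_steps=40_000,
    eval_strategy="steps",  # assumed from the 200-step eval cadence below
    eval_steps=200,
    logging_steps=200,
)
```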

Training results

The evaluation loss reported above (0.0725) matches the step-200 checkpoint, the minimum in the table below; validation loss rises steadily from there to 0.4530 at the final step 40000, indicating that the adapter overfits long before the 40000-step budget is exhausted.

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.0003 2.2222 200 0.0725 53616
0.0 4.4444 400 0.1013 107088
0.0 6.6667 600 0.1114 160704
0.0 8.8889 800 0.1111 214352
0.0 11.1111 1000 0.1201 267952
0.0 13.3333 1200 0.1251 321488
0.0 15.5556 1400 0.1288 374992
0.0 17.7778 1600 0.1368 428624
0.0 20.0 1800 0.1357 482064
0.0 22.2222 2000 0.1427 535648
0.0 24.4444 2200 0.1443 589072
0.0 26.6667 2400 0.1494 642784
0.0 28.8889 2600 0.1529 696288
0.0 31.1111 2800 0.1540 749968
0.0 33.3333 3000 0.1550 803504
0.0 35.5556 3200 0.1598 857200
0.0 37.7778 3400 0.1638 910768
0.0 40.0 3600 0.1680 964400
0.0 42.2222 3800 0.1679 1017840
0.0 44.4444 4000 0.1694 1071552
0.0 46.6667 4200 0.1725 1125296
0.0 48.8889 4400 0.1767 1178960
0.0 51.1111 4600 0.1824 1232640
0.0 53.3333 4800 0.1815 1286048
0.0 55.5556 5000 0.1808 1339712
0.0 57.7778 5200 0.1815 1393248
0.0 60.0 5400 0.1883 1446832
0.0 62.2222 5600 0.1876 1500496
0.0 64.4444 5800 0.1875 1554112
0.0 66.6667 6000 0.1904 1607856
0.0 68.8889 6200 0.1959 1661408
0.0 71.1111 6400 0.1966 1714960
0.0 73.3333 6600 0.1956 1768352
0.0 75.5556 6800 0.1940 1821936
0.0 77.7778 7000 0.1989 1875424
0.0 80.0 7200 0.2019 1929008
0.0 82.2222 7400 0.2060 1982720
0.0 84.4444 7600 0.2131 2036336
0.0 86.6667 7800 0.2144 2089872
0.0 88.8889 8000 0.2146 2143520
0.0 91.1111 8200 0.2237 2197072
0.0 93.3333 8400 0.2267 2250672
0.0 95.5556 8600 0.2326 2304256
0.0 97.7778 8800 0.2336 2357840
0.0 100.0 9000 0.2301 2411392
0.0 102.2222 9200 0.2472 2464928
0.0 104.4444 9400 0.2497 2518544
0.0 106.6667 9600 0.2468 2572032
0.0 108.8889 9800 0.2477 2625568
0.0 111.1111 10000 0.2583 2679136
0.0 113.3333 10200 0.2625 2732608
0.0 115.5556 10400 0.2680 2786240
0.0 117.7778 10600 0.2693 2839920
0.0 120.0 10800 0.2724 2893488
0.0 122.2222 11000 0.2810 2947104
0.0 124.4444 11200 0.2809 3000560
0.0 126.6667 11400 0.2889 3054176
0.0 128.8889 11600 0.2932 3107744
0.0 131.1111 11800 0.2940 3161488
0.0 133.3333 12000 0.2971 3215088
0.0 135.5556 12200 0.3044 3268640
0.0 137.7778 12400 0.3089 3322144
0.0 140.0 12600 0.3134 3375792
0.0 142.2222 12800 0.3163 3429312
0.0 144.4444 13000 0.3199 3482800
0.0 146.6667 13200 0.3252 3536544
0.0 148.8889 13400 0.3294 3590208
0.0 151.1111 13600 0.3287 3643872
0.0 153.3333 13800 0.3371 3697456
0.0 155.5556 14000 0.3392 3751008
0.0 157.7778 14200 0.3396 3804608
0.0 160.0 14400 0.3430 3858240
0.0 162.2222 14600 0.3402 3911808
0.0 164.4444 14800 0.3435 3965376
0.0 166.6667 15000 0.3434 4018880
0.0 168.8889 15200 0.3440 4072432
0.0 171.1111 15400 0.3486 4125888
0.0 173.3333 15600 0.3475 4179552
0.0 175.5556 15800 0.3444 4233072
0.0 177.7778 16000 0.3511 4286672
0.0 180.0 16200 0.3479 4340240
0.0 182.2222 16400 0.3473 4393824
0.0 184.4444 16600 0.3475 4447408
0.0 186.6667 16800 0.3455 4500864
0.0 188.8889 17000 0.3489 4554512
0.0 191.1111 17200 0.3489 4608128
0.0 193.3333 17400 0.3447 4661856
0.0 195.5556 17600 0.3482 4715392
0.0 197.7778 17800 0.3478 4768912
0.0 200.0 18000 0.3459 4822464
0.0 202.2222 18200 0.3525 4876096
0.0 204.4444 18400 0.3472 4929776
0.0 206.6667 18600 0.3417 4983440
0.0 208.8889 18800 0.3493 5036880
0.0 211.1111 19000 0.3545 5090400
0.0 213.3333 19200 0.3598 5144016
0.0 215.5556 19400 0.3532 5197664
0.0 217.7778 19600 0.3744 5251232
0.0 220.0 19800 0.3673 5304880
0.0 222.2222 20000 0.3662 5358528
0.0 224.4444 20200 0.3707 5412064
0.0 226.6667 20400 0.3696 5465696
0.0 228.8889 20600 0.3809 5519328
0.0 231.1111 20800 0.3793 5572928
0.0 233.3333 21000 0.4159 5626480
0.0 235.5556 21200 0.3938 5680080
0.0 237.7778 21400 0.4079 5733584
0.0 240.0 21600 0.4172 5787248
0.0 242.2222 21800 0.3948 5840896
0.0 244.4444 22000 0.4262 5894480
0.0 246.6667 22200 0.4301 5948128
0.0 248.8889 22400 0.4266 6001664
0.0 251.1111 22600 0.4346 6055168
0.0 253.3333 22800 0.4392 6108640
0.0 255.5556 23000 0.4337 6162224
0.0 257.7778 23200 0.4418 6215760
0.0 260.0 23400 0.4470 6269472
0.0 262.2222 23600 0.4359 6323056
0.0 264.4444 23800 0.4404 6376544
0.0 266.6667 24000 0.4491 6430112
0.0 268.8889 24200 0.4434 6483760
0.0 271.1111 24400 0.4358 6537312
0.0 273.3333 24600 0.4445 6590736
0.0 275.5556 24800 0.4535 6644544
0.0 277.7778 25000 0.4534 6697952
0.0 280.0 25200 0.4438 6751696
0.0 282.2222 25400 0.4512 6805232
0.0 284.4444 25600 0.4250 6858992
0.0 286.6667 25800 0.4453 6912336
0.0 288.8889 26000 0.4419 6966000
0.0 291.1111 26200 0.4499 7019648
0.0 293.3333 26400 0.4486 7073328
0.0 295.5556 26600 0.4487 7126848
0.0 297.7778 26800 0.4207 7180368
0.0 300.0 27000 0.4536 7233952
0.0 302.2222 27200 0.4467 7287584
0.0 304.4444 27400 0.4454 7341280
0.0 306.6667 27600 0.4456 7394736
0.0 308.8889 27800 0.4563 7448256
0.0 311.1111 28000 0.4533 7501952
0.0 313.3333 28200 0.4547 7555536
0.0 315.5556 28400 0.4560 7608976
0.0 317.7778 28600 0.4504 7662624
0.0 320.0 28800 0.4488 7716176
0.0 322.2222 29000 0.4493 7769696
0.0 324.4444 29200 0.4496 7823248
0.0 326.6667 29400 0.4436 7876800
0.0 328.8889 29600 0.4618 7930352
0.0 331.1111 29800 0.4549 7984000
0.0 333.3333 30000 0.4654 8037664
0.0 335.5556 30200 0.4624 8091056
0.0 337.7778 30400 0.4536 8144624
0.0 340.0 30600 0.4483 8198256
0.0 342.2222 30800 0.4549 8251856
0.0 344.4444 31000 0.4589 8305456
0.0 346.6667 31200 0.4514 8359104
0.0 348.8889 31400 0.4638 8412784
0.0 351.1111 31600 0.4556 8466240
0.0 353.3333 31800 0.4556 8520000
0.0 355.5556 32000 0.4486 8573472
0.0 357.7778 32200 0.4440 8627184
0.0 360.0 32400 0.4510 8680880
0.0 362.2222 32600 0.4462 8734512
0.0 364.4444 32800 0.4542 8788064
0.0 366.6667 33000 0.4525 8841744
0.0 368.8889 33200 0.4509 8895200
0.0 371.1111 33400 0.4594 8948880
0.0 373.3333 33600 0.4626 9002400
0.0 375.5556 33800 0.4579 9056032
0.0 377.7778 34000 0.4491 9109600
0.0 380.0 34200 0.4419 9163168
0.0 382.2222 34400 0.4531 9216832
0.0 384.4444 34600 0.4478 9270352
0.0 386.6667 34800 0.4401 9324080
0.0 388.8889 35000 0.4501 9377712
0.0 391.1111 35200 0.4469 9431360
0.0 393.3333 35400 0.4492 9484880
0.0 395.5556 35600 0.4465 9538464
0.0 397.7778 35800 0.4539 9592208
0.0 400.0 36000 0.4516 9645776
0.0 402.2222 36200 0.4517 9699488
0.0 404.4444 36400 0.4503 9753088
0.0 406.6667 36600 0.4596 9806544
0.0 408.8889 36800 0.4507 9859984
0.0 411.1111 37000 0.4464 9913568
0.0 413.3333 37200 0.4531 9967168
0.0 415.5556 37400 0.4477 10020864
0.0 417.7778 37600 0.4555 10074384
0.0 420.0 37800 0.4573 10127968
0.0 422.2222 38000 0.4566 10181584
0.0 424.4444 38200 0.4376 10235168
0.0 426.6667 38400 0.4496 10288720
0.0 428.8889 38600 0.4555 10342320
0.0 431.1111 38800 0.4566 10395824
0.0 433.3333 39000 0.4421 10449408
0.0 435.5556 39200 0.4479 10503040
0.0 437.7778 39400 0.4565 10556640
0.0 440.0 39600 0.4516 10610256
0.0 442.2222 39800 0.4621 10663840
0.0 444.4444 40000 0.4530 10717440

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1