train_record_1745950250

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the record dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4438
  • Num Input Tokens Seen: 54198768
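This is a PEFT adapter (see Framework versions below) rather than a full model checkpoint. A minimal loading sketch follows, assuming the adapter is hosted as rbelanec/train_record_1745950250 and that you have access to the gated meta-llama base model:

```python
# Hedged loading sketch: repo ids and the example prompt are assumptions,
# not part of this card's training script.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_record_1745950250"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
# Attach the fine-tuned adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base_model, adapter_id)

inputs = tokenizer("Example prompt:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```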

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
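
The original training script is not included in this card; the following is only a sketch of how the hyperparameters above might map onto transformers TrainingArguments (output_dir is a placeholder):

```python
# Hedged reconstruction of the listed hyperparameters; the surrounding
# trainer setup, dataset loading, and PEFT config are not shown here.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_record_1745950250",  # placeholder, not from the card
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    gradient_accumulation_steps=2,  # yields a total train batch size of 4
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    max_steps=40000,
)
```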

Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
1.0376 0.0064 200 1.5287 272992
1.1312 0.0128 400 1.1516 541536
1.0905 0.0192 600 0.9675 813648
0.6506 0.0256 800 0.8528 1084496
0.5621 0.0320 1000 0.7838 1355472
0.7799 0.0384 1200 0.7391 1624048
0.7229 0.0448 1400 0.7115 1893968
0.6776 0.0512 1600 0.6931 2163024
0.5712 0.0576 1800 0.6745 2436032
0.6222 0.0640 2000 0.6617 2706960
1.1155 0.0704 2200 0.6501 2976144
0.6136 0.0768 2400 0.6399 3248384
0.6635 0.0832 2600 0.6310 3519088
0.6066 0.0896 2800 0.6240 3790208
0.5585 0.0960 3000 0.6168 4059472
0.5138 0.1024 3200 0.6087 4331088
0.539 0.1088 3400 0.6047 4601728
0.4948 0.1152 3600 0.5978 4877104
0.5902 0.1216 3800 0.5923 5150656
0.512 0.1280 4000 0.5888 5422944
0.6774 0.1344 4200 0.5858 5692368
0.6641 0.1408 4400 0.5806 5965440
0.3172 0.1472 4600 0.5787 6237632
0.4636 0.1536 4800 0.5753 6506256
0.6575 0.1600 5000 0.5704 6779376
0.5575 0.1664 5200 0.5675 7051504
0.5267 0.1728 5400 0.5632 7321552
0.5367 0.1792 5600 0.5594 7592304
0.5104 0.1856 5800 0.5570 7865632
0.4609 0.1920 6000 0.5556 8135936
0.601 0.1985 6200 0.5515 8408624
0.5025 0.2049 6400 0.5481 8677888
0.5747 0.2113 6600 0.5448 8947120
0.4186 0.2177 6800 0.5422 9216336
0.4501 0.2241 7000 0.5399 9485568
0.4739 0.2305 7200 0.5383 9758160
0.6005 0.2369 7400 0.5366 10028256
0.5946 0.2433 7600 0.5359 10300544
0.4071 0.2497 7800 0.5331 10574192
0.7391 0.2561 8000 0.5299 10844928
0.4257 0.2625 8200 0.5278 11114800
0.5219 0.2689 8400 0.5254 11383280
0.6746 0.2753 8600 0.5235 11652336
0.5384 0.2817 8800 0.5215 11924224
0.4862 0.2881 9000 0.5199 12194800
0.4256 0.2945 9200 0.5187 12466288
0.431 0.3009 9400 0.5154 12735104
0.4597 0.3073 9600 0.5135 13003216
0.4979 0.3137 9800 0.5123 13273680
0.6121 0.3201 10000 0.5108 13545840
0.5801 0.3265 10200 0.5098 13817104
0.4489 0.3329 10400 0.5083 14088032
0.5318 0.3393 10600 0.5066 14361280
0.4673 0.3457 10800 0.5040 14631040
0.443 0.3521 11000 0.5028 14901648
0.4586 0.3585 11200 0.5015 15170800
0.6426 0.3649 11400 0.5003 15440592
0.5997 0.3713 11600 0.4979 15710608
0.3011 0.3777 11800 0.4967 15980176
0.676 0.3841 12000 0.4954 16249072
0.3801 0.3905 12200 0.4941 16522704
0.7394 0.3969 12400 0.4925 16794064
0.4433 0.4033 12600 0.4916 17062288
0.5369 0.4097 12800 0.4899 17331072
0.3812 0.4161 13000 0.4896 17599616
0.2993 0.4225 13200 0.4891 17869424
0.4166 0.4289 13400 0.4874 18141136
0.494 0.4353 13600 0.4855 18414272
0.4413 0.4417 13800 0.4849 18685264
0.4482 0.4481 14000 0.4831 18957072
0.3391 0.4545 14200 0.4814 19230480
0.3414 0.4609 14400 0.4816 19503472
0.5706 0.4673 14600 0.4807 19777344
0.3137 0.4737 14800 0.4797 20049328
0.4268 0.4801 15000 0.4798 20319488
0.5074 0.4865 15200 0.4789 20589760
0.5278 0.4929 15400 0.4764 20860624
0.4903 0.4993 15600 0.4754 21133104
0.5416 0.5057 15800 0.4756 21403072
0.3926 0.5121 16000 0.4761 21675712
0.4037 0.5185 16200 0.4750 21946528
0.4913 0.5249 16400 0.4724 22217936
0.4442 0.5313 16600 0.4722 22489168
0.3534 0.5377 16800 0.4720 22759200
0.4472 0.5441 17000 0.4709 23028128
0.3981 0.5505 17200 0.4705 23300528
0.4462 0.5569 17400 0.4684 23569728
0.4062 0.5633 17600 0.4680 23838464
0.3186 0.5697 17800 0.4668 24109808
0.4891 0.5761 18000 0.4663 24380336
0.458 0.5825 18200 0.4661 24653072
0.4189 0.5890 18400 0.4657 24924912
0.4599 0.5954 18600 0.4661 25196400
0.3823 0.6018 18800 0.4650 25468816
0.3631 0.6082 19000 0.4641 25741776
0.6834 0.6146 19200 0.4637 26017088
0.3855 0.6210 19400 0.4627 26286480
0.4292 0.6274 19600 0.4638 26557200
0.437 0.6338 19800 0.4627 26827696
0.3012 0.6402 20000 0.4618 27098112
0.3044 0.6466 20200 0.4609 27369984
0.5599 0.6530 20400 0.4598 27640768
0.3936 0.6594 20600 0.4592 27910480
0.4015 0.6658 20800 0.4587 28180240
0.5022 0.6722 21000 0.4579 28451984
0.3381 0.6786 21200 0.4577 28723904
0.6385 0.6850 21400 0.4576 28994096
0.5204 0.6914 21600 0.4570 29267904
0.3454 0.6978 21800 0.4570 29540768
0.4744 0.7042 22000 0.4565 29812480
0.3103 0.7106 22200 0.4558 30080624
0.5805 0.7170 22400 0.4556 30352256
0.4824 0.7234 22600 0.4552 30622032
0.3745 0.7298 22800 0.4549 30894016
0.5018 0.7362 23000 0.4545 31162736
0.4904 0.7426 23200 0.4541 31433344
0.5793 0.7490 23400 0.4533 31708288
0.5206 0.7554 23600 0.4534 31982128
0.4382 0.7618 23800 0.4533 32253040
0.464 0.7682 24000 0.4531 32524464
0.4827 0.7746 24200 0.4527 32794928
0.5373 0.7810 24400 0.4521 33067904
0.3557 0.7874 24600 0.4519 33336480
0.4961 0.7938 24800 0.4517 33606096
0.6283 0.8002 25000 0.4515 33878720
0.3892 0.8066 25200 0.4513 34148496
0.4803 0.8130 25400 0.4518 34421392
0.4706 0.8194 25600 0.4511 34692880
0.421 0.8258 25800 0.4504 34964656
0.5967 0.8322 26000 0.4504 35234256
0.4502 0.8386 26200 0.4503 35504864
0.5948 0.8450 26400 0.4500 35777296
0.3845 0.8514 26600 0.4494 36045376
0.5572 0.8578 26800 0.4491 36315872
0.5925 0.8642 27000 0.4487 36590336
0.3107 0.8706 27200 0.4486 36858080
0.441 0.8770 27400 0.4483 37125216
0.5021 0.8834 27600 0.4479 37397648
0.4081 0.8898 27800 0.4479 37667456
0.4245 0.8962 28000 0.4474 37935760
0.4141 0.9026 28200 0.4473 38204832
0.4092 0.9090 28400 0.4469 38475552
0.3323 0.9154 28600 0.4469 38746560
0.5035 0.9218 28800 0.4468 39016288
0.3608 0.9282 29000 0.4470 39287360
0.4684 0.9346 29200 0.4466 39557440
0.2902 0.9410 29400 0.4463 39830256
0.4933 0.9474 29600 0.4463 40102464
0.2688 0.9538 29800 0.4461 40371968
0.3888 0.9602 30000 0.4462 40643632
0.4356 0.9666 30200 0.4461 40914064
0.419 0.9730 30400 0.4460 41182128
0.3897 0.9795 30600 0.4457 41452688
0.4594 0.9859 30800 0.4458 41721056
0.3648 0.9923 31000 0.4454 41993584
0.4395 0.9987 31200 0.4456 42266304
0.3717 1.0051 31400 0.4454 42536720
0.3335 1.0115 31600 0.4457 42810528
0.603 1.0179 31800 0.4452 43081488
0.3702 1.0243 32000 0.4454 43351904
0.67 1.0307 32200 0.4450 43622640
0.4796 1.0371 32400 0.4450 43893856
0.2008 1.0435 32600 0.4449 44164592
0.4444 1.0499 32800 0.4447 44438640
0.467 1.0563 33000 0.4448 44712640
0.3529 1.0627 33200 0.4446 44980912
0.5797 1.0691 33400 0.4447 45251328
0.5556 1.0755 33600 0.4447 45523792
0.3862 1.0819 33800 0.4445 45796960
0.389 1.0883 34000 0.4444 46067712
0.3956 1.0947 34200 0.4443 46337408
0.5701 1.1011 34400 0.4443 46611232
0.4403 1.1075 34600 0.4443 46879824
0.5335 1.1139 34800 0.4441 47155008
0.339 1.1203 35000 0.4442 47426864
0.4668 1.1267 35200 0.4441 47698224
0.5801 1.1331 35400 0.4439 47967840
0.4932 1.1395 35600 0.4442 48239792
0.4009 1.1459 35800 0.4441 48514752
0.6396 1.1523 36000 0.4441 48783136
0.4138 1.1587 36200 0.4441 49052640
0.341 1.1651 36400 0.4439 49321648
0.3171 1.1715 36600 0.4440 49592352
0.404 1.1779 36800 0.4440 49863184
0.4588 1.1843 37000 0.4440 50135184
0.4547 1.1907 37200 0.4441 50407568
0.7914 1.1971 37400 0.4440 50678192
0.4438 1.2035 37600 0.4441 50953312
0.4227 1.2099 37800 0.4440 51223392
0.287 1.2163 38000 0.4440 51491824
0.3409 1.2227 38200 0.4439 51763040
0.5219 1.2291 38400 0.4439 52033392
0.4185 1.2355 38600 0.4439 52304608
0.6725 1.2419 38800 0.4441 52574352
0.3928 1.2483 39000 0.4442 52846048
0.3627 1.2547 39200 0.4438 53118576
0.3919 1.2611 39400 0.4439 53387872
0.3773 1.2675 39600 0.4440 53659856
0.4209 1.2739 39800 0.4440 53928784
0.3773 1.2803 40000 0.4440 54198768

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
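
A small sanity check for matching this environment (note that PEFT 0.15.2.dev0 is a development build and may need to be installed from the PEFT source repository rather than PyPI):

```python
# Print installed versions to compare against the list above.
import datasets, peft, tokenizers, torch, transformers

for name, mod in [("PEFT", peft), ("Transformers", transformers),
                  ("PyTorch", torch), ("Datasets", datasets),
                  ("Tokenizers", tokenizers)]:
    print(f"{name}: {mod.__version__}")
```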