train_record_1745950253

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the record dataset. It achieves the following results on the evaluation set:

  • Loss: 4.4571
  • Num Input Tokens Seen: 54198768

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • training_steps: 40000
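The hyperparameters above imply an effective batch size of 4 (train_batch_size 2 × gradient_accumulation_steps 2), and the cosine scheduler decays the learning rate from 5e-05 toward zero over the 40,000 training steps. A minimal sketch of that schedule, assuming no warmup (the card does not list any warmup steps):

```python
import math

# Values taken from the hyperparameter list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 2
TRAINING_STEPS = 40_000

# Effective batch size: per-device batch size times accumulation steps.
total_train_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS  # 4

def cosine_lr(step: int) -> float:
    """Cosine decay from LEARNING_RATE to 0 over TRAINING_STEPS (no warmup assumed)."""
    progress = min(step, TRAINING_STEPS) / TRAINING_STEPS
    return 0.5 * LEARNING_RATE * (1.0 + math.cos(math.pi * progress))

print(total_train_batch_size)  # 4
print(cosine_lr(0))            # 5e-05 at the start of training
print(cosine_lr(20_000))       # half the peak rate at the midpoint
```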

Training results

Training Loss Epoch Step Validation Loss Input Tokens Seen
3.4622 0.0064 200 4.9109 272992
4.4474 0.0128 400 4.6776 541536
4.3464 0.0192 600 4.5747 813648
4.3474 0.0256 800 4.5402 1084496
3.707 0.0320 1000 4.5457 1355472
4.4764 0.0384 1200 4.5150 1624048
4.6147 0.0448 1400 4.5161 1893968
3.7767 0.0512 1600 4.5053 2163024
4.0815 0.0576 1800 4.5199 2436032
4.5046 0.0640 2000 4.5114 2706960
6.1891 0.0704 2200 4.5107 2976144
3.7462 0.0768 2400 4.5104 3248384
4.6499 0.0832 2600 4.5035 3519088
4.3216 0.0896 2800 4.5189 3790208
4.4919 0.0960 3000 4.5074 4059472
3.3845 0.1024 3200 4.5056 4331088
4.6709 0.1088 3400 4.5079 4601728
3.8271 0.1152 3600 4.4911 4877104
4.429 0.1216 3800 4.4881 5150656
3.6511 0.1280 4000 4.4951 5422944
5.3519 0.1344 4200 4.4855 5692368
4.2255 0.1408 4400 4.4981 5965440
4.9348 0.1472 4600 4.4977 6237632
3.4591 0.1536 4800 4.5017 6506256
4.3076 0.1600 5000 4.4971 6779376
4.1459 0.1664 5200 4.5047 7051504
3.943 0.1728 5400 4.4887 7321552
3.7287 0.1792 5600 4.4714 7592304
4.5494 0.1856 5800 4.4840 7865632
3.8875 0.1920 6000 4.4906 8135936
4.7084 0.1985 6200 4.4994 8408624
3.6507 0.2049 6400 4.4853 8677888
4.6074 0.2113 6600 4.4882 8947120
3.9965 0.2177 6800 4.4977 9216336
3.3807 0.2241 7000 4.4911 9485568
4.1261 0.2305 7200 4.4915 9758160
4.3934 0.2369 7400 4.4811 10028256
5.2692 0.2433 7600 4.4935 10300544
4.0686 0.2497 7800 4.4813 10574192
4.6755 0.2561 8000 4.4925 10844928
3.8404 0.2625 8200 4.4825 11114800
4.5697 0.2689 8400 4.4839 11383280
5.3488 0.2753 8600 4.4747 11652336
4.5469 0.2817 8800 4.4927 11924224
3.6975 0.2881 9000 4.4993 12194800
3.8447 0.2945 9200 4.4795 12466288
4.2442 0.3009 9400 4.4930 12735104
4.1695 0.3073 9600 4.4912 13003216
4.1089 0.3137 9800 4.4999 13273680
5.1543 0.3201 10000 4.5051 13545840
4.2356 0.3265 10200 4.4709 13817104
4.757 0.3329 10400 4.4854 14088032
5.2015 0.3393 10600 4.4966 14361280
4.8125 0.3457 10800 4.4881 14631040
3.7125 0.3521 11000 4.4837 14901648
4.6481 0.3585 11200 4.4879 15170800
4.8716 0.3649 11400 4.5011 15440592
6.5712 0.3713 11600 4.4706 15710608
4.0938 0.3777 11800 4.4764 15980176
4.7581 0.3841 12000 4.4922 16249072
4.3731 0.3905 12200 4.4919 16522704
4.5962 0.3969 12400 4.4913 16794064
3.8551 0.4033 12600 4.4911 17062288
4.5508 0.4097 12800 4.4974 17331072
4.2258 0.4161 13000 4.5005 17599616
3.7439 0.4225 13200 4.4908 17869424
3.3628 0.4289 13400 4.5124 18141136
3.783 0.4353 13600 4.4945 18414272
5.2144 0.4417 13800 4.5006 18685264
4.1907 0.4481 14000 4.4800 18957072
3.1513 0.4545 14200 4.4922 19230480
3.8682 0.4609 14400 4.4971 19503472
4.9699 0.4673 14600 4.4959 19777344
3.1368 0.4737 14800 4.4740 20049328
4.6479 0.4801 15000 4.4963 20319488
4.9356 0.4865 15200 4.4824 20589760
4.012 0.4929 15400 4.4831 20860624
4.1237 0.4993 15600 4.4829 21133104
5.6766 0.5057 15800 4.4748 21403072
4.0869 0.5121 16000 4.4932 21675712
4.3651 0.5185 16200 4.4654 21946528
4.2341 0.5249 16400 4.4637 22217936
4.0066 0.5313 16600 4.4884 22489168
4.5365 0.5377 16800 4.4926 22759200
4.2455 0.5441 17000 4.4940 23028128
4.2426 0.5505 17200 4.4984 23300528
4.291 0.5569 17400 4.4735 23569728
4.815 0.5633 17600 4.5050 23838464
4.839 0.5697 17800 4.4982 24109808
4.8494 0.5761 18000 4.4909 24380336
4.3989 0.5825 18200 4.4916 24653072
3.9486 0.5890 18400 4.4571 24924912
4.4004 0.5954 18600 4.5108 25196400
4.237 0.6018 18800 4.4795 25468816
4.0165 0.6082 19000 4.4926 25741776
4.9951 0.6146 19200 4.4784 26017088
3.5959 0.6210 19400 4.4900 26286480
4.5629 0.6274 19600 4.4870 26557200
4.0744 0.6338 19800 4.4970 26827696
3.8765 0.6402 20000 4.4868 27098112
4.1175 0.6466 20200 4.4988 27369984
5.3047 0.6530 20400 4.4709 27640768
4.0005 0.6594 20600 4.4891 27910480
4.4139 0.6658 20800 4.4806 28180240
4.7914 0.6722 21000 4.4793 28451984
4.6492 0.6786 21200 4.4787 28723904
5.6937 0.6850 21400 4.4880 28994096
3.4293 0.6914 21600 4.4914 29267904
3.8018 0.6978 21800 4.4886 29540768
4.4407 0.7042 22000 4.4938 29812480
4.5343 0.7106 22200 4.5006 30080624
4.3453 0.7170 22400 4.5022 30352256
4.6245 0.7234 22600 4.4961 30622032
4.41 0.7298 22800 4.4869 30894016
4.678 0.7362 23000 4.4889 31162736
5.3991 0.7426 23200 4.4910 31433344
4.5956 0.7490 23400 4.4923 31708288
4.255 0.7554 23600 4.4687 31982128
4.7721 0.7618 23800 4.4848 32253040
3.9138 0.7682 24000 4.5084 32524464
3.8332 0.7746 24200 4.4894 32794928
3.9703 0.7810 24400 4.4828 33067904
4.0106 0.7874 24600 4.4984 33336480
4.3804 0.7938 24800 4.4618 33606096
4.2001 0.8002 25000 4.4941 33878720
4.4727 0.8066 25200 4.4828 34148496
4.79 0.8130 25400 4.5010 34421392
4.8489 0.8194 25600 4.4879 34692880
4.0376 0.8258 25800 4.4957 34964656
5.177 0.8322 26000 4.4864 35234256
4.6042 0.8386 26200 4.4710 35504864
4.626 0.8450 26400 4.5074 35777296
4.4373 0.8514 26600 4.4700 36045376
4.3166 0.8578 26800 4.4870 36315872
4.4594 0.8642 27000 4.4897 36590336
4.1458 0.8706 27200 4.4807 36858080
4.2357 0.8770 27400 4.5005 37125216
4.6963 0.8834 27600 4.4925 37397648
3.6524 0.8898 27800 4.5142 37667456
4.0705 0.8962 28000 4.4866 37935760
4.1514 0.9026 28200 4.4846 38204832
4.6624 0.9090 28400 4.5107 38475552
3.0155 0.9154 28600 4.4960 38746560
5.2439 0.9218 28800 4.4847 39016288
5.1229 0.9282 29000 4.4919 39287360
4.7314 0.9346 29200 4.4841 39557440
3.8531 0.9410 29400 4.4765 39830256
4.4226 0.9474 29600 4.4810 40102464
3.617 0.9538 29800 4.4892 40371968
3.6916 0.9602 30000 4.4873 40643632
3.6975 0.9666 30200 4.4882 40914064
4.0617 0.9730 30400 4.4910 41182128
4.0031 0.9795 30600 4.4909 41452688
4.8552 0.9859 30800 4.4880 41721056
3.6645 0.9923 31000 4.4843 41993584
3.9784 0.9987 31200 4.4770 42266304
3.6653 1.0051 31400 4.4780 42536720
5.1739 1.0115 31600 4.4908 42810528
4.4113 1.0179 31800 4.4806 43081488
4.0525 1.0243 32000 4.4966 43351904
4.5779 1.0307 32200 4.4996 43622640
4.4427 1.0371 32400 4.4902 43893856
3.1415 1.0435 32600 4.4777 44164592
3.9178 1.0499 32800 4.4812 44438640
4.1746 1.0563 33000 4.4796 44712640
4.973 1.0627 33200 4.4898 44980912
4.4304 1.0691 33400 4.4772 45251328
4.1773 1.0755 33600 4.4917 45523792
4.3373 1.0819 33800 4.4946 45796960
5.0644 1.0883 34000 4.4988 46067712
3.8478 1.0947 34200 4.4749 46337408
4.7971 1.1011 34400 4.4940 46611232
4.4257 1.1075 34600 4.4740 46879824
3.9397 1.1139 34800 4.4922 47155008
4.4246 1.1203 35000 4.5051 47426864
4.6721 1.1267 35200 4.4900 47698224
3.6621 1.1331 35400 4.4876 47967840
4.291 1.1395 35600 4.4749 48239792
4.2103 1.1459 35800 4.5065 48514752
4.6992 1.1523 36000 4.5041 48783136
4.4254 1.1587 36200 4.4957 49052640
5.1153 1.1651 36400 4.4957 49321648
3.9983 1.1715 36600 4.4960 49592352
3.6453 1.1779 36800 4.4960 49863184
4.6426 1.1843 37000 4.4960 50135184
4.5039 1.1907 37200 4.4960 50407568
6.2586 1.1971 37400 4.4960 50678192
4.6008 1.2035 37600 4.4960 50953312
4.0899 1.2099 37800 4.4960 51223392
4.6102 1.2163 38000 4.4960 51491824
5.2984 1.2227 38200 4.4960 51763040
3.8424 1.2291 38400 4.4960 52033392
4.7151 1.2355 38600 4.4960 52304608
4.5647 1.2419 38800 4.4960 52574352
4.2167 1.2483 39000 4.4960 52846048
4.1054 1.2547 39200 4.4960 53118576
3.7602 1.2611 39400 4.4960 53387872
3.4438 1.2675 39600 4.4960 53659856
4.9077 1.2739 39800 4.4960 53928784
4.6876 1.2803 40000 4.4960 54198768
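Note that the headline Loss of 4.4571 is the minimum validation loss in the table (reached at step 18400), not the final value (4.4960 at step 40000). Selecting the best checkpoint from such (step, validation loss) pairs can be sketched with a few rows copied from the table above:

```python
# A few (step, validation_loss) pairs copied from the table above.
eval_history = [
    (200, 4.9109),
    (18200, 4.4916),
    (18400, 4.4571),
    (18600, 4.5108),
    (40000, 4.4960),
]

# Best checkpoint = the evaluation with the lowest validation loss.
best_step, best_loss = min(eval_history, key=lambda pair: pair[1])
print(best_step, best_loss)  # 18400 4.4571
```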

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
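Since the framework versions list PEFT, this model is a PEFT adapter on top of meta-llama/Meta-Llama-3-8B-Instruct rather than a full set of fine-tuned weights. A hedged loading sketch, assuming the adapter repo id rbelanec/train_record_1745950253, access to the gated base model, and enough GPU memory for an 8B model:

```python
BASE_MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_REPO = "rbelanec/train_record_1745950253"

def load_model():
    # Imports kept local so the constants above are usable without the libraries installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, device_map="auto")
    # Attach the fine-tuned PEFT adapter weights on top of the base model.
    model = PeftModel.from_pretrained(base, ADAPTER_REPO)
    return tokenizer, model
```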