train_record_1745950247

This model is a fine-tuned version of google/gemma-3-1b-it on the record dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4941
  • Num Input Tokens Seen: 55002224
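
For readers who prefer perplexity to raw loss, the conversion is a single exponential. The sketch below assumes the reported loss is a mean per-token cross-entropy in nats, which the card does not state explicitly; `EVAL_LOSS` and `perplexity` are illustrative names.

```python
import math

# Evaluation loss copied from the results above.
EVAL_LOSS = 0.4941


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


if __name__ == "__main__":
    print(round(perplexity(EVAL_LOSS), 3))
```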

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.3
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 40000
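
The relationship between these values can be sketched in a few lines of plain Python. The block below reproduces the total batch size and an approximate cosine learning-rate curve. It assumes a single device and no warmup, since the card lists neither a device count nor warmup steps; `NUM_DEVICES`, `total_train_batch_size`, and `cosine_lr` are illustrative names, not part of the training code.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 0.3
TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 2
TRAINING_STEPS = 40_000

# Assumption: one device, which is consistent with 2 * 2 = 4.
NUM_DEVICES = 1


def total_train_batch_size() -> int:
    """Examples consumed per optimizer step."""
    return TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES


def cosine_lr(step: int, warmup_steps: int = 0) -> float:
    """Half-cosine decay from LEARNING_RATE to 0 over TRAINING_STEPS.

    Assumption: warmup_steps defaults to 0 because the card lists no warmup.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, TRAINING_STEPS - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size())
    for step in (0, 10_000, 20_000, 30_000, 40_000):
        print(step, round(cosine_lr(step), 4))
```

Under these assumptions the learning rate has fallen to half its peak by step 20000 and reaches zero at step 40000, which fits the way validation loss flattens over the last few thousand steps of the results table.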

Training results

Training Loss    Epoch    Step    Validation Loss    Input Tokens Seen
1.2371 0.0064 200 1.5827 277264
1.3444 0.0128 400 1.2721 548976
1.5009 0.0192 600 1.2182 826016
0.8678 0.0256 800 1.1834 1099968
1.1076 0.0320 1000 1.0605 1374672
0.9496 0.0384 1200 1.0762 1647936
1.0765 0.0448 1400 0.9992 1921648
0.8033 0.0512 1600 1.0182 2194448
0.9584 0.0576 1800 1.1629 2472048
1.1006 0.0640 2000 0.9536 2746752
1.0743 0.0704 2200 0.8952 3020144
1.0446 0.0768 2400 0.8950 3296624
0.9012 0.0832 2600 0.9015 3571808
1.1622 0.0896 2800 0.8872 3847184
0.9041 0.0960 3000 0.8430 4121024
0.8787 0.1024 3200 0.8599 4396880
0.8314 0.1088 3400 0.8541 4671152
0.7955 0.1152 3600 0.8506 4950800
0.7693 0.1216 3800 0.8173 5228512
0.8709 0.1280 4000 0.8573 5504608
0.9853 0.1344 4200 0.8288 5778176
1.0905 0.1408 4400 0.8487 6055712
0.846 0.1472 4600 0.8267 6331680
0.7685 0.1536 4800 0.8237 6604544
0.8457 0.1600 5000 0.8814 6882256
0.9768 0.1664 5200 0.7992 7159072
0.8985 0.1728 5400 0.7898 7433136
0.9597 0.1792 5600 0.7926 7707776
0.7812 0.1856 5800 0.7972 7985472
0.8175 0.1920 6000 0.8064 8259552
0.8294 0.1985 6200 0.7868 8535952
0.7955 0.2049 6400 0.7671 8809968
0.8651 0.2113 6600 0.7742 9084016
0.8679 0.2177 6800 0.7769 9357456
0.7892 0.2241 7000 0.7837 9630608
0.7311 0.2305 7200 0.7955 9907888
0.7695 0.2369 7400 0.7727 10182048
0.9169 0.2433 7600 0.7677 10458544
0.7497 0.2497 7800 0.7552 10736144
0.8553 0.2561 8000 0.7171 11010512
0.6293 0.2625 8200 0.6951 11284128
0.6334 0.2689 8400 0.6879 11556816
0.8279 0.2753 8600 0.7055 11828816
0.6041 0.2817 8800 0.6626 12104176
0.6143 0.2881 9000 0.6640 12378784
0.8597 0.2945 9200 0.6497 12654368
0.6403 0.3009 9400 0.6553 12927088
0.6487 0.3073 9600 0.6484 13199552
0.5817 0.3137 9800 0.6355 13473952
0.6924 0.3201 10000 0.6559 13750288
0.5349 0.3265 10200 0.6239 14025248
0.556 0.3329 10400 0.6419 14300160
0.5425 0.3393 10600 0.6385 14577760
0.6794 0.3457 10800 0.6185 14851280
0.5348 0.3521 11000 0.6385 15125104
0.6781 0.3585 11200 0.6159 15398624
0.8133 0.3649 11400 0.6244 15672384
0.5864 0.3713 11600 0.6171 15946384
0.3912 0.3777 11800 0.6202 16220112
0.6575 0.3841 12000 0.6192 16493920
0.5405 0.3905 12200 0.5922 16771376
0.7828 0.3969 12400 0.5894 17046656
0.4658 0.4033 12600 0.5989 17318272
0.7755 0.4097 12800 0.6354 17591696
0.5444 0.4161 13000 0.6029 17864256
0.5456 0.4225 13200 0.5986 18137984
0.7185 0.4289 13400 0.5926 18413504
0.6642 0.4353 13600 0.5815 18690528
0.6459 0.4417 13800 0.5842 18966352
0.5672 0.4481 14000 0.5744 19242160
0.5367 0.4545 14200 0.5751 19518832
0.5516 0.4609 14400 0.5729 19795920
0.5055 0.4673 14600 0.5889 20073168
0.399 0.4737 14800 0.5754 20349056
0.5499 0.4801 15000 0.5825 20622896
0.5207 0.4865 15200 0.5702 20896768
0.5521 0.4929 15400 0.5712 21171376
0.5371 0.4993 15600 0.5752 21447568
0.6351 0.5057 15800 0.5704 21722256
0.5437 0.5121 16000 0.5732 21998320
0.5628 0.5185 16200 0.5639 22273616
0.626 0.5249 16400 0.5587 22549280
0.5518 0.5313 16600 0.5643 22823984
0.4703 0.5377 16800 0.5626 23098384
0.5579 0.5441 17000 0.5629 23371136
0.4205 0.5505 17200 0.5602 23647856
0.5438 0.5569 17400 0.5586 23921008
0.4923 0.5633 17600 0.5529 24194480
0.4893 0.5697 17800 0.5554 24469312
0.5432 0.5761 18000 0.5534 24743360
0.5548 0.5825 18200 0.5488 25020352
0.5781 0.5890 18400 0.5530 25295920
0.5579 0.5954 18600 0.5463 25571232
0.693 0.6018 18800 0.5611 25847664
0.6779 0.6082 19000 0.5611 26125328
0.6462 0.6146 19200 0.5527 26404064
0.4534 0.6210 19400 0.5456 26677504
0.5426 0.6274 19600 0.5539 26952544
0.5724 0.6338 19800 0.5427 27226896
0.3026 0.6402 20000 0.5432 27501216
0.5014 0.6466 20200 0.5492 27776624
0.5602 0.6530 20400 0.5473 28051872
0.6304 0.6594 20600 0.5416 28325632
0.5525 0.6658 20800 0.5506 28598784
0.5865 0.6722 21000 0.5343 28874800
0.5382 0.6786 21200 0.5439 29151312
0.6467 0.6850 21400 0.5411 29425936
0.565 0.6914 21600 0.5407 29702784
0.6301 0.6978 21800 0.5329 29979824
0.4458 0.7042 22000 0.5339 30256128
0.452 0.7106 22200 0.5392 30528032
0.608 0.7170 22400 0.5369 30803904
0.7056 0.7234 22600 0.5329 31077632
0.5536 0.7298 22800 0.5303 31354544
0.594 0.7362 23000 0.5272 31626736
0.4026 0.7426 23200 0.5333 31901472
0.4775 0.7490 23400 0.5251 32179968
0.8036 0.7554 23600 0.5234 32457728
0.5097 0.7618 23800 0.5252 32732288
0.3575 0.7682 24000 0.5280 33007504
0.6269 0.7746 24200 0.5304 33281968
0.6274 0.7810 24400 0.5320 33558736
0.5539 0.7874 24600 0.5233 33830832
0.5876 0.7938 24800 0.5243 34104944
0.5314 0.8002 25000 0.5217 34381536
0.5158 0.8066 25200 0.5267 34654672
0.7053 0.8130 25400 0.5217 34931520
0.4934 0.8194 25600 0.5183 35206448
0.4797 0.8258 25800 0.5185 35482800
0.5238 0.8322 26000 0.5199 35756816
0.4021 0.8386 26200 0.5175 36031296
0.7042 0.8450 26400 0.5225 36307968
0.6731 0.8514 26600 0.5187 36580432
0.5761 0.8578 26800 0.5182 36855328
0.616 0.8642 27000 0.5133 37133072
0.3624 0.8706 27200 0.5141 37404464
0.5559 0.8770 27400 0.5117 37675456
0.4689 0.8834 27600 0.5123 37951616
0.5883 0.8898 27800 0.5095 38225840
0.4924 0.8962 28000 0.5094 38498736
0.5541 0.9026 28200 0.5092 38771760
0.4895 0.9090 28400 0.5091 39045824
0.3437 0.9154 28600 0.5097 39320736
0.4957 0.9218 28800 0.5113 39594816
0.5137 0.9282 29000 0.5090 39870432
0.5425 0.9346 29200 0.5091 40144672
0.4358 0.9410 29400 0.5062 40420752
0.4746 0.9474 29600 0.5064 40696672
0.3488 0.9538 29800 0.5055 40970096
0.5119 0.9602 30000 0.5056 41245904
0.6609 0.9666 30200 0.5052 41519232
0.5844 0.9730 30400 0.5048 41791520
0.4285 0.9795 30600 0.5039 42066928
0.56 0.9859 30800 0.5050 42339616
0.5263 0.9923 31000 0.5027 42616352
0.6078 0.9987 31200 0.5032 42892688
0.4005 1.0051 31400 0.5013 43167792
0.3178 1.0115 31600 0.5009 43444592
0.5995 1.0179 31800 0.4998 43719328
0.4926 1.0243 32000 0.4999 43994064
0.5995 1.0307 32200 0.4990 44269712
0.5196 1.0371 32400 0.4984 44545408
0.3068 1.0435 32600 0.4980 44819808
0.4889 1.0499 32800 0.4972 45097904
0.4359 1.0563 33000 0.4977 45376272
0.4718 1.0627 33200 0.4974 45647824
0.536 1.0691 33400 0.4979 45922032
0.5677 1.0755 33600 0.4972 46197840
0.4757 1.0819 33800 0.4974 46474848
0.5645 1.0883 34000 0.4970 46749824
0.5164 1.0947 34200 0.4962 47023856
0.4217 1.1011 34400 0.4973 47301520
0.5314 1.1075 34600 0.4966 47574864
0.7575 1.1139 34800 0.4967 47853888
0.3661 1.1203 35000 0.4956 48129792
0.5163 1.1267 35200 0.4956 48405024
0.3735 1.1331 35400 0.4951 48678592
0.3249 1.1395 35600 0.4948 48954048
0.3284 1.1459 35800 0.4949 49232480
0.6086 1.1523 36000 0.4946 49505040
0.4185 1.1587 36200 0.4948 49778864
0.4373 1.1651 36400 0.4945 50051632
0.3587 1.1715 36600 0.4946 50325888
0.456 1.1779 36800 0.4945 50601136
0.4731 1.1843 37000 0.4947 50876992
0.5226 1.1907 37200 0.4948 51153296
0.5659 1.1971 37400 0.4949 51427552
0.5782 1.2035 37600 0.4945 51707088
0.4875 1.2099 37800 0.4944 51981712
0.3913 1.2163 38000 0.4942 52254352
0.4959 1.2227 38200 0.4944 52529584
0.7156 1.2291 38400 0.4943 52803776
0.4096 1.2355 38600 0.4942 53078736
0.5082 1.2419 38800 0.4941 53352672
0.3781 1.2483 39000 0.4943 53628768
0.4386 1.2547 39200 0.4942 53905216
0.4696 1.2611 39400 0.4943 54178832
0.5154 1.2675 39600 0.4943 54454880
0.4841 1.2739 39800 0.4943 54727600
0.4098 1.2803 40000 0.4942 55002224
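
The log above is easy to analyse once each row is split on whitespace. The sketch below parses a handful of rows copied verbatim from the table (not the full log), picks the row with the lowest validation loss, and derives the average number of tokens per optimizer step. `EvalRow` and `parse_rows` are illustrative names.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    tokens_seen: int


# A few rows copied verbatim from the table above (not the full log).
SAMPLE_ROWS = """\
1.2371 0.0064 200 1.5827 277264
0.3026 0.6402 20000 0.5432 27501216
0.4096 1.2355 38600 0.4942 53078736
0.5082 1.2419 38800 0.4941 53352672
0.3781 1.2483 39000 0.4943 53628768
0.4098 1.2803 40000 0.4942 55002224
"""


def parse_rows(text: str) -> List[EvalRow]:
    """Parse whitespace-separated rows, skipping lines without five fields."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue
        rows.append(
            EvalRow(
                float(parts[0]),
                float(parts[1]),
                int(parts[2]),
                float(parts[3]),
                int(parts[4]),
            )
        )
    return rows


rows = parse_rows(SAMPLE_ROWS)
best = min(rows, key=lambda r: r.val_loss)
final = rows[-1]

# Average tokens consumed per optimizer step over the whole run.
tokens_per_step = final.tokens_seen / final.step
# Rough steps per epoch implied by the final row (the epoch column is rounded).
steps_per_epoch = final.step / final.epoch

if __name__ == "__main__":
    print("best:", best.step, best.val_loss)
    print("tokens/step:", round(tokens_per_step, 1))
    print("steps/epoch:", round(steps_per_epoch))
```

In the full table the lowest validation loss, 0.4941, occurs at step 38800, and the final row at step 40000 reports 0.4942. The headline loss on this card therefore matches the step-38800 row rather than the last one.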

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
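
Because PEFT is among the frameworks, the checkpoint is presumably an adapter that must be attached to the base model before use. The following is a minimal loading sketch, not the author's documented procedure: the adapter ID `rbelanec/train_record_1745950247` is taken from this card's model tree entry, `load_adapter` is a hypothetical helper, and the card does not specify a prompt format or generation settings.

```python
BASE_MODEL_ID = "google/gemma-3-1b-it"
# Assumption: the adapter lives under the repository named on this card's page.
ADAPTER_ID = "rbelanec/train_record_1745950247"


def load_adapter(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Return (tokenizer, model) with the adapter attached to the base model.

    Imports are kept local so the module can be inspected without the
    libraries installed. Calling this downloads weights from the Hub.
    """
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id)
    # Attach the fine-tuned adapter weights on top of the frozen base model.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling `load_adapter()` needs network access, and the base model may require accepting its license on the Hub first.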

Model tree for rbelanec/train_record_1745950247

Adapter of google/gemma-3-1b-it