train_record_1745950248

This model is a fine-tuned version of google/gemma-3-1b-it on the record dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4147
  • Num Input Tokens Seen: 55002224
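Since PEFT appears under the framework versions below, this checkpoint is presumably a LoRA-style adapter on top of google/gemma-3-1b-it rather than a full set of weights. A minimal loading sketch (the adapter repo id and the use of `AutoPeftModelForCausalLM` are assumptions, not confirmed by this card):

```python
# Hypothetical loading sketch: assumes this repo hosts a PEFT adapter
# for google/gemma-3-1b-it. The repo id below is an assumption.
BASE_MODEL = "google/gemma-3-1b-it"
ADAPTER_REPO = "rbelanec/train_record_1745950248"

def load_model():
    # Imports are deferred so the module can be inspected without
    # transformers/peft installed; calling this downloads the base
    # model and applies the fine-tuned adapter on top of it.
    from transformers import AutoTokenizer
    from peft import AutoPeftModelForCausalLM

    model = AutoPeftModelForCausalLM.from_pretrained(ADAPTER_REPO)
    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    return model, tokenizer
```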

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • training_steps: 40000
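The per-device batch size and gradient accumulation steps above multiply out to the listed total train batch size, and the final token counter gives an average throughput per optimizer step. A small sanity-check sketch (the cosine formula below ignores warmup, which the card does not mention):

```python
import math

# Values copied from the hyperparameter list above.
train_batch_size = 2
gradient_accumulation_steps = 2
learning_rate = 5e-05
training_steps = 40_000
tokens_seen = 55_002_224  # final "Input Tokens Seen" from the results

# Effective (total) train batch size per optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps

# Average tokens consumed per optimizer step.
tokens_per_step = tokens_seen / training_steps

def cosine_lr(step: int) -> float:
    """Plain cosine decay from learning_rate down to 0 over
    training_steps (a sketch: any warmup is not accounted for)."""
    return 0.5 * learning_rate * (1 + math.cos(math.pi * step / training_steps))

print(total_train_batch_size)  # 4, matching the card
print(round(tokens_per_step))  # 1375
```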

Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.7637 0.0064 200 0.8309 277264
0.7599 0.0128 400 0.7662 548976
0.8776 0.0192 600 0.7144 826016
0.6153 0.0256 800 0.7095 1099968
0.6966 0.0320 1000 0.6844 1374672
0.7239 0.0384 1200 0.6842 1647936
0.7858 0.0448 1400 0.6562 1921648
0.6263 0.0512 1600 0.6336 2194448
0.5621 0.0576 1800 0.6219 2472048
0.7004 0.0640 2000 0.6286 2746752
0.7971 0.0704 2200 0.6086 3020144
0.7552 0.0768 2400 0.5973 3296624
0.5672 0.0832 2600 0.5917 3571808
0.6317 0.0896 2800 0.5845 3847184
0.4644 0.0960 3000 0.5721 4121024
0.5244 0.1024 3200 0.5722 4396880
0.4422 0.1088 3400 0.5642 4671152
0.5486 0.1152 3600 0.5604 4950800
0.5977 0.1216 3800 0.5534 5228512
0.5191 0.1280 4000 0.5563 5504608
0.5778 0.1344 4200 0.5604 5778176
0.6491 0.1408 4400 0.5553 6055712
0.4658 0.1472 4600 0.5441 6331680
0.4587 0.1536 4800 0.5353 6604544
0.4674 0.1600 5000 0.5551 6882256
0.5196 0.1664 5200 0.5266 7159072
0.5283 0.1728 5400 0.5213 7433136
0.5664 0.1792 5600 0.5254 7707776
0.5205 0.1856 5800 0.5283 7985472
0.5836 0.1920 6000 0.5284 8259552
0.8333 0.1985 6200 0.5210 8535952
0.4692 0.2049 6400 0.5177 8809968
0.7781 0.2113 6600 0.5086 9084016
0.5772 0.2177 6800 0.5105 9357456
0.3953 0.2241 7000 0.5147 9630608
0.4812 0.2305 7200 0.5101 9907888
0.3953 0.2369 7400 0.5119 10182048
0.5148 0.2433 7600 0.5055 10458544
0.4117 0.2497 7800 0.5008 10736144
0.5146 0.2561 8000 0.5020 11010512
0.4996 0.2625 8200 0.5003 11284128
0.3909 0.2689 8400 0.5028 11556816
0.6982 0.2753 8600 0.4970 11828816
0.524 0.2817 8800 0.4928 12104176
0.5442 0.2881 9000 0.5028 12378784
0.612 0.2945 9200 0.4856 12654368
0.38 0.3009 9400 0.4887 12927088
0.459 0.3073 9600 0.4980 13199552
0.3959 0.3137 9800 0.4865 13473952
0.5858 0.3201 10000 0.4921 13750288
0.5998 0.3265 10200 0.4869 14025248
0.3912 0.3329 10400 0.4884 14300160
0.5334 0.3393 10600 0.4912 14577760
0.498 0.3457 10800 0.4882 14851280
0.4529 0.3521 11000 0.4823 15125104
0.5125 0.3585 11200 0.4843 15398624
0.6543 0.3649 11400 0.4775 15672384
0.4496 0.3713 11600 0.4766 15946384
0.3588 0.3777 11800 0.4742 16220112
0.5176 0.3841 12000 0.4816 16493920
0.5187 0.3905 12200 0.4722 16771376
0.6559 0.3969 12400 0.4672 17046656
0.3598 0.4033 12600 0.4686 17318272
0.5275 0.4097 12800 0.4729 17591696
0.4361 0.4161 13000 0.4685 17864256
0.2554 0.4225 13200 0.4721 18137984
0.4917 0.4289 13400 0.4656 18413504
0.5701 0.4353 13600 0.4712 18690528
0.5019 0.4417 13800 0.4629 18966352
0.3996 0.4481 14000 0.4658 19242160
0.457 0.4545 14200 0.4627 19518832
0.3924 0.4609 14400 0.4648 19795920
0.4476 0.4673 14600 0.4612 20073168
0.3298 0.4737 14800 0.4619 20349056
0.3888 0.4801 15000 0.4667 20622896
0.3864 0.4865 15200 0.4592 20896768
0.5478 0.4929 15400 0.4594 21171376
0.4035 0.4993 15600 0.4521 21447568
0.4274 0.5057 15800 0.4600 21722256
0.5038 0.5121 16000 0.4640 21998320
0.368 0.5185 16200 0.4569 22273616
0.3905 0.5249 16400 0.4536 22549280
0.3819 0.5313 16600 0.4502 22823984
0.3961 0.5377 16800 0.4536 23098384
0.4351 0.5441 17000 0.4484 23371136
0.3499 0.5505 17200 0.4531 23647856
0.4212 0.5569 17400 0.4478 23921008
0.3018 0.5633 17600 0.4493 24194480
0.3454 0.5697 17800 0.4463 24469312
0.6067 0.5761 18000 0.4515 24743360
0.5044 0.5825 18200 0.4462 25020352
0.3896 0.5890 18400 0.4433 25295920
0.3089 0.5954 18600 0.4401 25571232
0.6157 0.6018 18800 0.4480 25847664
0.6086 0.6082 19000 0.4437 26125328
0.5783 0.6146 19200 0.4426 26404064
0.305 0.6210 19400 0.4401 26677504
0.367 0.6274 19600 0.4471 26952544
0.5247 0.6338 19800 0.4434 27226896
0.1978 0.6402 20000 0.4404 27501216
0.4885 0.6466 20200 0.4372 27776624
0.437 0.6530 20400 0.4396 28051872
0.4365 0.6594 20600 0.4348 28325632
0.3291 0.6658 20800 0.4470 28598784
0.4128 0.6722 21000 0.4337 28874800
0.4799 0.6786 21200 0.4427 29151312
0.5341 0.6850 21400 0.4459 29425936
0.4091 0.6914 21600 0.4405 29702784
0.5556 0.6978 21800 0.4350 29979824
0.3903 0.7042 22000 0.4314 30256128
0.3614 0.7106 22200 0.4372 30528032
0.4857 0.7170 22400 0.4370 30803904
0.5741 0.7234 22600 0.4326 31077632
0.4104 0.7298 22800 0.4351 31354544
0.5615 0.7362 23000 0.4328 31626736
0.3213 0.7426 23200 0.4317 31901472
0.3883 0.7490 23400 0.4293 32179968
0.5363 0.7554 23600 0.4274 32457728
0.5532 0.7618 23800 0.4292 32732288
0.2762 0.7682 24000 0.4324 33007504
0.5548 0.7746 24200 0.4323 33281968
0.5527 0.7810 24400 0.4257 33558736
0.5945 0.7874 24600 0.4350 33830832
0.5186 0.7938 24800 0.4257 34104944
0.5345 0.8002 25000 0.4238 34381536
0.2833 0.8066 25200 0.4255 34654672
0.5015 0.8130 25400 0.4238 34931520
0.423 0.8194 25600 0.4240 35206448
0.3526 0.8258 25800 0.4248 35482800
0.337 0.8322 26000 0.4237 35756816
0.2963 0.8386 26200 0.4268 36031296
0.6279 0.8450 26400 0.4256 36307968
0.3503 0.8514 26600 0.4221 36580432
0.5573 0.8578 26800 0.4257 36855328
0.5261 0.8642 27000 0.4218 37133072
0.3644 0.8706 27200 0.4206 37404464
0.2889 0.8770 27400 0.4232 37675456
0.4124 0.8834 27600 0.4196 37951616
0.4655 0.8898 27800 0.4180 38225840
0.3642 0.8962 28000 0.4175 38498736
0.4399 0.9026 28200 0.4183 38771760
0.4757 0.9090 28400 0.4189 39045824
0.2641 0.9154 28600 0.4176 39320736
0.4197 0.9218 28800 0.4200 39594816
0.2908 0.9282 29000 0.4174 39870432
0.4782 0.9346 29200 0.4169 40144672
0.3583 0.9410 29400 0.4177 40420752
0.4362 0.9474 29600 0.4174 40696672
0.2764 0.9538 29800 0.4160 40970096
0.3663 0.9602 30000 0.4204 41245904
0.603 0.9666 30200 0.4178 41519232
0.5876 0.9730 30400 0.4194 41791520
0.3762 0.9795 30600 0.4147 42066928
0.5234 0.9859 30800 0.4166 42339616
0.4381 0.9923 31000 0.4176 42616352
0.3691 0.9987 31200 0.4167 42892688
0.3008 1.0051 31400 0.4198 43167792
0.2938 1.0115 31600 0.4211 43444592
0.4135 1.0179 31800 0.4223 43719328
0.3631 1.0243 32000 0.4220 43994064
0.4101 1.0307 32200 0.4212 44269712
0.4376 1.0371 32400 0.4213 44545408
0.1759 1.0435 32600 0.4186 44819808
0.4014 1.0499 32800 0.4202 45097904
0.2629 1.0563 33000 0.4195 45376272
0.253 1.0627 33200 0.4193 45647824
0.3565 1.0691 33400 0.4191 45922032
0.4038 1.0755 33600 0.4166 46197840
0.3745 1.0819 33800 0.4196 46474848
0.6211 1.0883 34000 0.4197 46749824
0.3364 1.0947 34200 0.4192 47023856
0.297 1.1011 34400 0.4198 47301520
0.4165 1.1075 34600 0.4196 47574864
0.6202 1.1139 34800 0.4198 47853888
0.3048 1.1203 35000 0.4190 48129792
0.4307 1.1267 35200 0.4192 48405024
0.2654 1.1331 35400 0.4193 48678592
0.2639 1.1395 35600 0.4191 48954048
0.2035 1.1459 35800 0.4188 49232480
0.4558 1.1523 36000 0.4190 49505040
0.2982 1.1587 36200 0.4191 49778864
0.2703 1.1651 36400 0.4189 50051632
0.284 1.1715 36600 0.4192 50325888
0.3993 1.1779 36800 0.4192 50601136
0.3726 1.1843 37000 0.4190 50876992
0.4126 1.1907 37200 0.4187 51153296
0.4953 1.1971 37400 0.4185 51427552
0.3836 1.2035 37600 0.4184 51707088
0.3269 1.2099 37800 0.4188 51981712
0.2623 1.2163 38000 0.4186 52254352
0.497 1.2227 38200 0.4186 52529584
0.7036 1.2291 38400 0.4185 52803776
0.2963 1.2355 38600 0.4184 53078736
0.5005 1.2419 38800 0.4183 53352672
0.2293 1.2483 39000 0.4185 53628768
0.2732 1.2547 39200 0.4185 53905216
0.2926 1.2611 39400 0.4186 54178832
0.315 1.2675 39600 0.4184 54454880
0.3353 1.2739 39800 0.4185 54727600
0.2883 1.2803 40000 0.4185 55002224
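Reading the table, validation loss bottoms out at 0.4147 around step 30600 and then plateaus near 0.4185 through step 40000, which matches the headline evaluation loss above. A quick sketch of picking the best checkpoint from such a log (only a few rows are reproduced here):

```python
# A few (step, validation_loss) pairs copied from the results table above.
eval_log = [
    (200, 0.8309),
    (10_000, 0.4921),
    (20_000, 0.4404),
    (30_600, 0.4147),
    (40_000, 0.4185),
]

# Pick the checkpoint with the lowest validation loss.
best_step, best_loss = min(eval_log, key=lambda row: row[1])
print(best_step, best_loss)  # 30600 0.4147
```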

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
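To reproduce this environment, the versions above can be pinned in a requirements file (a sketch; PEFT 0.15.2.dev0 is a development build, so it may need a source install rather than a PyPI pin):

```shell
# Write version pins matching the "Framework versions" list above.
# Note: peft 0.15.2.dev0 is a dev build and is omitted here because
# it likely requires installing from source.
cat > requirements.txt <<'EOF'
transformers==4.51.3
torch==2.6.0
datasets==3.5.0
tokenizers==0.21.1
EOF
cat requirements.txt
```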