100M_high_100_6910

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.3078
  • Accuracy: 0.3941
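
If this is a language-modeling objective, the evaluation loss may be easier to read as a perplexity. The sketch below assumes the reported loss is a mean per-token cross-entropy in nats, which this card does not state. The function name `perplexity_from_loss` is illustrative only.

```python
import math


def perplexity_from_loss(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


# Assumption: 3.3078 is a mean per-token cross-entropy measured in nats.
eval_loss = 3.3078
print(f"perplexity ~ {perplexity_from_loss(eval_loss):.1f}")
```

Under that assumption, the reported loss corresponds to a perplexity of about 27.3.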

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0006
  • train_batch_size: 32
  • eval_batch_size: 16
  • seed: 6910
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.98), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 10
  • mixed_precision_training: Native AMP
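
To make the learning-rate settings above concrete, here is a minimal sketch of a linear schedule with warmup: the rate rises linearly from 0 to `learning_rate` over the warmup steps, then decays linearly to 0. The default `total_steps=92_910` is an assumption, estimated from the results table as roughly 9,291 steps per epoch times 10 epochs; the card does not state it. The function name is hypothetical and is not the training code used for this model.

```python
def linear_warmup_decay_lr(step: int,
                           base_lr: float = 6e-4,
                           warmup_steps: int = 100,
                           total_steps: int = 92_910) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Warmup phase: scale grows from 0 to 1.
        scale = step / max(1, warmup_steps)
    else:
        # Decay phase: scale shrinks from 1 to 0 at total_steps.
        scale = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * scale


# total_steps is an estimate (see above), so late-training values are approximate.
for step in (0, 50, 100, 46_505, 92_910):
    print(step, f"{linear_warmup_decay_lr(step):.6f}")
```

With these values the peak rate of 0.0006 is reached at step 100, so warmup covers only a tiny fraction of the run.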

Training results

Training Loss | Epoch | Step | Validation Loss | Accuracy
5.1033 | 0.1076 | 1000 | 5.0180 | 0.2274
4.5988 | 0.2153 | 2000 | 4.5230 | 0.2689
4.3355 | 0.3229 | 3000 | 4.2498 | 0.2975
4.1754 | 0.4305 | 4000 | 4.1035 | 0.3111
4.0607 | 0.5382 | 5000 | 3.9984 | 0.3212
4.0022 | 0.6458 | 6000 | 3.9244 | 0.3273
3.9275 | 0.7534 | 7000 | 3.8683 | 0.3325
3.8874 | 0.8610 | 8000 | 3.8202 | 0.3373
3.8623 | 0.9687 | 9000 | 3.7831 | 0.3403
3.7478 | 1.0763 | 10000 | 3.7542 | 0.3438
3.7484 | 1.1839 | 11000 | 3.7274 | 0.3462
3.731 | 1.2916 | 12000 | 3.7030 | 0.3489
3.7217 | 1.3992 | 13000 | 3.6806 | 0.3509
3.7223 | 1.5068 | 14000 | 3.6594 | 0.3529
3.6733 | 1.6145 | 15000 | 3.6423 | 0.3546
3.6659 | 1.7221 | 16000 | 3.6230 | 0.3564
3.6659 | 1.8297 | 17000 | 3.6099 | 0.3582
3.6405 | 1.9374 | 18000 | 3.5942 | 0.3595
3.5552 | 2.0450 | 19000 | 3.5863 | 0.3607
3.5517 | 2.1526 | 20000 | 3.5759 | 0.3619
3.5622 | 2.2603 | 21000 | 3.5652 | 0.3634
3.5521 | 2.3679 | 22000 | 3.5536 | 0.3646
3.5272 | 2.4755 | 23000 | 3.5438 | 0.3653
3.5408 | 2.5831 | 24000 | 3.5349 | 0.3664
3.5384 | 2.6908 | 25000 | 3.5265 | 0.3676
3.5453 | 2.7984 | 26000 | 3.5167 | 0.3680
3.5254 | 2.9060 | 27000 | 3.5096 | 0.3690
3.4289 | 3.0137 | 28000 | 3.5023 | 0.3698
3.4573 | 3.1213 | 29000 | 3.5012 | 0.3706
3.4612 | 3.2289 | 30000 | 3.4919 | 0.3712
3.4596 | 3.3366 | 31000 | 3.4854 | 0.3723
3.4761 | 3.4442 | 32000 | 3.4807 | 0.3726
3.4718 | 3.5518 | 33000 | 3.4740 | 0.3732
3.4625 | 3.6595 | 34000 | 3.4664 | 0.3741
3.4482 | 3.7671 | 35000 | 3.4618 | 0.3749
3.4544 | 3.8747 | 36000 | 3.4568 | 0.3747
3.4332 | 3.9823 | 37000 | 3.4505 | 0.3760
3.3729 | 4.0900 | 38000 | 3.4507 | 0.3762
3.3957 | 4.1976 | 39000 | 3.4467 | 0.3766
3.4135 | 4.3052 | 40000 | 3.4434 | 0.3771
3.4065 | 4.4129 | 41000 | 3.4378 | 0.3779
3.3837 | 4.5205 | 42000 | 3.4326 | 0.3780
3.3988 | 4.6281 | 43000 | 3.4265 | 0.3786
3.4003 | 4.7358 | 44000 | 3.4218 | 0.3791
3.3718 | 4.8434 | 45000 | 3.4193 | 0.3795
3.3876 | 4.9510 | 46000 | 3.4116 | 0.3805
3.3072 | 5.0587 | 47000 | 3.4152 | 0.3806
3.3312 | 5.1663 | 48000 | 3.4141 | 0.3808
3.33 | 5.2739 | 49000 | 3.4092 | 0.3810
3.3217 | 5.3816 | 50000 | 3.4062 | 0.3815
3.3232 | 5.4892 | 51000 | 3.4009 | 0.3818
3.3414 | 5.5968 | 52000 | 3.3978 | 0.3823
3.3284 | 5.7044 | 53000 | 3.3928 | 0.3824
3.3423 | 5.8121 | 54000 | 3.3889 | 0.3831
3.3375 | 5.9197 | 55000 | 3.3830 | 0.3838
3.2457 | 6.0273 | 56000 | 3.3864 | 0.3836
3.2743 | 6.1350 | 57000 | 3.3878 | 0.3837
3.278 | 6.2426 | 58000 | 3.3834 | 0.3845
3.2893 | 6.3502 | 59000 | 3.3806 | 0.3846
3.2858 | 6.4579 | 60000 | 3.3779 | 0.3851
3.2612 | 6.5655 | 61000 | 3.3731 | 0.3855
3.2874 | 6.6731 | 62000 | 3.3683 | 0.3860
3.2913 | 6.7808 | 63000 | 3.3637 | 0.3862
3.2874 | 6.8884 | 64000 | 3.3623 | 0.3865
3.3006 | 6.9960 | 65000 | 3.3566 | 0.3871
3.2113 | 7.1036 | 66000 | 3.3637 | 0.3868
3.2365 | 7.2113 | 67000 | 3.3613 | 0.3871
3.2464 | 7.3189 | 68000 | 3.3587 | 0.3875
3.2246 | 7.4265 | 69000 | 3.3527 | 0.3879
3.2492 | 7.5342 | 70000 | 3.3493 | 0.3885
3.2229 | 7.6418 | 71000 | 3.3436 | 0.3890
3.2475 | 7.7494 | 72000 | 3.3427 | 0.3891
3.2372 | 7.8571 | 73000 | 3.3380 | 0.3897
3.256 | 7.9647 | 74000 | 3.3354 | 0.3900
3.1567 | 8.0723 | 75000 | 3.3419 | 0.3896
3.1763 | 8.1800 | 76000 | 3.3383 | 0.3903
3.1703 | 8.2876 | 77000 | 3.3362 | 0.3905
3.1765 | 8.3952 | 78000 | 3.3342 | 0.3905
3.1938 | 8.5029 | 79000 | 3.3285 | 0.3911
3.167 | 8.6105 | 80000 | 3.3272 | 0.3913
3.208 | 8.7181 | 81000 | 3.3216 | 0.3917
3.1722 | 8.8257 | 82000 | 3.3193 | 0.3921
3.1905 | 8.9334 | 83000 | 3.3167 | 0.3924
3.1372 | 9.0410 | 84000 | 3.3181 | 0.3925
3.1427 | 9.1486 | 85000 | 3.3179 | 0.3927
3.1571 | 9.2563 | 86000 | 3.3175 | 0.3929
3.1387 | 9.3639 | 87000 | 3.3142 | 0.3933
3.1296 | 9.4715 | 88000 | 3.3115 | 0.3935
3.1449 | 9.5792 | 89000 | 3.3079 | 0.3939
3.126 | 9.6868 | 90000 | 3.3078 | 0.3941
3.1172 | 9.7944 | 91000 | 3.3053 | 0.3943
3.1328 | 9.9021 | 92000 | 3.3035 | 0.3944
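
The rows above can be used to derive quantities the card does not state directly, such as the approximate number of optimizer steps per epoch and the gap between training and validation loss. The sketch below parses three rows copied from the table. The helper `parse_row` is hypothetical, and the training loss is likely a running average over the logging interval, so the gap is only indicative.

```python
# Three rows copied from the results table: first, middle, and last.
ROWS = """\
5.1033 0.1076 1000 5.0180 0.2274
3.3876 4.9510 46000 3.4116 0.3805
3.1328 9.9021 92000 3.3035 0.3944
"""


def parse_row(line: str) -> dict:
    """Parse one table row; accepts space- or pipe-delimited columns."""
    train_loss, epoch, step, val_loss, accuracy = line.replace("|", " ").split()
    return {
        "train_loss": float(train_loss),
        "epoch": float(epoch),
        "step": int(step),
        "val_loss": float(val_loss),
        "accuracy": float(accuracy),
    }


rows = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
last = rows[-1]

# Steps per epoch follows from the step count and the fractional epoch.
steps_per_epoch = last["step"] / last["epoch"]
# Difference between validation and (logged) training loss at the last row.
gap = last["val_loss"] - last["train_loss"]

print(f"steps per epoch ~ {steps_per_epoch:.0f}")
print(f"validation - training loss at step {last['step']}: {gap:.4f}")
```

The estimate of roughly 9,291 steps per epoch, combined with `train_batch_size: 32`, bounds the training set size only if the number of devices and any gradient accumulation are known, and the card does not record either.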

Framework versions

  • Transformers 4.47.0.dev0
  • Pytorch 2.5.0+cu124
  • Datasets 3.0.2
  • Tokenizers 0.20.1
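
When trying to reproduce these results, it can help to compare the local environment against the versions listed above. This is a minimal sketch using only the standard library. The distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are the usual PyPI names and are assumed here, and both helper functions are hypothetical.

```python
from importlib import metadata

# Versions listed in this card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.47.0.dev0",
    "torch": "2.5.0+cu124",
    "datasets": "3.0.2",
    "tokenizers": "0.20.1",
}


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(expected: dict, lookup=installed_version) -> dict:
    """Map each package whose installed version differs to (expected, found)."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


for package, (wanted, found) in version_mismatches(EXPECTED).items():
    print(f"{package}: card lists {wanted}, found {found or 'not installed'}")
```

An exact match is a strict criterion; a development build such as 4.47.0.dev0 may not be installable from PyPI at all, so a nearby release may be the practical choice.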

Safetensors

  • Model size: 0.1B params
  • Tensor type: F32
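
The model size and tensor type give a rough lower bound on the memory needed just to hold the weights. The sketch below assumes the rounded figure of 0.1B parameters and 4 bytes per F32 value. It ignores activations, optimizer state, and framework overhead, and the helper name is illustrative.

```python
# Bytes per parameter for common tensor types.
BYTES_PER_PARAM = {"F32": 4, "F16": 2, "BF16": 2}


def weight_memory_mb(num_params: float, dtype: str = "F32") -> float:
    """Approximate memory (in MB, 1 MB = 1e6 bytes) to store the weights alone."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


# Assumption: 0.1B is a rounded parameter count; the exact number is not given.
approx_params = 0.1e9
print(f"F32 weights: ~{weight_memory_mb(approx_params):.0f} MB")
print(f"F16 weights: ~{weight_memory_mb(approx_params, 'F16'):.0f} MB")
```

At F32 this comes to roughly 400 MB for the weights, which would halve if they were cast to a 16-bit type.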

Collection including craa/100M_high_100_6910