
stlenc-distilled-v2

This model is a fine-tuned version of saracandu/stlenc-distilled-v2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.3803
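Assuming the reported evaluation loss is a mean per-token cross-entropy (the usual convention for Trainer-based language-model fine-tuning; the card does not state this), it corresponds to a perplexity of roughly 10.8:

```python
import math

# Final evaluation loss reported above.
eval_loss = 2.3803

# If the loss is a mean per-token cross-entropy (an assumption, not
# stated in the card), perplexity is simply its exponential.
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.2f}")  # ≈ 10.81
```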

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 5
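With a linear scheduler, the learning rate decays from its initial value to zero over the course of training. A minimal sketch of that schedule, assuming no warmup steps (warmup is not mentioned in the card), using the final step count of 19000 from the table below:

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 5e-6) -> float:
    """Linear decay from base_lr at step 0 down to 0 at total_steps,
    mirroring lr_scheduler_type: linear with no warmup (an assumption)."""
    remaining = max(0.0, 1.0 - step / total_steps)
    return base_lr * remaining

# Halfway through training, the learning rate is half the initial value.
print(linear_lr(0, 19000))     # 5e-06
print(linear_lr(9500, 19000))  # 2.5e-06
```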

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 3.3937        | 0.0651 | 250   | 3.2997          |
| 3.2115        | 0.1301 | 500   | 3.2767          |
| 3.2025        | 0.1952 | 750   | 3.2675          |
| 3.1722        | 0.2603 | 1000  | 3.2447          |
| 3.0839        | 0.3254 | 1250  | 3.2229          |
| 2.8488        | 0.3904 | 1500  | 3.1718          |
| 2.7056        | 0.4555 | 1750  | 3.0489          |
| 2.5888        | 0.5206 | 2000  | 2.9679          |
| 2.395         | 0.5856 | 2250  | 2.9154          |
| 2.3511        | 0.6507 | 2500  | 2.9001          |
| 2.1649        | 0.7158 | 2750  | 2.8829          |
| 1.9847        | 0.7808 | 3000  | 2.8114          |
| 1.8525        | 0.8459 | 3250  | 2.7843          |
| 1.7891        | 0.9110 | 3500  | 2.7517          |
| 1.7504        | 0.9761 | 3750  | 2.7173          |
| 1.6783        | 1.0411 | 4000  | 2.6863          |
| 1.6719        | 1.1062 | 4250  | 2.6576          |
| 1.6199        | 1.1713 | 4500  | 2.6408          |
| 1.6311        | 1.2363 | 4750  | 2.6177          |
| 1.6007        | 1.3014 | 5000  | 2.6086          |
| 1.5567        | 1.3665 | 5250  | 2.5955          |
| 1.5692        | 1.4315 | 5500  | 2.6041          |
| 1.5537        | 1.4966 | 5750  | 2.6022          |
| 1.5829        | 1.5617 | 6000  | 2.5976          |
| 1.5861        | 1.6268 | 6250  | 2.5696          |
| 1.5579        | 1.6918 | 6500  | 2.5567          |
| 1.5477        | 1.7569 | 6750  | 2.5426          |
| 1.5365        | 1.8220 | 7000  | 2.5333          |
| 1.5601        | 1.8870 | 7250  | 2.5394          |
| 1.517         | 1.9521 | 7500  | 2.5406          |
| 1.487         | 2.0172 | 7750  | 2.5245          |
| 1.5409        | 2.0822 | 8000  | 2.5258          |
| 1.5186        | 2.1473 | 8250  | 2.5243          |
| 1.4841        | 2.2124 | 8500  | 2.4957          |
| 1.5096        | 2.2775 | 8750  | 2.5002          |
| 1.4841        | 2.3425 | 9000  | 2.4821          |
| 1.4803        | 2.4076 | 9250  | 2.4835          |
| 1.4704        | 2.4727 | 9500  | 2.4936          |
| 1.4926        | 2.5377 | 9750  | 2.4756          |
| 1.4721        | 2.6028 | 10000 | 2.4734          |
| 1.4608        | 2.6679 | 10250 | 2.4701          |
| 1.4623        | 2.7330 | 10500 | 2.4784          |
| 1.46          | 2.7980 | 10750 | 2.4586          |
| 1.4624        | 2.8631 | 11000 | 2.4566          |
| 1.485         | 2.9282 | 11250 | 2.4747          |
| 1.4407        | 2.9932 | 11500 | 2.4524          |
| 1.4803        | 3.0583 | 11750 | 2.4479          |
| 1.4343        | 3.1234 | 12000 | 2.4495          |
| 1.4391        | 3.1884 | 12250 | 2.4430          |
| 1.4473        | 3.2535 | 12500 | 2.4361          |
| 1.431         | 3.3186 | 12750 | 2.4406          |
| 1.4179        | 3.3837 | 13000 | 2.4319          |
| 1.427         | 3.4487 | 13250 | 2.4221          |
| 1.4248        | 3.5138 | 13500 | 2.4345          |
| 1.4316        | 3.5789 | 13750 | 2.4209          |
| 1.4401        | 3.6439 | 14000 | 2.4184          |
| 1.4158        | 3.7090 | 14250 | 2.4106          |
| 1.4205        | 3.7741 | 14500 | 2.4091          |
| 1.4273        | 3.8391 | 14750 | 2.4081          |
| 1.4182        | 3.9042 | 15000 | 2.3990          |
| 1.4099        | 3.9693 | 15250 | 2.4105          |
| 1.4087        | 4.0344 | 15500 | 2.3989          |
| 1.3837        | 4.0994 | 15750 | 2.4094          |
| 1.4162        | 4.1645 | 16000 | 2.3978          |
| 1.3971        | 4.2296 | 16250 | 2.3848          |
| 1.3804        | 4.2946 | 16500 | 2.3963          |
| 1.374         | 4.3597 | 16750 | 2.3939          |
| 1.3999        | 4.4248 | 17000 | 2.3807          |
| 1.3926        | 4.4898 | 17250 | 2.3879          |
| 1.3823        | 4.5549 | 17500 | 2.3847          |
| 1.3768        | 4.6200 | 17750 | 2.3820          |
| 1.3572        | 4.6851 | 18000 | 2.3872          |
| 1.4148        | 4.7501 | 18250 | 2.3819          |
| 1.3667        | 4.8152 | 18500 | 2.3777          |
| 1.3943        | 4.8803 | 18750 | 2.3796          |
| 1.3851        | 4.9453 | 19000 | 2.3803          |
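The epoch and step columns imply roughly 3,840 optimizer steps per epoch; combined with the batch size of 128, this gives a back-of-the-envelope estimate of the training-set size (an estimate only, since the dataset is not described and the card does not say whether gradient accumulation was used):

```python
# First logged row of the table: step 250 corresponds to epoch 0.0651.
steps_per_epoch = 250 / 0.0651  # ≈ 3840 optimizer steps per epoch
train_batch_size = 128

# Rough implied training-set size, assuming no gradient accumulation
# (the card does not mention any).
approx_examples = round(steps_per_epoch) * train_batch_size
print(approx_examples)  # 491520
```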

Framework versions

  • Transformers 4.57.3
  • PyTorch 2.9.1+cu128
  • Datasets 4.4.2
  • Tokenizers 0.22.1
Model details

  • Model size: 0.2B parameters
  • Tensor type: F32
  • Weights format: Safetensors