stlenc-distilled-v2
This model is a fine-tuned version of saracandu/stlenc-distilled-v2 on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.3803
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 5
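As a rough illustration of what the linear scheduler implies for the learning rate listed above, the sketch below computes the rate at a given optimizer step. It rests on assumptions not stated in this card: no warmup steps (`warmup_steps=0`) and a total of about 19,210 steps, an estimate extrapolated from the step and epoch columns of the training results. The function name `linear_lr` is hypothetical and is not part of any library.

```python
def linear_lr(step, total_steps, base_lr=5e-06, warmup_steps=0):
    """Learning rate under a linear schedule with optional linear warmup.

    base_lr comes from the hyperparameter list; warmup_steps=0 is an
    assumption, since the card does not report any warmup.
    """
    if warmup_steps > 0 and step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / warmup_steps
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical total: roughly 3842 steps per epoch times 5 epochs.
TOTAL_STEPS = 19_210

for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate starts at 5e-06, falls to about half that value midway through training, and reaches zero at the final step.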
Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 3.3937 | 0.0651 | 250 | 3.2997 |
| 3.2115 | 0.1301 | 500 | 3.2767 |
| 3.2025 | 0.1952 | 750 | 3.2675 |
| 3.1722 | 0.2603 | 1000 | 3.2447 |
| 3.0839 | 0.3254 | 1250 | 3.2229 |
| 2.8488 | 0.3904 | 1500 | 3.1718 |
| 2.7056 | 0.4555 | 1750 | 3.0489 |
| 2.5888 | 0.5206 | 2000 | 2.9679 |
| 2.395 | 0.5856 | 2250 | 2.9154 |
| 2.3511 | 0.6507 | 2500 | 2.9001 |
| 2.1649 | 0.7158 | 2750 | 2.8829 |
| 1.9847 | 0.7808 | 3000 | 2.8114 |
| 1.8525 | 0.8459 | 3250 | 2.7843 |
| 1.7891 | 0.9110 | 3500 | 2.7517 |
| 1.7504 | 0.9761 | 3750 | 2.7173 |
| 1.6783 | 1.0411 | 4000 | 2.6863 |
| 1.6719 | 1.1062 | 4250 | 2.6576 |
| 1.6199 | 1.1713 | 4500 | 2.6408 |
| 1.6311 | 1.2363 | 4750 | 2.6177 |
| 1.6007 | 1.3014 | 5000 | 2.6086 |
| 1.5567 | 1.3665 | 5250 | 2.5955 |
| 1.5692 | 1.4315 | 5500 | 2.6041 |
| 1.5537 | 1.4966 | 5750 | 2.6022 |
| 1.5829 | 1.5617 | 6000 | 2.5976 |
| 1.5861 | 1.6268 | 6250 | 2.5696 |
| 1.5579 | 1.6918 | 6500 | 2.5567 |
| 1.5477 | 1.7569 | 6750 | 2.5426 |
| 1.5365 | 1.8220 | 7000 | 2.5333 |
| 1.5601 | 1.8870 | 7250 | 2.5394 |
| 1.517 | 1.9521 | 7500 | 2.5406 |
| 1.487 | 2.0172 | 7750 | 2.5245 |
| 1.5409 | 2.0822 | 8000 | 2.5258 |
| 1.5186 | 2.1473 | 8250 | 2.5243 |
| 1.4841 | 2.2124 | 8500 | 2.4957 |
| 1.5096 | 2.2775 | 8750 | 2.5002 |
| 1.4841 | 2.3425 | 9000 | 2.4821 |
| 1.4803 | 2.4076 | 9250 | 2.4835 |
| 1.4704 | 2.4727 | 9500 | 2.4936 |
| 1.4926 | 2.5377 | 9750 | 2.4756 |
| 1.4721 | 2.6028 | 10000 | 2.4734 |
| 1.4608 | 2.6679 | 10250 | 2.4701 |
| 1.4623 | 2.7330 | 10500 | 2.4784 |
| 1.46 | 2.7980 | 10750 | 2.4586 |
| 1.4624 | 2.8631 | 11000 | 2.4566 |
| 1.485 | 2.9282 | 11250 | 2.4747 |
| 1.4407 | 2.9932 | 11500 | 2.4524 |
| 1.4803 | 3.0583 | 11750 | 2.4479 |
| 1.4343 | 3.1234 | 12000 | 2.4495 |
| 1.4391 | 3.1884 | 12250 | 2.4430 |
| 1.4473 | 3.2535 | 12500 | 2.4361 |
| 1.431 | 3.3186 | 12750 | 2.4406 |
| 1.4179 | 3.3837 | 13000 | 2.4319 |
| 1.427 | 3.4487 | 13250 | 2.4221 |
| 1.4248 | 3.5138 | 13500 | 2.4345 |
| 1.4316 | 3.5789 | 13750 | 2.4209 |
| 1.4401 | 3.6439 | 14000 | 2.4184 |
| 1.4158 | 3.7090 | 14250 | 2.4106 |
| 1.4205 | 3.7741 | 14500 | 2.4091 |
| 1.4273 | 3.8391 | 14750 | 2.4081 |
| 1.4182 | 3.9042 | 15000 | 2.3990 |
| 1.4099 | 3.9693 | 15250 | 2.4105 |
| 1.4087 | 4.0344 | 15500 | 2.3989 |
| 1.3837 | 4.0994 | 15750 | 2.4094 |
| 1.4162 | 4.1645 | 16000 | 2.3978 |
| 1.3971 | 4.2296 | 16250 | 2.3848 |
| 1.3804 | 4.2946 | 16500 | 2.3963 |
| 1.374 | 4.3597 | 16750 | 2.3939 |
| 1.3999 | 4.4248 | 17000 | 2.3807 |
| 1.3926 | 4.4898 | 17250 | 2.3879 |
| 1.3823 | 4.5549 | 17500 | 2.3847 |
| 1.3768 | 4.6200 | 17750 | 2.3820 |
| 1.3572 | 4.6851 | 18000 | 2.3872 |
| 1.4148 | 4.7501 | 18250 | 2.3819 |
| 1.3667 | 4.8152 | 18500 | 2.3777 |
| 1.3943 | 4.8803 | 18750 | 2.3796 |
| 1.3851 | 4.9453 | 19000 | 2.3803 |
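The table can also be read programmatically, for example to find the checkpoint with the lowest validation loss or to estimate the number of steps per epoch. The sketch below is a minimal example that parses only the last five rows, copied verbatim from the table; the helper name `parse_rows` is hypothetical. The derived training-set size additionally assumes a single device and no gradient accumulation, neither of which is stated in this card.

```python
# Last five rows of the training results table, copied verbatim.
ROWS = """
| 1.3572 | 4.6851 | 18000 | 2.3872 |
| 1.4148 | 4.7501 | 18250 | 2.3819 |
| 1.3667 | 4.8152 | 18500 | 2.3777 |
| 1.3943 | 4.8803 | 18750 | 2.3796 |
| 1.3851 | 4.9453 | 19000 | 2.3803 |
"""


def parse_rows(text):
    """Turn pipe-delimited table rows into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        train_loss, epoch, step, val_loss = cells
        rows.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
        })
    return rows


rows = parse_rows(ROWS)
best = min(rows, key=lambda r: r["val_loss"])
last = rows[-1]

# Steps per epoch, estimated from the final logged row.
steps_per_epoch = last["step"] / last["epoch"]

# Assumption: one device, no gradient accumulation, train_batch_size=128.
approx_train_examples = round(steps_per_epoch) * 128

print("best step:", best["step"], "val loss:", best["val_loss"])
print("steps per epoch (approx.):", round(steps_per_epoch))
print("training examples (approx.):", approx_train_examples)
```

Among these rows the lowest validation loss is 2.3777 at step 18500, slightly below the 2.3803 logged at the final evaluation step.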
Framework versions
- Transformers 4.57.3
- Pytorch 2.9.1+cu128
- Datasets 4.4.2
- Tokenizers 0.22.1