
stlenc-distilled-v2

This model is a fine-tuned version of saracandu/stlenc-distilled-v2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.3803
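Assuming the reported evaluation loss is a mean per-token cross-entropy (the usual convention for Trainer-based language-model fine-tuning; the card does not state this), it corresponds to a perplexity of roughly 10.8:

```python
import math

# Final evaluation loss reported above.
eval_loss = 2.3803

# If the loss is a mean per-token cross-entropy (an assumption, not
# stated in the card), perplexity is simply its exponential.
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.2f}")  # ≈ 10.81
```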

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 5
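With a linear scheduler, the learning rate decays from its initial value to zero over the course of training. A minimal sketch of that schedule, assuming no warmup steps (warmup is not mentioned in the card), using the final step count of 19000 from the table below:

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 5e-6) -> float:
    """Linear decay from base_lr at step 0 down to 0 at total_steps,
    mirroring lr_scheduler_type: linear with no warmup (an assumption)."""
    remaining = max(0.0, 1.0 - step / total_steps)
    return base_lr * remaining

# Halfway through training, the learning rate is half the initial value.
print(linear_lr(0, 19000))     # 5e-06
print(linear_lr(9500, 19000))  # 2.5e-06
```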

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 3.3937        | 0.0651 | 250   | 3.2997          |
| 3.2115        | 0.1301 | 500   | 3.2767          |
| 3.2025        | 0.1952 | 750   | 3.2675          |
| 3.1722        | 0.2603 | 1000  | 3.2447          |
| 3.0839        | 0.3254 | 1250  | 3.2229          |
| 2.8488        | 0.3904 | 1500  | 3.1718          |
| 2.7056        | 0.4555 | 1750  | 3.0489          |
| 2.5888        | 0.5206 | 2000  | 2.9679          |
| 2.395         | 0.5856 | 2250  | 2.9154          |
| 2.3511        | 0.6507 | 2500  | 2.9001          |
| 2.1649        | 0.7158 | 2750  | 2.8829          |
| 1.9847        | 0.7808 | 3000  | 2.8114          |
| 1.8525        | 0.8459 | 3250  | 2.7843          |
| 1.7891        | 0.9110 | 3500  | 2.7517          |
| 1.7504        | 0.9761 | 3750  | 2.7173          |
| 1.6783        | 1.0411 | 4000  | 2.6863          |
| 1.6719        | 1.1062 | 4250  | 2.6576          |
| 1.6199        | 1.1713 | 4500  | 2.6408          |
| 1.6311        | 1.2363 | 4750  | 2.6177          |
| 1.6007        | 1.3014 | 5000  | 2.6086          |
| 1.5567        | 1.3665 | 5250  | 2.5955          |
| 1.5692        | 1.4315 | 5500  | 2.6041          |
| 1.5537        | 1.4966 | 5750  | 2.6022          |
| 1.5829        | 1.5617 | 6000  | 2.5976          |
| 1.5861        | 1.6268 | 6250  | 2.5696          |
| 1.5579        | 1.6918 | 6500  | 2.5567          |
| 1.5477        | 1.7569 | 6750  | 2.5426          |
| 1.5365        | 1.8220 | 7000  | 2.5333          |
| 1.5601        | 1.8870 | 7250  | 2.5394          |
| 1.517         | 1.9521 | 7500  | 2.5406          |
| 1.487         | 2.0172 | 7750  | 2.5245          |
| 1.5409        | 2.0822 | 8000  | 2.5258          |
| 1.5186        | 2.1473 | 8250  | 2.5243          |
| 1.4841        | 2.2124 | 8500  | 2.4957          |
| 1.5096        | 2.2775 | 8750  | 2.5002          |
| 1.4841        | 2.3425 | 9000  | 2.4821          |
| 1.4803        | 2.4076 | 9250  | 2.4835          |
| 1.4704        | 2.4727 | 9500  | 2.4936          |
| 1.4926        | 2.5377 | 9750  | 2.4756          |
| 1.4721        | 2.6028 | 10000 | 2.4734          |
| 1.4608        | 2.6679 | 10250 | 2.4701          |
| 1.4623        | 2.7330 | 10500 | 2.4784          |
| 1.46          | 2.7980 | 10750 | 2.4586          |
| 1.4624        | 2.8631 | 11000 | 2.4566          |
| 1.485         | 2.9282 | 11250 | 2.4747          |
| 1.4407        | 2.9932 | 11500 | 2.4524          |
| 1.4803        | 3.0583 | 11750 | 2.4479          |
| 1.4343        | 3.1234 | 12000 | 2.4495          |
| 1.4391        | 3.1884 | 12250 | 2.4430          |
| 1.4473        | 3.2535 | 12500 | 2.4361          |
| 1.431         | 3.3186 | 12750 | 2.4406          |
| 1.4179        | 3.3837 | 13000 | 2.4319          |
| 1.427         | 3.4487 | 13250 | 2.4221          |
| 1.4248        | 3.5138 | 13500 | 2.4345          |
| 1.4316        | 3.5789 | 13750 | 2.4209          |
| 1.4401        | 3.6439 | 14000 | 2.4184          |
| 1.4158        | 3.7090 | 14250 | 2.4106          |
| 1.4205        | 3.7741 | 14500 | 2.4091          |
| 1.4273        | 3.8391 | 14750 | 2.4081          |
| 1.4182        | 3.9042 | 15000 | 2.3990          |
| 1.4099        | 3.9693 | 15250 | 2.4105          |
| 1.4087        | 4.0344 | 15500 | 2.3989          |
| 1.3837        | 4.0994 | 15750 | 2.4094          |
| 1.4162        | 4.1645 | 16000 | 2.3978          |
| 1.3971        | 4.2296 | 16250 | 2.3848          |
| 1.3804        | 4.2946 | 16500 | 2.3963          |
| 1.374         | 4.3597 | 16750 | 2.3939          |
| 1.3999        | 4.4248 | 17000 | 2.3807          |
| 1.3926        | 4.4898 | 17250 | 2.3879          |
| 1.3823        | 4.5549 | 17500 | 2.3847          |
| 1.3768        | 4.6200 | 17750 | 2.3820          |
| 1.3572        | 4.6851 | 18000 | 2.3872          |
| 1.4148        | 4.7501 | 18250 | 2.3819          |
| 1.3667        | 4.8152 | 18500 | 2.3777          |
| 1.3943        | 4.8803 | 18750 | 2.3796          |
| 1.3851        | 4.9453 | 19000 | 2.3803          |
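The epoch and step columns imply roughly 3,840 optimizer steps per epoch; combined with the batch size of 128, this gives a back-of-the-envelope estimate of the training-set size (an estimate only, since the dataset is not described and the card does not say whether gradient accumulation was used):

```python
# First logged row of the table: step 250 corresponds to epoch 0.0651.
steps_per_epoch = 250 / 0.0651  # ≈ 3840 optimizer steps per epoch
train_batch_size = 128

# Rough implied training-set size, assuming no gradient accumulation
# (the card does not mention any).
approx_examples = round(steps_per_epoch) * train_batch_size
print(approx_examples)  # 491520
```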

Framework versions

  • Transformers 4.57.3
  • PyTorch 2.9.1+cu128
  • Datasets 4.4.2
  • Tokenizers 0.22.1
Model details

  • Model size: 0.2B parameters
  • Tensor type: F32
  • Weights format: Safetensors