char-text-reversal

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set (a sketch of how these metrics can be computed follows the list):

  • Loss: 1.0816
  • Char Accuracy: 0.0065
  • Sequence Accuracy: 0.0
  • Edit Distance: 38.583
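
The card does not document the task or the metric implementations, so the following is only an illustrative sketch: it assumes (from the model name) that the task is character-level text reversal, and that Char Accuracy, Sequence Accuracy, and Edit Distance are the usual per-character match rate, exact-match rate, and Levenshtein distance between prediction and reference strings.

```python
# Illustrative sketch of the three evaluation metrics above (assumed definitions;
# the exact implementation used for this card is not published).

def char_accuracy(pred: str, ref: str) -> float:
    """Fraction of aligned positions where characters match, normalized by reference length."""
    if not ref:
        return 0.0
    return sum(p == r for p, r in zip(pred, ref)) / len(ref)

def sequence_accuracy(pred: str, ref: str) -> float:
    """1.0 only when the prediction matches the reference exactly."""
    return float(pred == ref)

def edit_distance(pred: str, ref: str) -> int:
    """Levenshtein distance computed with a single-row dynamic program."""
    m, n = len(pred), len(ref)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            cost = 0 if pred[i - 1] == ref[j - 1] else 1
            dp[j] = min(dp[j] + 1,      # deletion
                        dp[j - 1] + 1,  # insertion
                        prev + cost)    # substitution
            prev = cur
    return dp[n]

# Assumed task: map a string to its character-reversed form.
ref = "hello world"[::-1]               # "dlrow olleh"
pred = "dlrow olleh"
print(char_accuracy(pred, ref))         # 1.0
print(sequence_accuracy(pred, ref))     # 1.0
print(edit_distance(pred, ref))         # 0
```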

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after the list):

  • learning_rate: 0.0001
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 42
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 1000
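
These settings map directly onto the Hugging Face Trainer API. A minimal sketch, assuming single-device training (so the listed batch sizes are per-device values) and an assumed output_dir; the actual training script is not published:

```python
from transformers import TrainingArguments

# Hyperparameters from the list above, expressed as TrainingArguments
# (field names per Transformers 4.55; output_dir is an assumption).
training_args = TrainingArguments(
    output_dir="char-text-reversal",   # assumed name
    learning_rate=1e-4,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=1000,
)
```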

Training results

| Training Loss | Epoch | Step | Validation Loss | Char Accuracy | Sequence Accuracy | Edit Distance |
|:-------------:|:-----:|:----:|:---------------:|:-------------:|:-----------------:|:-------------:|
| 4.1717 | 1.0 | 79 | 3.7332 | 0.0319 | 0.0 | 131.1735 |
| 3.4932 | 2.0 | 158 | 3.2892 | 0.0011 | 0.0 | 129.146 |
| 3.1822 | 3.0 | 237 | 3.0756 | 0.0 | 0.0 | 125.971 |
| 3.0081 | 4.0 | 316 | 2.9370 | 0.0 | 0.0 | 122.952 |
| 2.8946 | 5.0 | 395 | 2.8457 | 0.0 | 0.0 | 122.085 |
| 2.8162 | 6.0 | 474 | 2.7805 | 0.0000 | 0.0 | 121.204 |
| 2.7578 | 7.0 | 553 | 2.7284 | 0.0000 | 0.0 | 120.8485 |
| 2.7107 | 8.0 | 632 | 2.6850 | 0.0 | 0.0 | 120.5575 |
| 2.6695 | 9.0 | 711 | 2.6455 | 0.0000 | 0.0 | 120.3835 |
| 2.632 | 10.0 | 790 | 2.6074 | 0.0000 | 0.0 | 120.0615 |
| 2.5971 | 11.0 | 869 | 2.5695 | 0.0001 | 0.0 | 117.5055 |
| 2.5649 | 12.0 | 948 | 2.5360 | 0.0002 | 0.0 | 108.9205 |
| 2.5353 | 13.0 | 1027 | 2.5029 | 0.0003 | 0.0 | 95.6955 |
| 2.506 | 14.0 | 1106 | 2.4734 | 0.0004 | 0.0 | 85.586 |
| 2.4807 | 15.0 | 1185 | 2.4449 | 0.0005 | 0.0 | 77.844 |
| 2.455 | 16.0 | 1264 | 2.4136 | 0.0011 | 0.0 | 73.5575 |
| 2.4185 | 17.0 | 1343 | 2.3630 | 0.0013 | 0.0 | 68.8565 |
| 2.371 | 18.0 | 1422 | 2.2994 | 0.0024 | 0.0 | 64.448 |
| 2.3213 | 19.0 | 1501 | 2.2370 | 0.0027 | 0.0 | 62.501 |
| 2.2707 | 20.0 | 1580 | 2.1751 | 0.0039 | 0.0 | 59.419 |
| 2.227 | 21.0 | 1659 | 2.1183 | 0.0037 | 0.0 | 58.6405 |
| 2.182 | 22.0 | 1738 | 2.0610 | 0.0043 | 0.0 | 56.788 |
| 2.1396 | 23.0 | 1817 | 2.0002 | 0.0044 | 0.0 | 55.4195 |
| 2.0969 | 24.0 | 1896 | 1.9433 | 0.0046 | 0.0 | 54.239 |
| 2.0581 | 25.0 | 1975 | 1.8935 | 0.0046 | 0.0 | 52.833 |
| 2.025 | 26.0 | 2054 | 1.8459 | 0.0037 | 0.0 | 51.9935 |
| 1.9885 | 27.0 | 2133 | 1.7941 | 0.0043 | 0.0 | 50.6845 |
| 1.9587 | 28.0 | 2212 | 1.7568 | 0.0042 | 0.0 | 49.62 |
| 1.93 | 29.0 | 2291 | 1.7101 | 0.0047 | 0.0 | 48.6285 |
| 1.8983 | 30.0 | 2370 | 1.6641 | 0.0050 | 0.0 | 47.612 |
| 1.8693 | 31.0 | 2449 | 1.6341 | 0.0054 | 0.0 | 46.9725 |
| 1.8421 | 32.0 | 2528 | 1.5895 | 0.0049 | 0.0 | 46.026 |
| 1.8157 | 33.0 | 2607 | 1.5549 | 0.0057 | 0.0 | 45.169 |
| 1.7952 | 34.0 | 2686 | 1.5340 | 0.0058 | 0.0 | 44.602 |
| 1.7736 | 35.0 | 2765 | 1.4917 | 0.0065 | 0.0 | 43.823 |
| 1.7483 | 36.0 | 2844 | 1.4561 | 0.0055 | 0.0 | 43.098 |
| 1.7218 | 37.0 | 2923 | 1.4206 | 0.0071 | 0.0 | 42.265 |
| 1.6995 | 38.0 | 3002 | 1.3885 | 0.0065 | 0.0 | 41.419 |
| 1.6819 | 39.0 | 3081 | 1.3714 | 0.0057 | 0.0 | 41.078 |
| 1.6641 | 40.0 | 3160 | 1.3450 | 0.0066 | 0.0 | 40.324 |
| 1.6437 | 41.0 | 3239 | 1.3164 | 0.0053 | 0.0 | 39.8805 |
| 1.6198 | 42.0 | 3318 | 1.2894 | 0.0050 | 0.0 | 39.559 |
| 1.6045 | 43.0 | 3397 | 1.2686 | 0.0060 | 0.0 | 39.1475 |
| 1.5891 | 44.0 | 3476 | 1.2373 | 0.0069 | 0.0 | 38.3675 |
| 1.5774 | 45.0 | 3555 | 1.2252 | 0.0058 | 0.0 | 38.3125 |
| 1.5608 | 46.0 | 3634 | 1.2069 | 0.0056 | 0.0 | 38.1185 |
| 1.5488 | 47.0 | 3713 | 1.1713 | 0.0062 | 0.0 | 37.791 |
| 1.5265 | 48.0 | 3792 | 1.1443 | 0.0073 | 0.0 | 38.148 |
| 1.5095 | 49.0 | 3871 | 1.1223 | 0.0060 | 0.0 | 38.268 |
| 1.4939 | 50.0 | 3950 | 1.0998 | 0.0066 | 0.0 | 38.621 |
| 1.4799 | 51.0 | 4029 | 1.0816 | 0.0065 | 0.0 | 38.583 |

Framework versions

  • Transformers 4.55.4
  • Pytorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4