XPU_GEC_t5_char_nepali

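This model is a fine-tuned version of a base model that this card does not name, trained on an unknown dataset. The card lists no summary metrics for the evaluation set; the last validation loss in the training log is 1.3049 at step 34000.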

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 16
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 64
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
num_epochs: 5
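
As a rough illustration of how these values relate, the sketch below recomputes the total train batch size from the per-device batch size and the gradient accumulation steps, and shows the general shape of a linear learning-rate schedule. The card does not state the number of devices, the warmup steps, or the total number of optimizer steps, so `NUM_DEVICES`, `warmup_steps=0`, and `ASSUMED_TOTAL_STEPS` are assumptions made only for this example. The function names are hypothetical helpers, not part of any library.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 0.001
TRAIN_BATCH_SIZE = 16
GRADIENT_ACCUMULATION_STEPS = 4

# Assumption: one device. This is consistent with 16 * 4 = 64,
# but the card does not state the device count.
NUM_DEVICES = 1

# Assumption: the last logged step stands in for the true total,
# which the card does not state.
ASSUMED_TOTAL_STEPS = 34_000


def effective_batch_size(per_device, accumulation_steps, num_devices=1):
    """Number of examples contributing to each optimizer update."""
    return per_device * accumulation_steps * num_devices


def linear_lr(step, total_steps, base_lr=LEARNING_RATE, warmup_steps=0):
    """Linear warmup to base_lr, then linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup value.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE,
                               GRADIENT_ACCUMULATION_STEPS,
                               NUM_DEVICES))
    for step in (0, 17_000, 34_000):
        print(step, linear_lr(step, ASSUMED_TOTAL_STEPS))
```

Under these assumptions the learning rate starts at 0.001 and falls steadily toward zero by the end of training, so later rows in the training log were produced at a much smaller step size than the early ones.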

Training Loss	Epoch	Step	Validation Loss
1.441	0.1463	1000	1.3825
1.3869	0.2926	2000	1.3707
1.4013	0.4389	3000	1.3782
1.4017	0.5852	4000	1.3901
1.4576	0.7316	5000	1.3772
1.385	0.8779	6000	1.3661
1.3838	1.0243	7000	1.3771
1.3816	1.1706	8000	1.3704
1.3717	1.3169	9000	1.3563
1.4436	1.4632	10000	1.3743
1.3712	1.6095	11000	1.3654
1.3618	1.7558	12000	1.3616
1.3572	1.9022	13000	1.3543
1.3633	2.0486	14000	1.3803
1.3562	2.1949	15000	1.3512
1.3506	2.3412	16000	1.3588
1.3506	2.4875	17000	1.3482
1.3442	2.6338	18000	1.3412
1.4396	2.7801	19000	1.3690
1.3469	2.9264	20000	1.3529
1.3424	3.0729	21000	1.3411
1.3401	3.2192	22000	1.3394
1.3395	3.3655	23000	1.3493
1.3281	3.5118	24000	1.3266
1.3318	3.6581	25000	1.3248
1.3233	3.8044	26000	1.3146
1.3183	3.9507	27000	1.3146
1.3197	4.0972	28000	1.3140
1.3158	4.2435	29000	1.3088
1.4365	4.3898	30000	1.3485
1.3164	4.5361	31000	1.3098
1.3095	4.6824	32000	1.3065
1.3089	4.8287	33000	1.3057
1.3065	4.9750	34000	1.3049
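
The step and epoch columns can be combined to estimate how many optimizer steps make up one epoch, and from that the approximate size of the training set. The sketch below does this with the first and last rows of the table and the total train batch size of 64 from the hyperparameter list. This is only an estimate: the card does not state the dataset size, and the helper names are hypothetical.

```python
# (step, epoch) pairs copied from the first and last rows of the table.
FIRST_ROW = (1000, 0.1463)
LAST_ROW = (34000, 4.9750)

# From the hyperparameter list: examples per optimizer step.
TOTAL_TRAIN_BATCH_SIZE = 64


def steps_per_epoch(step, epoch):
    """Estimate optimizer steps in one epoch from a single log row."""
    return step / epoch


def examples_per_epoch(step, epoch, batch_size=TOTAL_TRAIN_BATCH_SIZE):
    """Estimate training examples seen per epoch (an approximation;
    the last batch of an epoch may be smaller than batch_size)."""
    return steps_per_epoch(step, epoch) * batch_size


if __name__ == "__main__":
    for label, row in (("first row", FIRST_ROW), ("last row", LAST_ROW)):
        print(label,
              round(steps_per_epoch(*row)),
              round(examples_per_epoch(*row)))
```

Both rows give roughly 6,800 steps per epoch, which at 64 examples per step suggests a training set on the order of 437,000 examples. Treat this as a back-of-the-envelope figure, not a documented dataset size.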
