literal-support-gec

This model is a fine-tuned version of gotutiyan/gec-t5-large-clang8 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0248

Model description

More information needed

Intended uses & limitations

More information needed
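
Since usage details are not documented, here is a minimal inference sketch. It assumes the model is used like its T5-based parent, gotutiyan/gec-t5-large-clang8, i.e. the ungrammatical sentence is passed directly as the encoder input with no task prefix; the input format and the decoding settings are assumptions, not documented behavior.

```python
# Minimal inference sketch for grammatical error correction.
# Assumptions: no task prefix is needed (inherited from the T5 GEC parent model),
# and beam search is a reasonable decoding choice; neither is documented here.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "angelafearn/literal-support-gec"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "She go to school every days."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128, num_beams=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```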

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged TrainingArguments sketch follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 1
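
The sketch below reconstructs this configuration with the Seq2SeqTrainer API. The actual training data and preprocessing are not documented, so the two-sentence toy dataset is only a placeholder to keep the example self-contained.

```python
# Hedged reconstruction of the fine-tuning setup from the hyperparameters above.
# The real dataset, splits, and preprocessing are undocumented; everything marked
# "placeholder" below is an assumption, not the author's actual pipeline.
from datasets import Dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

base = "gotutiyan/gec-t5-large-clang8"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSeq2SeqLM.from_pretrained(base)

# Placeholder data: (ungrammatical source, corrected target) pairs.
toy = Dataset.from_dict({
    "source": ["She go to school every days.", "He have two dog."],
    "target": ["She goes to school every day.", "He has two dogs."],
})

def preprocess(batch):
    enc = tokenizer(batch["source"], truncation=True)
    enc["labels"] = tokenizer(text_target=batch["target"], truncation=True)["input_ids"]
    return enc

tokenized = toy.map(preprocess, batched=True, remove_columns=toy.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="literal-support-gec",
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=8,  # total train batch size: 8 * 8 = 64
    seed=42,
    optim="adamw_torch",            # AdamW with default betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=1,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    eval_dataset=tokenized,         # placeholder; the real eval split is undocumented
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```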

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 0.7081        | 0.0373 | 500   | 0.0491          |
| 0.4451        | 0.0745 | 1000  | 0.0393          |
| 0.3807        | 0.1118 | 1500  | 0.0361          |
| 0.3942        | 0.1491 | 2000  | 0.0366          |
| 0.3188        | 0.1864 | 2500  | 0.0331          |
| 0.2964        | 0.2236 | 3000  | 0.0318          |
| 0.3084        | 0.2609 | 3500  | 0.0317          |
| 0.2826        | 0.2982 | 4000  | 0.0309          |
| 0.2788        | 0.3354 | 4500  | 0.0309          |
| 0.2838        | 0.3727 | 5000  | 0.0301          |
| 0.2786        | 0.4100 | 5500  | 0.0291          |
| 0.2707        | 0.4473 | 6000  | 0.0283          |
| 0.2865        | 0.4845 | 6500  | 0.0289          |
| 0.2580        | 0.5218 | 7000  | 0.0278          |
| 0.2592        | 0.5591 | 7500  | 0.0270          |
| 0.2332        | 0.5964 | 8000  | 0.0273          |
| 0.2418        | 0.6336 | 8500  | 0.0269          |
| 0.2305        | 0.6709 | 9000  | 0.0264          |
| 0.2363        | 0.7082 | 9500  | 0.0261          |
| 0.2385        | 0.7454 | 10000 | 0.0259          |
| 0.2231        | 0.7827 | 10500 | 0.0256          |
| 0.2227        | 0.8200 | 11000 | 0.0254          |
| 0.2146        | 0.8573 | 11500 | 0.0253          |
| 0.2261        | 0.8945 | 12000 | 0.0252          |
| 0.2103        | 0.9318 | 12500 | 0.0251          |
| 0.2178        | 0.9691 | 13000 | 0.0248          |

Framework versions

  • Transformers 5.2.0
  • Pytorch 2.7.0+cu126
  • Datasets 4.6.1
  • Tokenizers 0.22.2
Model size

  • 0.8B params (F32, Safetensors)