constrative_keyphrases_v3

This model is a fine-tuned version of Salesforce/codet5p-220m; the fine-tuning dataset is not specified. It achieves the following results on the evaluation set:

  • Loss: 3.8532
  • ROUGE-1: 33.7078
  • ROUGE-2: 12.4091
  • ROUGE-L: 29.6775
  • ROUGE-Lsum: 29.7329
  • Gen Len: 13.1440
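
Since the card does not yet include a usage snippet, here is a minimal inference sketch with the standard transformers seq2seq API. The checkpoint id `HuyTran1301/constrative_keyphrases_v3` is taken from this repository's page, and the input string is a placeholder; generation settings are illustrative, not the ones used in evaluation.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Checkpoint id as listed on this repository's page.
model_id = "HuyTran1301/constrative_keyphrases_v3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Placeholder input; the base model (CodeT5+) is code-oriented.
text = "def binary_search(arr, target):\n    lo, hi = 0, len(arr) - 1\n    ..."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# Average generated length on the eval set is ~13 tokens, so a small
# max_new_tokens budget is sufficient.
outputs = model.generate(**inputs, max_new_tokens=32, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```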

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 128
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 8
  • label_smoothing_factor: 0.1
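
A sketch of how the hyperparameters above map onto `Seq2SeqTrainingArguments` in a standard transformers Trainer setup. The `output_dir` value is a placeholder, and `predict_with_generate` is an assumption (it is typical for ROUGE evaluation but not stated on the card); note the effective batch size of 128 comes from 4 per device × 32 accumulation steps.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: mirrors the hyperparameters listed on the card.
# output_dir is a placeholder; predict_with_generate is assumed.
args = Seq2SeqTrainingArguments(
    output_dir="constrative_keyphrases_v3",
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=32,  # effective train batch size: 4 * 32 = 128
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=8,
    label_smoothing_factor=0.1,
    predict_with_generate=True,
)
```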

Training results

| Training Loss | Epoch | Step  | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Gen Len |
|--------------:|------:|------:|----------------:|--------:|--------:|--------:|-----------:|--------:|
| 75.9631       | 1.0   | 1875  | 4.0069          | 30.2366 | 9.2022  | 26.2162 | 26.265     | 12.8157 |
| 68.1883       | 2.0   | 3750  | 3.9019          | 31.9774 | 11.071  | 28.1083 | 28.1585    | 13.0926 |
| 64.5175       | 3.0   | 5625  | 3.8668          | 32.5759 | 11.5353 | 28.6668 | 28.7146    | 12.9540 |
| 60.7772       | 4.0   | 7500  | 3.8535          | 33.3368 | 12.0544 | 29.2718 | 29.3209    | 13.1765 |
| 59.2185       | 5.0   | 9375  | 3.8452          | 33.4145 | 12.11   | 29.4074 | 29.463     | 12.8319 |
| 57.6426       | 6.0   | 11250 | 3.8529          | 33.4661 | 12.2255 | 29.4777 | 29.5318    | 13.1448 |
| 56.6323       | 7.0   | 13125 | 3.8465          | 33.6485 | 12.3814 | 29.6519 | 29.7071    | 13.1647 |
| 56.3368       | 8.0   | 15000 | 3.8532          | 33.7078 | 12.4091 | 29.6775 | 29.7329    | 13.1440 |
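
The ROUGE-1/ROUGE-2 columns above are unigram/bigram overlap F-scores between generated and reference keyphrases. As a self-contained illustration of the metric, here is a minimal ROUGE-N F1 sketch; the actual evaluation is typically computed with the rouge_score package, which additionally applies stemming and other normalization.

```python
from collections import Counter

def rouge_n_f1(prediction: str, reference: str, n: int = 1) -> float:
    """Minimal ROUGE-N F1: n-gram overlap between prediction and reference.
    Simplified sketch; real evaluations use the rouge_score package."""
    def ngrams(text: str, n: int) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    pred, ref = ngrams(prediction, n), ngrams(reference, n)
    overlap = sum((pred & ref).values())  # clipped n-gram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge_n_f1("binary search algorithm", "binary search"))
```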

Framework versions

  • Transformers 4.53.2
  • Pytorch 2.7.1+cu126
  • Datasets 4.0.0
  • Tokenizers 0.21.2
Safetensors

  • Model size: 0.2B params
  • Tensor type: F32