flan-t5-rouge-durga-q5-clean-2

This model is a fine-tuned version of google/flan-t5-base; the fine-tuning dataset is not documented in this card. It achieves the following results on the evaluation set:

  • Loss: 0.0035
  • Rouge1: 0.7335
  • Rouge2: 0.7000
  • RougeL: 0.7304
  • RougeLsum: 0.7329
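
Usage is not documented in this card, so the snippet below is only a minimal inference sketch: it assumes the checkpoint is loaded from this repository with the standard Transformers seq2seq API, and the example prompt is purely illustrative.

```python
# Minimal inference sketch (assumptions: this repository ID, an illustrative prompt).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "devagonal/flan-t5-rouge-durga-q5-clean-2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Who is Durga?", return_tensors="pt")  # illustrative question, not from the card
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```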

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch in code follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 3
  • eval_batch_size: 3
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 30
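
The hyperparameters above map onto Seq2SeqTrainingArguments roughly as sketched below. The output directory, evaluation strategy, and generation setting are assumptions for illustration; the actual training script is not included in this card.

```python
# Hedged sketch of training arguments mirroring the list above (Transformers 4.46-era API).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, Seq2SeqTrainingArguments

model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")

args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-rouge-durga-q5-clean-2",  # assumed output path
    learning_rate=3e-4,
    per_device_train_batch_size=3,
    per_device_eval_batch_size=3,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    eval_strategy="epoch",        # assumed: the results table logs metrics once per epoch
    predict_with_generate=True,   # assumed: needed to compute ROUGE on generated text
)

# A Seq2SeqTrainer would then be built with these args plus the (undocumented) datasets:
# trainer = Seq2SeqTrainer(model=model, args=args, train_dataset=..., eval_dataset=...,
#                          tokenizer=tokenizer, compute_metrics=...)
```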

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 1.3408 | 1.0  | 34   | 1.4618 | 0.2667 | 0.0665 | 0.2596 | 0.2602 |
| 0.8265 | 2.0  | 68   | 0.9113 | 0.3595 | 0.1570 | 0.3484 | 0.3491 |
| 0.6005 | 3.0  | 102  | 0.6349 | 0.3902 | 0.1965 | 0.3804 | 0.3836 |
| 0.7385 | 4.0  | 136  | 0.4371 | 0.4156 | 0.2511 | 0.4098 | 0.4121 |
| 0.4385 | 5.0  | 170  | 0.3191 | 0.4665 | 0.2901 | 0.4620 | 0.4636 |
| 0.5228 | 6.0  | 204  | 0.2193 | 0.4971 | 0.3658 | 0.4899 | 0.4932 |
| 0.7345 | 7.0  | 238  | 0.1589 | 0.5328 | 0.4149 | 0.5317 | 0.5341 |
| 0.6454 | 8.0  | 272  | 0.1229 | 0.5480 | 0.4471 | 0.5447 | 0.5496 |
| 0.1834 | 9.0  | 306  | 0.0956 | 0.5834 | 0.4832 | 0.5813 | 0.5832 |
| 0.1491 | 10.0 | 340  | 0.0741 | 0.5920 | 0.4950 | 0.5871 | 0.5894 |
| 0.224  | 11.0 | 374  | 0.0631 | 0.5901 | 0.5017 | 0.5887 | 0.5924 |
| 0.1733 | 12.0 | 408  | 0.0417 | 0.6601 | 0.5955 | 0.6572 | 0.6587 |
| 0.0927 | 13.0 | 442  | 0.0462 | 0.6558 | 0.5896 | 0.6527 | 0.6553 |
| 0.106  | 14.0 | 476  | 0.0296 | 0.6695 | 0.6164 | 0.6681 | 0.6715 |
| 0.1517 | 15.0 | 510  | 0.0223 | 0.7063 | 0.6618 | 0.7032 | 0.7036 |
| 0.0332 | 16.0 | 544  | 0.0249 | 0.7100 | 0.6666 | 0.7079 | 0.7073 |
| 0.1027 | 17.0 | 578  | 0.0179 | 0.7063 | 0.6558 | 0.7023 | 0.7026 |
| 0.0683 | 18.0 | 612  | 0.0131 | 0.7212 | 0.6782 | 0.7186 | 0.7202 |
| 0.003  | 19.0 | 646  | 0.0113 | 0.7287 | 0.6950 | 0.7256 | 0.7282 |
| 0.0083 | 20.0 | 680  | 0.0145 | 0.7123 | 0.6573 | 0.7105 | 0.7118 |
| 0.0005 | 21.0 | 714  | 0.0060 | 0.7197 | 0.6780 | 0.7169 | 0.7187 |
| 0.002  | 22.0 | 748  | 0.0132 | 0.7249 | 0.6897 | 0.7217 | 0.7246 |
| 0.0282 | 23.0 | 782  | 0.0034 | 0.7335 | 0.7000 | 0.7304 | 0.7329 |
| 0.0011 | 24.0 | 816  | 0.0042 | 0.7337 | 0.7008 | 0.7305 | 0.7321 |
| 0.012  | 25.0 | 850  | 0.0044 | 0.7335 | 0.7000 | 0.7304 | 0.7329 |
| 0.0083 | 26.0 | 884  | 0.0041 | 0.7334 | 0.6999 | 0.7302 | 0.7322 |
| 0.0172 | 27.0 | 918  | 0.0034 | 0.7280 | 0.6906 | 0.7250 | 0.7259 |
| 0.0125 | 28.0 | 952  | 0.0039 | 0.7279 | 0.6909 | 0.7253 | 0.7276 |
| 0.0162 | 29.0 | 986  | 0.0038 | 0.7335 | 0.7000 | 0.7304 | 0.7329 |
| 0.0172 | 30.0 | 1020 | 0.0035 | 0.7335 | 0.7000 | 0.7304 | 0.7329 |
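
The ROUGE columns above are in the format reported by the Hugging Face `evaluate` package. The function below is a hedged sketch of how such per-epoch metrics are commonly computed from decoded predictions and labels with a seq2seq trainer; it is not the exact code used to produce this table.

```python
# Hedged sketch of a ROUGE metric function (assumptions: `evaluate` plus the rouge_score
# package are installed, labels use -100 for ignored positions, base tokenizer is flan-t5-base).
import numpy as np
import evaluate
from transformers import AutoTokenizer

rouge = evaluate.load("rouge")
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")  # assumed tokenizer

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    # Replace -100 (ignored label positions) with the pad token id before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    scores = rouge.compute(predictions=decoded_preds, references=decoded_labels)
    # `scores` holds rouge1, rouge2, rougeL, rougeLsum as floats in [0, 1].
    return {k: round(v, 4) for k, v in scores.items()}
```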

Framework versions

  • Transformers 4.46.0
  • Pytorch 2.5.0+cu121
  • Datasets 3.0.2
  • Tokenizers 0.20.1