gs-GreBerta

This model is a fine-tuned version of bowphs/GreBerta on an unknown dataset. Its per-epoch results on the evaluation set are listed in the table below.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 128
eval_batch_size: 64
seed: 42
optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 0.06
num_epochs: 10
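
The learning-rate settings above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the training code: it assumes the warmup value 0.06 is a fraction of the total steps (a step count below 1 would have no effect), that the run was scheduled for the full 10 epochs at 15003 steps per epoch (the step count at epoch 1.0 in the results table), and that the schedule is the usual linear warmup followed by linear decay to zero. The names `lr_at`, `TOTAL_STEPS`, and `WARMUP_STEPS` are illustrative.

```python
import math

PEAK_LR = 2e-05          # learning_rate from the list above
STEPS_PER_EPOCH = 15003  # assumption: Step column value at epoch 1.0
NUM_EPOCHS = 10          # num_epochs from the list above
WARMUP_FRACTION = 0.06   # assumption: warmup value read as a fraction of total steps

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS
WARMUP_STEPS = math.ceil(WARMUP_FRACTION * TOTAL_STEPS)


def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer updates (linear warmup, linear decay)."""
    if step < WARMUP_STEPS:
        # Warmup phase: ramp linearly from 0 up to the peak rate.
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    # Decay phase: fall linearly from the peak rate to 0 at the final step.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for step in (0, WARMUP_STEPS, 60012, TOTAL_STEPS):
        print(f"step {step:>6}: lr = {lr_at(step):.3e}")
```

Under these assumptions, warmup would end during the first epoch, and every row of the results table would fall in the decay phase.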

Training Loss	Epoch	Step	Validation Loss	Bertscore Precision Top1	Bertscore Recall Top1	Bertscore F1 Top1	Bertscore Precision Top1 Mean	Bertscore Recall Top1 Mean	Bertscore F1 Top1 Mean	Bertscore Precision Top3	Bertscore Recall Top3	Bertscore F1 Top3	Bertscore Precision Top3 Mean	Bertscore Recall Top3 Mean	Bertscore F1 Top3 Mean	Bertscore Precision Top5	Bertscore Recall Top5	Bertscore F1 Top5	Bertscore Precision Top5 Mean	Bertscore Recall Top5 Mean	Bertscore F1 Top5 Mean	Bertscore Precision Top10	Bertscore Recall Top10	Bertscore F1 Top10	Bertscore Precision Top10 Mean	Bertscore Recall Top10 Mean	Bertscore F1 Top10 Mean
0.3250	1.0	15003	1.3429	66.2173	68.9140	67.5100	66.2173	68.9140	67.5100	71.9054	73.5031	72.6034	67.8763	70.1834	68.9820	73.9242	75.1918	74.4533	68.2901	70.5052	69.3483	76.2096	76.9568	76.4491	68.8665	70.8638	69.8182
0.8377	2.0	30006	1.2719	66.5523	68.0360	67.2582	66.5523	68.0360	67.2582	73.6488	74.1934	73.8518	68.2843	69.6391	68.9295	75.2507	75.4052	75.2488	68.9805	70.2429	69.5778	77.6857	77.1190	77.3142	69.5964	70.6907	70.1106
1.3269	3.0	45009	1.3021	66.9389	68.9427	67.8994	66.9389	68.9427	67.8994	72.3956	73.3803	72.8133	68.4080	70.1676	69.2527	74.3025	74.7306	74.4179	68.9186	70.5594	69.7031	77.2639	76.8102	76.9434	69.4559	70.8244	70.1060
1.3897	4.0	60012	1.4041	68.6956	69.6346	69.1245	68.6956	69.6346	69.1245	73.8315	74.0674	73.8812	69.9673	70.7775	70.3360	75.5542	75.4416	75.4154	69.9468	70.8770	70.3729	77.5750	77.0415	77.2341	69.6853	70.7694	70.1878

Safetensors
Model size: 0.1B params
Tensor type: F32
