polar3

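This model is a fine-tuned version of microsoft/deberta-v3-base on an unknown dataset. At the final logged evaluation (epoch 100, step 2100) it achieves the following results on the evaluation set:

Loss: 0.5006
Accuracy: 0.7411
F1: 0.7223
Precision: 0.7408
Recall: 0.7411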

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 128
eval_batch_size: 128
seed: 42
optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 100
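
The learning-rate settings above can be illustrated with a small sketch of a linear schedule with warmup. This is not the training code used for this model; it only reproduces the usual shape of such a schedule (linear ramp from 0 to the peak rate, then linear decay to 0). The total step count of 2100 is an assumption taken from the last row of the training log, and the names `lr_at`, `BASE_LR`, `TOTAL_STEPS`, and `WARMUP_STEPS` are hypothetical.

```python
# Minimal sketch of a linear learning-rate schedule with warmup.
# Assumption: 2100 total optimizer steps, read from the final row of the training log.
BASE_LR = 1e-4        # learning_rate
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio
TOTAL_STEPS = 2100    # assumed total number of optimizer steps

WARMUP_STEPS = round(WARMUP_RATIO * TOTAL_STEPS)  # 210 steps of warmup


def lr_at(step: int) -> float:
    """Return the learning rate at a given optimizer step."""
    if step < WARMUP_STEPS:
        # Warmup phase: ramp linearly from 0 up to BASE_LR.
        return BASE_LR * step / WARMUP_STEPS
    # Decay phase: fall linearly from BASE_LR to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for step in (0, 105, 210, 1155, 2100):
        print(step, lr_at(step))
```

Under these assumptions the peak rate of 0.0001 is reached at step 210, roughly the tenth epoch of the run.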

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1	Precision	Recall
0.6437	4.7619	100	0.6451	0.6357	0.4941	0.4041	0.6357
0.6315	9.5238	200	0.6163	0.6372	0.4976	0.7690	0.6372
0.6185	14.2857	300	0.5877	0.6558	0.5621	0.6656	0.6558
0.5981	19.0476	400	0.5718	0.6713	0.5907	0.6980	0.6713
0.5733	23.8095	500	0.5548	0.7023	0.6556	0.7159	0.7023
0.5597	28.5714	600	0.5411	0.7256	0.7070	0.7208	0.7256
0.5608	33.3333	700	0.5329	0.7287	0.7097	0.7250	0.7287
0.5588	38.0952	800	0.5269	0.7473	0.7445	0.7434	0.7473
0.5375	42.8571	900	0.5199	0.7380	0.7236	0.7334	0.7380
0.5352	47.6190	1000	0.5279	0.7054	0.6546	0.7296	0.7054
0.5461	52.3810	1100	0.5118	0.7395	0.7233	0.7365	0.7395
0.5356	57.1429	1200	0.5212	0.7116	0.6642	0.7364	0.7116
0.5313	61.9048	1300	0.5093	0.7597	0.7598	0.7599	0.7597
0.5327	66.6667	1400	0.5051	0.7411	0.7229	0.7402	0.7411
0.5403	71.4286	1500	0.5077	0.7333	0.7076	0.7382	0.7333
0.5456	76.1905	1600	0.5043	0.7349	0.7131	0.7357	0.7349
0.5342	80.9524	1700	0.5050	0.7318	0.7070	0.7348	0.7318
0.5307	85.7143	1800	0.5016	0.7364	0.7164	0.7359	0.7364
0.5192	90.4762	1900	0.4999	0.7457	0.7310	0.7430	0.7457
0.5404	95.2381	2000	0.5012	0.7349	0.7144	0.7343	0.7349
0.5241	100.0	2100	0.5006	0.7411	0.7223	0.7408	0.7411
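
A few things can be read off the log with simple arithmetic, sketched below. The rows are copied from the table (a subset, to keep the example short). The variable names are hypothetical. The training-set size estimate rests on assumptions the card does not confirm: a single device, no gradient accumulation, and the last partial batch being kept.

```python
# Rows copied from the training log: (epoch, step, validation_loss, accuracy, f1).
# Only a subset of the table is listed here.
LOG = [
    (61.9048, 1300, 0.5093, 0.7597, 0.7598),
    (90.4762, 1900, 0.4999, 0.7457, 0.7310),
    (95.2381, 2000, 0.5012, 0.7349, 0.7144),
    (100.0, 2100, 0.5006, 0.7411, 0.7223),
]
TRAIN_BATCH_SIZE = 128  # train_batch_size from the hyperparameter list

# Row with the highest F1 and row with the lowest validation loss.
best_f1 = max(LOG, key=lambda row: row[4])
lowest_loss = min(LOG, key=lambda row: row[2])

# Optimizer steps per epoch, from the final row (2100 steps over 100 epochs).
final_epoch, final_step = LOG[-1][0], LOG[-1][1]
steps_per_epoch = round(final_step / final_epoch)

# Assumption: one device, no gradient accumulation, last partial batch kept.
# Then steps_per_epoch = ceil(n_examples / batch_size), which bounds n_examples.
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print("best F1 at step", best_f1[1])
    print("lowest validation loss at step", lowest_loss[1])
    print("steps per epoch:", steps_per_epoch)
    print("estimated training examples:", min_examples, "to", max_examples)
```

The same two rows are also the best across the full table: step 1300 has the highest accuracy and F1, while step 1900 has the lowest validation loss, so the final checkpoint is not the best one by either criterion. Recall equals Accuracy in every row, which is what support-weighted averaging of per-class recall produces; that the metrics were weighted this way is an inference, not something the card states.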

Safetensors

Model size: 0.2B params

Tensor type: F32
