trainer_output

This model is a fine-tuned version of AlexeySorokin/ossbert-onc-unlab-from_multilingual-bs64-5epochs on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2180
  • Accuracy: 95.3662
  • Sentence accuracy: 61.1009
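The card does not define these metrics, but in token-classification trainers "Accuracy" is typically per-token accuracy and "Sentence accuracy" the fraction of sentences whose every token is labeled correctly. A minimal sketch under that assumption (the metric definitions and label names below are illustrative, not taken from this model):

```python
# Hypothetical sketch: per-token accuracy vs. whole-sentence accuracy.
# The metric definitions are assumptions, not confirmed by this model card.

def token_accuracy(gold, pred):
    """Fraction of individual tokens labeled correctly, in percent."""
    total = sum(len(sent) for sent in gold)
    correct = sum(g == p for gs, ps in zip(gold, pred) for g, p in zip(gs, ps))
    return 100.0 * correct / total

def sentence_accuracy(gold, pred):
    """Fraction of sentences with *every* token correct, in percent."""
    exact = sum(gs == ps for gs, ps in zip(gold, pred))
    return 100.0 * exact / len(gold)

gold = [["NOUN", "VERB"], ["DET", "NOUN", "PUNCT"]]
pred = [["NOUN", "VERB"], ["DET", "ADJ", "PUNCT"]]  # one token mislabeled
print(token_accuracy(gold, pred))     # 4 of 5 tokens correct -> 80.0
print(sentence_accuracy(gold, pred))  # 1 of 2 sentences fully correct -> 50.0
```

This also explains why sentence accuracy (61.10) is far below token accuracy (95.37): a single wrong token fails the whole sentence.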

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 5
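With `lr_scheduler_type: linear`, the learning rate decays linearly from its peak to zero over the total number of optimizer steps. A hedged pure-Python sketch of that schedule (the `warmup_steps=0` default and the total step count of ~2730 are assumptions inferred from the log, not stated in the card):

```python
# Sketch of a linear decay schedule in the style of Transformers'
# get_linear_schedule_with_warmup. warmup_steps=0 is an assumption,
# since the card lists no warmup hyperparameter.

def linear_lr(step, total_steps, peak_lr=5e-05, warmup_steps=0):
    """Learning rate at a given optimizer step."""
    if step < warmup_steps:
        # linear warmup from 0 to peak_lr
        return peak_lr * step / max(1, warmup_steps)
    # linear decay from peak_lr down to 0 at total_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)

total_steps = 2730  # 5 epochs * ~546 steps/epoch, inferred from the log below
print(linear_lr(0, total_steps))     # peak learning rate: 5e-05
print(linear_lr(1365, total_steps))  # halfway: 2.5e-05
print(linear_lr(2730, total_steps))  # end of training: 0.0
```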

Training results

| Training Loss | Epoch  | Step | Validation Loss | Accuracy | Sentence accuracy |
|:-------------:|:------:|:----:|:---------------:|:--------:|:-----------------:|
| No log        | 0.3663 | 200  | 0.7365          | 85.6227  | 26.4220           |
| No log        | 0.7326 | 400  | 0.4917          | 89.7247  | 35.4128           |
| 1.0818        | 1.0989 | 600  | 0.3859          | 91.3683  | 42.0183           |
| 1.0818        | 1.4652 | 800  | 0.3291          | 92.7312  | 48.0734           |
| 0.3537        | 1.8315 | 1000 | 0.3010          | 93.3191  | 50.4587           |
| 0.3537        | 2.1978 | 1200 | 0.2756          | 93.9738  | 52.6606           |
| 0.3537        | 2.5641 | 1400 | 0.2665          | 94.2678  | 54.6789           |
| 0.2244        | 2.9304 | 1600 | 0.2540          | 94.4949  | 56.5138           |
| 0.2244        | 3.2967 | 1800 | 0.2494          | 94.6686  | 55.0459           |
| 0.1549        | 3.6630 | 2000 | 0.2410          | 95.0695  | 60.1835           |
| 0.1549        | 4.0293 | 2200 | 0.2380          | 95.0027  | 59.6330           |
| 0.1549        | 4.3956 | 2400 | 0.2393          | 94.9759  | 58.3486           |
| 0.1165        | 4.7619 | 2600 | 0.2350          | 95.1897  | 59.8165           |
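The logged epoch/step pairs let one back out the approximate training-set size, assuming the listed train_batch_size of 8 and no gradient accumulation (the latter is not stated in the card):

```python
# Infer steps per epoch and dataset size from the trainer log above.
# Assumes train_batch_size=8 and no gradient accumulation (an assumption).

steps, epoch = 2600, 4.7619             # last logged row of the table
steps_per_epoch = round(steps / epoch)  # ~546 optimizer steps per epoch
train_examples = steps_per_epoch * 8    # batch size 8 -> ~4368 examples
total_steps = steps_per_epoch * 5       # num_epochs=5 -> ~2730 total steps
print(steps_per_epoch, train_examples, total_steps)  # 546 4368 2730
```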

Framework versions

  • Transformers 4.57.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2