trainer_output

This model is a fine-tuned version of AlexeySorokin/ossbert-onc-unlab-from_multilingual-bs64-5epochs on an unknown dataset. After the final epoch (25), it achieves the following results on the evaluation set: validation loss 0.3225, accuracy 96.0101, sentence accuracy 60.3670.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 25
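
As a rough sketch, the values above could be passed to `transformers.TrainingArguments` as follows. The mapping from the card's names to argument names (for example `train_batch_size` to `per_device_train_batch_size`) assumes single-device training without gradient accumulation, and the `output_dir` value and the `build_training_args` helper are hypothetical, not taken from the original training script.

```python
# Hyperparameters copied from the list above. The keys are
# transformers.TrainingArguments parameter names; the mapping assumes
# one device and no gradient accumulation (an assumption, not stated in the card).
hyperparams = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 25,
}


def build_training_args(output_dir="trainer_output"):
    """Hypothetical helper: build TrainingArguments from the dict above.

    The import is local so the dict can be inspected without
    transformers installed. output_dir is an illustrative default.
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **hyperparams)


if __name__ == "__main__":
    for name, value in hyperparams.items():
        print(f"{name}: {value}")
```

The fused AdamW optimizer generally expects a CUDA device, so `build_training_args` may need `optim="adamw_torch"` on CPU-only machines.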

Training Loss	Epoch	Step	Validation Loss	Accuracy	Sentence accuracy
1.0324	1.0	546	0.3769	91.7684	42.9358
0.3326	2.0	1092	0.2562	94.0969	51.9266
0.2101	3.0	1638	0.2329	94.6004	54.1284
0.1504	4.0	2184	0.2218	95.0661	56.3303
0.1165	5.0	2730	0.2194	95.1794	57.0642
0.0851	6.0	3276	0.2356	95.0535	56.8807
0.0652	7.0	3822	0.2388	95.4311	58.7156
0.0489	8.0	4368	0.2352	95.5192	59.0826
0.0377	9.0	4914	0.2456	95.5947	59.0826
0.0281	10.0	5460	0.2531	95.7961	60.9174
0.0181	11.0	6006	0.2716	95.5192	59.2661
0.0151	12.0	6552	0.2742	95.5570	60.0
0.0109	13.0	7098	0.2793	95.8213	60.5505
0.008	14.0	7644	0.2858	95.8464	59.4495
0.0072	15.0	8190	0.3016	95.8087	58.7156
0.0055	16.0	8736	0.2972	95.8213	58.5321
0.0055	17.0	9282	0.3029	96.0227	60.9174
0.0037	18.0	9828	0.3117	95.8842	60.0
0.0043	19.0	10374	0.3085	95.8842	60.0
0.0027	20.0	10920	0.3232	95.9597	61.4679
0.0024	21.0	11466	0.3169	95.9723	60.7339
0.0017	22.0	12012	0.3251	95.9471	60.7339
0.0012	23.0	12558	0.3221	96.0352	60.9174
0.0015	24.0	13104	0.3222	95.9597	60.3670
0.0013	25.0	13650	0.3225	96.0101	60.3670
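
The table can also be read for its best epochs rather than only the last one. The snippet below is a small illustration that copies the evaluation columns from the table and picks the epoch with the lowest validation loss and the highest accuracy and sentence accuracy; the variable names are illustrative only.

```python
# (epoch, validation_loss, accuracy, sentence_accuracy), copied from the table above.
rows = [
    (1, 0.3769, 91.7684, 42.9358),
    (2, 0.2562, 94.0969, 51.9266),
    (3, 0.2329, 94.6004, 54.1284),
    (4, 0.2218, 95.0661, 56.3303),
    (5, 0.2194, 95.1794, 57.0642),
    (6, 0.2356, 95.0535, 56.8807),
    (7, 0.2388, 95.4311, 58.7156),
    (8, 0.2352, 95.5192, 59.0826),
    (9, 0.2456, 95.5947, 59.0826),
    (10, 0.2531, 95.7961, 60.9174),
    (11, 0.2716, 95.5192, 59.2661),
    (12, 0.2742, 95.5570, 60.0),
    (13, 0.2793, 95.8213, 60.5505),
    (14, 0.2858, 95.8464, 59.4495),
    (15, 0.3016, 95.8087, 58.7156),
    (16, 0.2972, 95.8213, 58.5321),
    (17, 0.3029, 96.0227, 60.9174),
    (18, 0.3117, 95.8842, 60.0),
    (19, 0.3085, 95.8842, 60.0),
    (20, 0.3232, 95.9597, 61.4679),
    (21, 0.3169, 95.9723, 60.7339),
    (22, 0.3251, 95.9471, 60.7339),
    (23, 0.3221, 96.0352, 60.9174),
    (24, 0.3222, 95.9597, 60.3670),
    (25, 0.3225, 96.0101, 60.3670),
]

# Pick the best row for each metric.
best_loss = min(rows, key=lambda r: r[1])
best_accuracy = max(rows, key=lambda r: r[2])
best_sentence_accuracy = max(rows, key=lambda r: r[3])

print("lowest validation loss:", best_loss[1], "at epoch", best_loss[0])
print("highest accuracy:", best_accuracy[2], "at epoch", best_accuracy[0])
print("highest sentence accuracy:", best_sentence_accuracy[3],
      "at epoch", best_sentence_accuracy[0])
```

Validation loss is lowest at epoch 5 and rises afterwards while training loss keeps falling, whereas both accuracy metrics peak much later (epochs 23 and 20). Which checkpoint counts as best therefore depends on the metric chosen for selection.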

Model size: 0.2B params (Safetensors)

Tensor type: F32