roberta-large

This model was trained from scratch on an unspecified dataset. It achieves the following results on the evaluation set at the end of training (epoch 48, step 960):

Loss: 0.0756
Precision: 0.9480
Recall: 0.9449
F1: 0.9464
Accuracy: 0.9905
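
As a quick consistency check, the reported F1 can be recomputed from the reported precision and recall. This is a minimal sketch that assumes the card's F1 is the harmonic mean of those two overall figures (the card does not state how the metrics were averaged); the variable names are illustrative.

```python
# Values copied from the evaluation results above.
precision = 0.9480
recall = 0.9449
reported_f1 = 0.9464

# Harmonic mean of precision and recall (assumed definition of the card's F1).
f1 = 2 * precision * recall / (precision + recall)

print(f"recomputed F1: {f1:.4f}, reported F1: {reported_f1:.4f}")
```

Under that assumption the recomputed value agrees with the reported one to four decimal places.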

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
optimizer: adamw_torch_fused (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 48
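
The hyperparameters above map fairly directly onto keyword arguments of `transformers.TrainingArguments`. The sketch below is one plausible reconstruction, not the author's actual training script: `output_dir` is a hypothetical path, `eval_strategy="epoch"` is inferred from the one-row-per-epoch results table, and a single device with no gradient accumulation is assumed. It also includes a small pure-Python version of a linear warmup-then-decay schedule, using the 960 total steps shown in the training results.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-05
NUM_EPOCHS = 48
WARMUP_RATIO = 0.1
TOTAL_STEPS = 960  # last step in the training results table (20 steps per epoch)

# Keyword arguments for transformers.TrainingArguments.
training_kwargs = dict(
    output_dir="roberta-large-run",  # hypothetical path, not from the card
    learning_rate=LEARNING_RATE,
    per_device_train_batch_size=4,   # assumes a single device
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    warmup_ratio=WARMUP_RATIO,
    num_train_epochs=NUM_EPOCHS,
    eval_strategy="epoch",           # assumption inferred from the results table
)


def build_training_arguments():
    """Build TrainingArguments; requires the transformers package."""
    from transformers import TrainingArguments
    return TrainingArguments(**training_kwargs)


def linear_lr(step, total_steps=TOTAL_STEPS, peak=LEARNING_RATE,
              warmup_ratio=WARMUP_RATIO):
    """Learning rate at `step`: linear warmup to `peak`, then linear decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak * max(0.0, remaining / max(1, total_steps - warmup_steps))


for step in (0, 48, 96, 528, 960):
    print(step, linear_lr(step))
```

With a warmup ratio of 0.1 over 960 steps, the warmup lasts 96 steps, so the learning rate peaks at 5e-05 near the end of epoch 5 and then decays linearly to zero at step 960.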

Training results

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Accuracy
No log	1.0	20	0.8473	0.0	0.0	0.0	0.7693
No log	2.0	40	0.3125	0.5063	0.4538	0.4786	0.9131
No log	3.0	60	0.1283	0.8118	0.8460	0.8286	0.9699
No log	4.0	80	0.0849	0.8241	0.8655	0.8443	0.9791
No log	5.0	100	0.0820	0.8208	0.8833	0.8509	0.9768
No log	6.0	120	0.0784	0.8640	0.9060	0.8845	0.9814
No log	7.0	140	0.0699	0.9290	0.9125	0.9207	0.9862
No log	8.0	160	0.0668	0.8835	0.9222	0.9025	0.9853
No log	9.0	180	0.0492	0.9208	0.9417	0.9311	0.9893
No log	10.0	200	0.0773	0.9104	0.9222	0.9163	0.9859
No log	11.0	220	0.0753	0.8771	0.9368	0.9060	0.9828
No log	12.0	240	0.0710	0.9179	0.9238	0.9208	0.9874
No log	13.0	260	0.0679	0.9028	0.9335	0.9179	0.9859
No log	14.0	280	0.0751	0.9175	0.9368	0.9270	0.9882
No log	15.0	300	0.0661	0.9146	0.9368	0.9255	0.9883
No log	16.0	320	0.0672	0.9368	0.9368	0.9368	0.9895
No log	17.0	340	0.0601	0.9211	0.9465	0.9337	0.9899
No log	18.0	360	0.0693	0.9441	0.9303	0.9371	0.9883
No log	19.0	380	0.0681	0.9255	0.9465	0.9359	0.9884
No log	20.0	400	0.0790	0.9350	0.9319	0.9334	0.9881
No log	21.0	420	0.0671	0.9383	0.9368	0.9376	0.9885
No log	22.0	440	0.0657	0.9327	0.9433	0.9380	0.9893
No log	23.0	460	0.0684	0.9370	0.9400	0.9385	0.9892
No log	24.0	480	0.0669	0.9226	0.9465	0.9344	0.9886
0.117	25.0	500	0.0691	0.9329	0.9465	0.9397	0.9887
0.117	26.0	520	0.0746	0.9493	0.9400	0.9446	0.9899
0.117	27.0	540	0.0749	0.9542	0.9465	0.9504	0.9900
0.117	28.0	560	0.0730	0.9435	0.9465	0.9450	0.9895
0.117	29.0	580	0.0697	0.9653	0.9465	0.9558	0.9906
0.117	30.0	600	0.0803	0.9554	0.9368	0.9460	0.9900
0.117	31.0	620	0.0838	0.9507	0.9384	0.9445	0.9895
0.117	32.0	640	0.0851	0.9445	0.9384	0.9415	0.9898
0.117	33.0	660	0.0783	0.9403	0.9449	0.9426	0.9892
0.117	34.0	680	0.0808	0.9372	0.9433	0.9402	0.9891
0.117	35.0	700	0.0823	0.9448	0.9433	0.9440	0.9898
0.117	36.0	720	0.0779	0.9511	0.9465	0.9488	0.9906
0.117	37.0	740	0.0751	0.9543	0.9481	0.9512	0.9908
0.117	38.0	760	0.0690	0.9514	0.9514	0.9514	0.9906
0.117	39.0	780	0.0710	0.9511	0.9465	0.9488	0.9906
0.117	40.0	800	0.0714	0.9495	0.9449	0.9472	0.9906
0.117	41.0	820	0.0738	0.9525	0.9433	0.9479	0.9908
0.117	42.0	840	0.0740	0.9480	0.9449	0.9464	0.9906
0.117	43.0	860	0.0749	0.9480	0.9449	0.9464	0.9906
0.117	44.0	880	0.0756	0.9526	0.9449	0.9487	0.9909
0.117	45.0	900	0.0752	0.9511	0.9465	0.9488	0.9908
0.117	46.0	920	0.0754	0.9480	0.9449	0.9464	0.9905
0.117	47.0	940	0.0755	0.9480	0.9449	0.9464	0.9905
0.117	48.0	960	0.0756	0.9480	0.9449	0.9464	0.9905
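
The table can be used to compare the final checkpoint with the best-scoring epochs. The sketch below copies the Validation Loss and F1 columns from the table and locates the epoch with the lowest validation loss and the epoch with the highest F1. It only summarizes the reported numbers; whether intermediate checkpoints were saved is not stated in the card.

```python
# Validation loss and F1 per epoch, copied from the table above (epochs 1 to 48).
val_loss = [
    0.8473, 0.3125, 0.1283, 0.0849, 0.0820, 0.0784, 0.0699, 0.0668,
    0.0492, 0.0773, 0.0753, 0.0710, 0.0679, 0.0751, 0.0661, 0.0672,
    0.0601, 0.0693, 0.0681, 0.0790, 0.0671, 0.0657, 0.0684, 0.0669,
    0.0691, 0.0746, 0.0749, 0.0730, 0.0697, 0.0803, 0.0838, 0.0851,
    0.0783, 0.0808, 0.0823, 0.0779, 0.0751, 0.0690, 0.0710, 0.0714,
    0.0738, 0.0740, 0.0749, 0.0756, 0.0752, 0.0754, 0.0755, 0.0756,
]
f1 = [
    0.0, 0.4786, 0.8286, 0.8443, 0.8509, 0.8845, 0.9207, 0.9025,
    0.9311, 0.9163, 0.9060, 0.9208, 0.9179, 0.9270, 0.9255, 0.9368,
    0.9337, 0.9371, 0.9359, 0.9334, 0.9376, 0.9380, 0.9385, 0.9344,
    0.9397, 0.9446, 0.9504, 0.9450, 0.9558, 0.9460, 0.9445, 0.9415,
    0.9426, 0.9402, 0.9440, 0.9488, 0.9512, 0.9514, 0.9488, 0.9472,
    0.9479, 0.9464, 0.9464, 0.9487, 0.9488, 0.9464, 0.9464, 0.9464,
]

epochs = range(len(f1))

# Epoch numbers are 1-based in the table, so add 1 to the list index.
best_loss_epoch = min(epochs, key=val_loss.__getitem__) + 1
best_f1_epoch = max(epochs, key=f1.__getitem__) + 1

# Gap between the best F1 in the table and the final-epoch F1.
f1_gap = f1[best_f1_epoch - 1] - f1[-1]

print(f"lowest validation loss: epoch {best_loss_epoch} ({val_loss[best_loss_epoch - 1]})")
print(f"highest F1: epoch {best_f1_epoch} ({f1[best_f1_epoch - 1]})")
print(f"F1 gap to final epoch: {f1_gap:.4f}")
```

In this table the lowest validation loss (0.0492) occurs at epoch 9 and the highest F1 (0.9558) at epoch 29, while the headline results correspond to epoch 48. Validation loss and F1 do not peak at the same epoch, so the choice of selection metric matters if an earlier checkpoint is preferred.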

Framework versions

Transformers 4.57.0
Pytorch 2.8.0+cu128
Datasets 4.1.1
Tokenizers 0.22.1

Safetensors

Model size: 0.4B params
Tensor type: F32