a4f4226f4efdccf6944a4e9133911bdb

This model is a fine-tuned version of FacebookAI/xlm-roberta-large-finetuned-conll02-dutch on the stsb subset of the nyu-mll/glue dataset. It achieves the following results on the evaluation set:

  • Loss: 2.4263
  • Data Size: 1.0
  • Epoch Runtime: 33.0967
  • MSE: 2.4271
  • MAE: 1.3045
  • R2: -0.0857
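
A negative R2 means the model explains less variance than a constant predictor at the label mean. As a reference point, a minimal sketch of how these three metrics are computed (plain NumPy; this is not the card's actual evaluation code):

```python
import numpy as np

def regression_metrics(preds, labels):
    """Compute MSE, MAE, and R2 for a regression task such as STS-B."""
    preds = np.asarray(preds, dtype=float)
    labels = np.asarray(labels, dtype=float)
    mse = float(np.mean((preds - labels) ** 2))
    mae = float(np.mean(np.abs(preds - labels)))
    ss_res = float(np.sum((labels - preds) ** 2))          # residual sum of squares
    ss_tot = float(np.sum((labels - labels.mean()) ** 2))  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"mse": mse, "mae": mae, "r2": r2}

# Predicting the label mean gives R2 = 0; anything worse goes negative.
m = regression_metrics([0.0, 0.0, 0.0, 0.0], [1.0, 2.0, 3.0, 4.0])
```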

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
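
The total batch sizes follow from the per-device settings and the device count; a small sketch of the arithmetic (the gradient-accumulation factor is an assumption, since it is not listed in the card and the Trainer default is 1):

```python
# Effective batch size under multi-GPU data parallelism.
per_device_batch_size = 8  # train_batch_size / eval_batch_size above
num_devices = 4            # distributed_type: multi-GPU
grad_accum_steps = 1       # assumption: not listed; HF Trainer default

total_train_batch_size = per_device_batch_size * num_devices * grad_accum_steps
print(total_train_batch_size)  # 32, matching total_train_batch_size above
```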

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | MSE | MAE | R2 |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:------:|:------:|:-------:|
| No log | 0 | 0 | 6.4701 | 0 | 2.9385 | 6.4713 | 2.1186 | -1.8948 |
| No log | 1 | 179 | 3.6268 | 0.0078 | 3.5081 | 3.6273 | 1.5354 | -0.6226 |
| No log | 2 | 358 | 4.5445 | 0.0156 | 3.9286 | 4.5451 | 1.7311 | -1.0332 |
| No log | 3 | 537 | 5.0040 | 0.0312 | 5.0996 | 5.0050 | 1.7878 | -1.2389 |
| No log | 4 | 716 | 4.0404 | 0.0625 | 7.0456 | 4.0415 | 1.6692 | -0.8079 |
| No log | 5 | 895 | 2.6657 | 0.125 | 9.1655 | 2.6664 | 1.3423 | -0.1928 |
| 0.177 | 6 | 1074 | 2.6657 | 0.25 | 13.3445 | 2.6664 | 1.3423 | -0.1928 |
| 2.2141 | 7 | 1253 | 2.4129 | 0.5 | 20.8253 | 2.4136 | 1.3029 | -0.0797 |
| 2.2116 | 8.0 | 1432 | 2.4547 | 1.0 | 35.0898 | 2.4554 | 1.3095 | -0.0984 |
| 2.232 | 9.0 | 1611 | 2.3530 | 1.0 | 34.7684 | 2.3537 | 1.2955 | -0.0529 |
| 2.2272 | 10.0 | 1790 | 2.3054 | 1.0 | 34.4248 | 2.3061 | 1.2888 | -0.0316 |
| 2.1378 | 11.0 | 1969 | 2.3530 | 1.0 | 33.1688 | 2.3538 | 1.2955 | -0.0529 |
| 2.0875 | 12.0 | 2148 | 2.6257 | 1.0 | 34.4306 | 2.6264 | 1.3354 | -0.1749 |
| 2.1826 | 13.0 | 2327 | 2.2503 | 1.0 | 34.6960 | 2.2511 | 1.2867 | -0.0070 |
| 2.2365 | 14.0 | 2506 | 2.2826 | 1.0 | 33.9542 | 2.2834 | 1.2866 | -0.0215 |
| 2.2183 | 15.0 | 2685 | 2.7976 | 1.0 | 34.8357 | 2.7983 | 1.3682 | -0.2518 |
| 2.2284 | 16.0 | 2864 | 2.4264 | 1.0 | 33.1563 | 2.4271 | 1.3045 | -0.0857 |
| 2.1293 | 17.0 | 3043 | 2.4263 | 1.0 | 33.0967 | 2.4271 | 1.3045 | -0.0857 |

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.3.0
  • Tokenizers 0.22.1