0887a246b6fa3229d6abce9c7f0369a6

This model is a fine-tuned version of FacebookAI/xlm-roberta-large-finetuned-conll03-german on the stsb subset of the nyu-mll/glue dataset. Its per-epoch results on the evaluation set are listed in the results table below.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: constant
num_epochs: 50
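
The batch-size figures above are related by simple arithmetic, sketched below. The gradient accumulation factor is not listed in this card, so the value of 1 is an assumption; the step count of 179 is taken from the epoch 1 row of the results table.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 8   # per-device training batch size
num_devices = 4        # GPUs used for multi-GPU training

# Assumption: the card does not list gradient accumulation, so 1 is used here.
gradient_accumulation_steps = 1

# Effective batch size per optimizer step across all devices.
total_train_batch_size = (
    train_batch_size * num_devices * gradient_accumulation_steps
)
print(total_train_batch_size)  # 32, matching total_train_batch_size above

# The results table reports 179 optimizer steps per epoch.
steps_per_epoch = 179
examples_per_epoch = steps_per_epoch * total_train_batch_size
print(examples_per_epoch)  # 5728 training examples consumed per epoch
```

With a constant learning-rate schedule, every one of these steps uses the same learning rate of 5e-05.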

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Mse	Mae	R2
No log	0	0	7.6089	0	2.8981	7.6101	2.3264	-2.4043
No log	1	179	3.6230	0.0078	3.4540	3.6237	1.5317	-0.6210
No log	2	358	4.2873	0.0156	3.8612	4.2878	1.6838	-0.9181
No log	3	537	2.3153	0.0312	5.1330	2.3161	1.2865	-0.0361
No log	4	716	3.3932	0.0625	6.5927	3.3943	1.5444	-0.5184
No log	5	895	2.5877	0.125	9.2624	2.5884	1.3302	-0.1579
0.1469	6	1074	3.0851	0.25	12.8637	3.0857	1.4242	-0.3804
2.1914	7	1253	2.3641	0.5	20.2438	2.3649	1.2969	-0.0579
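
The Mse, Mae, and R2 columns can be read with the help of the minimal sketch below. It assumes the standard definitions of these metrics (R2 as one minus the ratio of the residual sum of squares to the total sum of squares); the card does not state which implementation produced them, and `regression_metrics` is a hypothetical helper written for illustration. Under that definition, a negative R2 means the predictions fit worse than always predicting the mean label.

```python
def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, and R2 using the standard textbook definitions."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "mse": ss_res / n,
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }

# Toy labels on the 0-5 similarity scale (illustrative, not from the dataset).
labels = [0.0, 2.5, 5.0]

# Always predicting the mean label gives R2 = 0.
baseline = regression_metrics(labels, [2.5, 2.5, 2.5])

# Predictions that are worse than the mean give a negative R2.
reversed_preds = regression_metrics(labels, [5.0, 2.5, 0.0])
print(baseline["r2"], reversed_preds["r2"])  # 0.0 -3.0

# Consistency check on the table: R2 = 1 - MSE / Var(labels) implies
# Var(labels) = MSE / (1 - R2), which should be the same for every row.
rows = [(7.6101, -2.4043), (2.3161, -0.0361), (2.3649, -0.0579)]
implied_variances = [mse / (1.0 - r2) for mse, r2 in rows]
print([round(v, 3) for v in implied_variances])
```

The implied label variance comes out at roughly 2.235 for each of these rows, which is consistent with the table using this definition of R2.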

Safetensors
Model size: 0.6B params
Tensor type: F32