populism_xlmr_resumed

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set (a minimal loading sketch follows the metrics):

  • Accuracy: 0.7370
  • Loss: 1.2386
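
The card does not document the task head. Given the XLM-R-style name and the accuracy/loss pair reported above, the sketch below assumes a masked-language-modeling (fill-mask) checkpoint; the bare model id and the example sentence are placeholders, not from the card.

```python
# A minimal loading sketch, assuming a fill-mask head and an XLM-R-style
# tokenizer (mask token "<mask>"). "populism_xlmr_resumed" stands in for the
# full hub id, e.g. "<user>/populism_xlmr_resumed".
from transformers import pipeline

fill = pipeline("fill-mask", model="populism_xlmr_resumed")

# Print the top mask-fill candidates with their scores.
for candidate in fill("The people are tired of the <mask> elite."):
    print(candidate["token_str"], round(candidate["score"], 4))
```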

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged TrainingArguments sketch follows the list):

  • learning_rate: 0.001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 256
  • total_eval_batch_size: 64
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 3.0
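
For orientation, here is a sketch of how the listed values map onto transformers.TrainingArguments. This is not the authors' training script; output_dir and the launch command are assumptions. Note that the effective train batch size works out to 16 per device × 4 devices × 4 accumulation steps = 256, matching total_train_batch_size above.

```python
# Sketch only: maps the hyperparameters listed above onto TrainingArguments.
# Everything not in the list (output_dir, launch command) is a placeholder.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="populism_xlmr_resumed",  # placeholder
    learning_rate=1e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,       # 16 * 4 GPUs * 4 steps = 256 effective
    seed=42,
    num_train_epochs=3.0,
    lr_scheduler_type="linear",
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)

# distributed_type: multi-GPU with num_devices: 4 would correspond to a launch
# such as `torchrun --nproc_per_node=4 train.py` or `accelerate launch train.py`.
```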

Training results

| Training Loss | Epoch | Step | Accuracy | Validation Loss |
|---------------|-------|------|----------|-----------------|
| 1.5899 | 0.0161 | 1000 | 0.7136 | 1.3903 |
| 1.5196 | 0.0322 | 2000 | 0.7175 | 1.3585 |
| 1.4965 | 0.0483 | 3000 | 0.7196 | 1.3465 |
| 1.4825 | 0.0643 | 4000 | 0.7211 | 1.3382 |
| 1.4757 | 0.0804 | 5000 | 0.7219 | 1.3314 |
| 1.4743 | 0.0965 | 6000 | 0.7225 | 1.3285 |
| 1.4664 | 0.1126 | 7000 | 0.7229 | 1.3244 |
| 1.4611 | 0.1287 | 8000 | 0.7236 | 1.3189 |
| 1.4573 | 0.1448 | 9000 | 0.7243 | 1.3152 |
| 1.4479 | 0.1609 | 10000 | 0.7251 | 1.3094 |
| 1.4496 | 0.1769 | 11000 | 0.7254 | 1.3096 |
| 1.4468 | 0.1930 | 12000 | 0.7260 | 1.3062 |
| 1.4396 | 0.2091 | 13000 | 0.7263 | 1.3037 |
| 1.4380 | 0.2252 | 14000 | 0.7267 | 1.2980 |
| 1.4340 | 0.2413 | 15000 | 0.7270 | 1.2983 |
| 1.4314 | 0.2574 | 16000 | 0.7276 | 1.2940 |
| 1.4325 | 0.2735 | 17000 | 0.7280 | 1.2923 |
| 1.4239 | 0.2896 | 18000 | 0.7282 | 1.2936 |
| 1.4228 | 0.3056 | 19000 | 0.7287 | 1.2861 |
| 1.4232 | 0.3217 | 20000 | 0.7294 | 1.2822 |
| 1.4160 | 0.3378 | 21000 | 0.7297 | 1.2822 |
| 1.4133 | 0.3539 | 22000 | 0.7300 | 1.2776 |
| 1.4178 | 0.3700 | 23000 | 0.7301 | 1.2800 |
| 1.4103 | 0.3861 | 24000 | 0.7307 | 1.2770 |
| 1.4053 | 0.4022 | 25000 | 0.7312 | 1.2719 |
| 1.4020 | 0.4182 | 26000 | 0.7315 | 1.2718 |
| 1.4012 | 0.4343 | 27000 | 0.7316 | 1.2699 |
| 1.3982 | 0.4504 | 28000 | 0.7321 | 1.2678 |
| 1.3952 | 0.4665 | 29000 | 0.7322 | 1.2671 |
| 1.3961 | 0.4826 | 30000 | 0.7328 | 1.2627 |
| 1.3927 | 0.4987 | 31000 | 0.7330 | 1.2628 |
| 1.3925 | 0.5148 | 32000 | 0.7335 | 1.2579 |
| 1.3834 | 0.5308 | 33000 | 0.7336 | 1.2591 |
| 1.3821 | 0.5469 | 34000 | 0.7343 | 1.2572 |
| 1.3821 | 0.5630 | 35000 | 0.7342 | 1.2531 |
| 1.3834 | 0.5791 | 36000 | 0.7345 | 1.2525 |
| 1.3854 | 0.5952 | 37000 | 0.7348 | 1.2507 |
| 1.3788 | 0.6113 | 38000 | 0.7350 | 1.2494 |
| 1.3754 | 0.6274 | 39000 | 0.7355 | 1.2489 |
| 1.3750 | 0.6435 | 40000 | 0.7358 | 1.2449 |
| 1.3738 | 0.6595 | 41000 | 0.7364 | 1.2435 |
| 1.3728 | 0.6756 | 42000 | 0.7362 | 1.2425 |
| 1.3675 | 0.6917 | 43000 | 0.7367 | 1.2422 |
| 1.3670 | 0.7078 | 44000 | 0.7366 | 1.2406 |
| 1.3647 | 0.7239 | 45000 | 0.7372 | 1.2378 |
| 1.3624 | 0.7400 | 46000 | 0.7370 | 1.2386 |

Framework versions

  • Transformers 4.46.3
  • Pytorch 2.4.0+cu121
  • Datasets 3.1.0
  • Tokenizers 0.20.3
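
A small check that a local environment matches the versions above; a convenience sketch, not part of the original card.

```python
# Compare installed library versions against those listed in the card.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.46.3",
    "torch": "2.4.0+cu121",
    "datasets": "3.1.0",
    "tokenizers": "0.20.3",
}
found = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    status = "OK" if found[name] == want else f"expected {want}"
    print(f"{name}: {found[name]} ({status})")
```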
Model size: 0.3B parameters (F32, safetensors)