xlm-roberta-large-bm

This model is a fine-tuned version of oza75/xlm-roberta-large-bm-cpt on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.6527
  • Accuracy: 0.7539
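
Assuming the reported loss is the standard cross-entropy from evaluation (the card does not state the task explicitly), it maps to a perplexity of exp(loss). A minimal sketch of that conversion:

```python
import math

# Evaluation loss from the results above.
eval_loss = 3.6527

# Perplexity is the exponential of the cross-entropy loss.
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.2f}")  # ≈ 38.58
```

Note this reading only holds if the loss is an un-scaled per-token cross-entropy; the card does not confirm that.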

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1.75e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 3
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 384
  • total_eval_batch_size: 96
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 0.06 (fractional value; effectively a warmup ratio)
  • num_epochs: 50.0

Training results

| Training Loss | Epoch   | Step | Validation Loss | Accuracy |
|--------------:|--------:|-----:|----------------:|---------:|
| 23.4593       | 2.1994  | 200  | 5.6475          | 0.6440   |
| 20.8680       | 4.3989  | 400  | 5.0077          | 0.6808   |
| 19.5976       | 6.5983  | 600  | 4.6963          | 0.6985   |
| 18.6174       | 8.7978  | 800  | 4.4655          | 0.7104   |
| 17.8897       | 10.9972 | 1000 | 4.3012          | 0.7211   |
| 17.1157       | 13.1884 | 1200 | 4.2094          | 0.7253   |
| 16.6388       | 15.3878 | 1400 | 4.0842          | 0.7310   |
| 16.5434       | 17.5873 | 1600 | 4.0012          | 0.7376   |
| 16.2096       | 19.7867 | 1800 | 3.9556          | 0.7376   |
| 15.9932       | 21.9861 | 2000 | 3.8802          | 0.7426   |
| 15.3912       | 24.1773 | 2200 | 3.8404          | 0.7442   |
| 15.2444       | 26.3767 | 2400 | 3.7937          | 0.7475   |
| 15.3315       | 28.5762 | 2600 | 3.7470          | 0.7488   |
| 15.2022       | 30.7756 | 2800 | 3.7129          | 0.7513   |
| 15.1072       | 32.9751 | 3000 | 3.7143          | 0.7516   |
| 14.8385       | 35.1662 | 3200 | 3.7064          | 0.7505   |
| 14.7511       | 37.3657 | 3400 | 3.6804          | 0.7535   |
| 14.9010       | 39.5651 | 3600 | 3.6705          | 0.7533   |
| 14.8393       | 41.7645 | 3800 | 3.6890          | 0.7521   |
| 14.8144       | 43.9640 | 4000 | 3.6512          | 0.7547   |
| 14.5626       | 46.1551 | 4200 | 3.6255          | 0.7557   |
| 14.5697       | 48.3546 | 4400 | 3.6409          | 0.7549   |
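
Validation loss bottoms out near the end of training rather than at the final step. A quick check over the last few logged evaluations (values copied from the table above) picks the best checkpoint by loss:

```python
# (step, validation_loss, accuracy) for the last logged evaluations.
results = [
    (4000, 3.6512, 0.7547),
    (4200, 3.6255, 0.7557),
    (4400, 3.6409, 0.7549),
]

# Select the checkpoint with the lowest validation loss.
best_step, best_loss, best_acc = min(results, key=lambda r: r[1])
print(best_step, best_loss, best_acc)  # 4200 3.6255 0.7557
```

By both loss and accuracy, the step-4200 evaluation (epoch ~46.2) is the best logged point.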

Framework versions

  • Transformers 5.3.0.dev0
  • Pytorch 2.4.1+cu124
  • Datasets 4.6.1
  • Tokenizers 0.22.2