XLM-roberta-large-ftit-emb-lr01

This model is a fine-tuned version of Zamza/XLM-roberta-large-ftit-emb-4 (an XLM-RoBERTa-large encoder, roughly 0.6B parameters, stored as F32 safetensors) on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4811
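
Since the card gives no usage details, below is a minimal sketch of loading the checkpoint as an embedding encoder with the transformers library. The repo id, the example sentences, and the mean-pooling step are assumptions (the "emb" suffix suggests an embedding model, but the pooling strategy is not documented).

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumption: the checkpoint is hosted on the Hub under this repo id.
model_id = "Zamza/XLM-roberta-large-ftit-emb-lr01"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)
model.eval()

sentences = ["Un esempio in italiano.", "An example in English."]
batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**batch)

# Mean-pool token embeddings over non-padding positions (assumed pooling).
mask = batch["attention_mask"].unsqueeze(-1).float()
embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # (2, 1024) for an XLM-RoBERTa-large encoder
```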

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 22
  • mixed_precision_training: Native AMP
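
As a reproducibility aid, the settings above map onto transformers' TrainingArguments roughly as sketched below. Only the hyperparameter values come from this card; the output directory is hypothetical, and "Native AMP" is assumed to mean fp16=True.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="xlm-roberta-large-ftit-emb-lr01",  # hypothetical path
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 8 * 2 = 16
    optim="adamw_torch",            # AdamW; betas=(0.9, 0.999), eps=1e-8 are the defaults
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=22,
    fp16=True,                      # Native AMP mixed precision (assumed fp16)
)
```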

Training results

| Training Loss | Epoch   | Step   | Validation Loss |
|:-------------:|:-------:|:------:|:---------------:|
| 0.6925        | 0.3026  | 10000  | 0.6256          |
| 0.6245        | 0.6052  | 20000  | 0.5774          |
| 0.6195        | 0.9079  | 30000  | 0.5596          |
| 0.6407        | 1.2105  | 40000  | 0.6097          |
| 0.6424        | 1.5131  | 50000  | 0.5653          |
| 0.6288        | 1.8157  | 60000  | 0.5666          |
| 0.5876        | 2.1183  | 70000  | 0.5434          |
| 0.5847        | 2.4209  | 80000  | 0.5424          |
| 0.5846        | 2.7236  | 90000  | 0.5644          |
| 0.5804        | 3.0262  | 100000 | 0.5419          |
| 0.5684        | 3.3288  | 110000 | 0.5305          |
| 0.5763        | 3.6314  | 120000 | 0.5350          |
| 0.5819        | 3.9340  | 130000 | 0.5270          |
| 0.5584        | 4.2367  | 140000 | 0.5296          |
| 0.5752        | 4.5393  | 150000 | 0.5318          |
| 0.5554        | 4.8419  | 160000 | 0.5205          |
| 0.5682        | 5.1445  | 170000 | 0.5303          |
| 0.5414        | 5.4471  | 180000 | 0.5199          |
| 0.5427        | 5.7497  | 190000 | 0.5101          |
| 0.5471        | 6.0525  | 200000 | 0.5161          |
| 0.5687        | 6.3552  | 210000 | 0.5159          |
| 0.5405        | 6.6578  | 220000 | 0.5229          |
| 0.5463        | 6.9604  | 230000 | 0.5193          |
| 0.5412        | 7.2630  | 240000 | 0.5147          |
| 0.5336        | 7.5656  | 250000 | 0.5097          |
| 0.5377        | 7.8683  | 260000 | 0.5032          |
| 0.5443        | 8.1709  | 270000 | 0.5103          |
| 0.5261        | 8.4735  | 280000 | 0.5069          |
| 0.5339        | 8.7761  | 290000 | 0.5056          |
| 0.5434        | 9.0787  | 300000 | 0.5048          |
| 0.5379        | 9.3813  | 310000 | 0.5016          |
| 0.5270        | 9.6840  | 320000 | 0.5052          |
| 0.5446        | 9.9866  | 330000 | 0.5066          |
| 0.5351        | 10.2892 | 340000 | 0.4997          |
| 0.5360        | 10.5918 | 350000 | 0.4956          |
| 0.5215        | 10.8944 | 360000 | 0.4969          |
| 0.5311        | 11.1970 | 370000 | 0.5092          |
| 0.5221        | 11.4997 | 380000 | 0.4936          |
| 0.5295        | 11.8024 | 390000 | 0.4897          |
| 0.5173        | 12.1051 | 400000 | 0.4980          |
| 0.5164        | 12.4077 | 410000 | 0.4858          |
| 0.5185        | 12.7103 | 420000 | 0.4967          |
| 0.5125        | 13.0129 | 430000 | 0.4973          |
| 0.5216        | 13.3155 | 440000 | 0.4900          |
| 0.5133        | 13.6182 | 450000 | 0.4878          |
| 0.5195        | 13.9208 | 460000 | 0.4938          |
| 0.5163        | 14.2234 | 470000 | 0.4940          |
| 0.5008        | 14.5260 | 480000 | 0.4925          |
| 0.5144        | 14.8286 | 490000 | 0.4885          |
| 0.5265        | 15.1312 | 500000 | 0.4925          |
| 0.5102        | 15.4339 | 510000 | 0.4957          |
| 0.5076        | 15.7365 | 520000 | 0.4923          |
| 0.5156        | 16.0391 | 530000 | 0.5032          |
| 0.5236        | 16.3417 | 540000 | 0.4974          |
| 0.5168        | 16.6443 | 550000 | 0.4826          |
| 0.4977        | 16.9470 | 560000 | 0.4860          |
| 0.5102        | 17.2496 | 570000 | 0.4889          |
| 0.4992        | 17.5523 | 580000 | 0.4789          |
| 0.5160        | 17.8550 | 590000 | 0.4967          |
| 0.5018        | 18.1576 | 600000 | 0.4899          |
| 0.5094        | 18.4602 | 610000 | 0.4881          |
| 0.4991        | 18.7629 | 620000 | 0.4861          |
| 0.4955        | 19.0655 | 630000 | 0.4809          |
| 0.4965        | 19.3681 | 640000 | 0.4871          |
| 0.4937        | 19.6707 | 650000 | 0.4836          |
| 0.5048        | 19.9733 | 660000 | 0.4955          |
| 0.5019        | 20.2759 | 670000 | 0.4765          |
| 0.4912        | 20.5786 | 680000 | 0.4911          |
| 0.4891        | 20.8812 | 690000 | 0.4837          |
| 0.5084        | 21.1838 | 700000 | 0.4931          |
| 0.4945        | 21.4864 | 710000 | 0.4825          |
| 0.5014        | 21.7890 | 720000 | 0.4811          |

Framework versions

  • Transformers 4.48.3
  • Pytorch 2.5.1+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0
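
For environment reproduction, the versions above can be pinned as a requirements file (a sketch; the exact PyTorch CUDA 12.4 build, 2.5.1+cu124, is normally installed from the PyTorch wheel index rather than plain PyPI):

```
transformers==4.48.3
torch==2.5.1
datasets==3.3.2
tokenizers==0.21.0
```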