roberta-base-multi-head

This model is a fine-tuned version of roberta-base on an unspecified dataset (the auto-generated card did not record a dataset name). It achieves the following results on the evaluation set:

  • Loss: 0.4882
  • Accuracy: 0.5566
  • F1 Macro: 0.5333
  • F1 Micro: 0.5566
  • Precision Macro: 0.5431
  • Recall Macro: 0.5389
  • ROC AUC: 0.7826

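The evaluation code is not included in the card. The following is a minimal sketch of how the reported metrics could be computed with scikit-learn in a compute_metrics callback for the Hugging Face Trainer, assuming a single-label multi-class task (consistent with Accuracy equaling F1 Micro above); the task type is an assumption, not documented.

```python
# Sketch only: assumes single-label multi-class classification, not verified.
import numpy as np
from scipy.special import softmax
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)   # predicted class per example
    probs = softmax(logits, axis=-1)     # class probabilities for ROC AUC
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1_macro": f1_score(labels, preds, average="macro"),
        "f1_micro": f1_score(labels, preds, average="micro"),
        "precision_macro": precision_score(labels, preds, average="macro", zero_division=0),
        "recall_macro": recall_score(labels, preds, average="macro", zero_division=0),
        "roc_auc": roc_auc_score(labels, probs, multi_class="ovr", average="macro"),
    }
```
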
Model description

More information needed
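
The architecture itself is not documented. Given the model name, one plausible reading is a shared roberta-base encoder with several independent classification heads; the sketch below is purely illustrative, and every head count and size in it is an assumption.

```python
# Illustrative only: head count and sizes are invented, not the author's design.
import torch.nn as nn
from transformers import RobertaModel

class MultiHeadRoberta(nn.Module):
    def __init__(self, num_labels_per_head=(3, 5, 2)):  # hypothetical head sizes
        super().__init__()
        self.encoder = RobertaModel.from_pretrained("roberta-base")
        hidden = self.encoder.config.hidden_size  # 768 for roberta-base
        self.heads = nn.ModuleList(nn.Linear(hidden, n) for n in num_labels_per_head)

    def forward(self, input_ids, attention_mask=None):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]          # <s> token as sequence summary
        return [head(cls) for head in self.heads]  # one logit tensor per head
```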

Intended uses & limitations

More information needed
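
No usage guidance is provided. A minimal loading sketch follows; if the checkpoint was saved from a custom multi-head class, AutoModelForSequenceClassification may not reconstruct the heads, so treat this as an assumption about how the weights were exported.

```python
# Minimal loading sketch; may require the author's custom model class instead.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo = "DayCardoso/roberta-base-multi-head"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForSequenceClassification.from_pretrained(repo)

inputs = tokenizer("Example input text", return_tensors="pt")
logits = model(**inputs).logits
```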

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 16
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 40

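These settings can be reconstructed as TrainingArguments for Transformers 4.53.1, as sketched below; output_dir is a placeholder. Note that although num_epochs is 40, the log below stops at epoch ≈4.56, which suggests training was halted early.

```python
# Sketch reconstructing the listed hyperparameters; output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="roberta-base-multi-head",
    learning_rate=5e-6,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=8,   # effective train batch size: 2 * 8 = 16
    lr_scheduler_type="linear",
    warmup_ratio=0.05,
    num_train_epochs=40,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```
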
Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 Macro | F1 Micro | Precision Macro | Recall Macro | ROC AUC |
|---------------|-------|------|-----------------|----------|----------|----------|-----------------|--------------|---------|
| No log | 0.1304 | 200 | 0.6818 | 0.2672 | 0.1648 | 0.2672 | 0.1218 | 0.2567 | 0.4838 |
| No log | 0.2609 | 400 | 0.6261 | 0.3230 | 0.1221 | 0.3230 | 0.0808 | 0.25 | 0.5071 |
| 0.6589 | 0.3913 | 600 | 0.5625 | 0.3902 | 0.2186 | 0.3902 | 0.2036 | 0.2855 | 0.5948 |
| 0.6589 | 0.5217 | 800 | 0.5461 | 0.4307 | 0.2771 | 0.4307 | 0.3373 | 0.3294 | 0.6677 |
| 0.5528 | 0.6522 | 1000 | 0.5142 | 0.4806 | 0.3522 | 0.4806 | 0.4562 | 0.3832 | 0.7032 |
| 0.5528 | 0.7826 | 1200 | 0.5025 | 0.4966 | 0.3990 | 0.4966 | 0.4866 | 0.4231 | 0.7188 |
| 0.5528 | 0.9130 | 1400 | 0.5006 | 0.4939 | 0.4140 | 0.4939 | 0.4853 | 0.4429 | 0.7312 |
| 0.5111 | 1.0430 | 1600 | 0.4903 | 0.5165 | 0.4163 | 0.5165 | 0.5065 | 0.4369 | 0.7386 |
| 0.5111 | 1.1735 | 1800 | 0.4821 | 0.5267 | 0.4650 | 0.5267 | 0.5003 | 0.4699 | 0.7494 |
| 0.4847 | 1.3039 | 2000 | 0.4803 | 0.5273 | 0.4900 | 0.5273 | 0.5013 | 0.4970 | 0.7582 |
| 0.4847 | 1.4343 | 2200 | 0.4742 | 0.5438 | 0.5020 | 0.5438 | 0.5153 | 0.5015 | 0.7637 |
| 0.4847 | 1.5648 | 2400 | 0.4672 | 0.5476 | 0.4998 | 0.5476 | 0.5270 | 0.4976 | 0.7692 |
| 0.47 | 1.6952 | 2600 | 0.4743 | 0.5396 | 0.4820 | 0.5396 | 0.5346 | 0.4885 | 0.7650 |
| 0.47 | 1.8256 | 2800 | 0.4675 | 0.5512 | 0.5104 | 0.5512 | 0.5282 | 0.5029 | 0.7734 |
| 0.4651 | 1.9561 | 3000 | 0.4671 | 0.5436 | 0.5151 | 0.5436 | 0.5211 | 0.5190 | 0.7747 |
| 0.4651 | 2.0861 | 3200 | 0.4631 | 0.5643 | 0.5269 | 0.5643 | 0.5431 | 0.5209 | 0.7804 |
| 0.4651 | 2.2165 | 3400 | 0.4681 | 0.5445 | 0.5109 | 0.5445 | 0.5359 | 0.5207 | 0.7798 |
| 0.4415 | 2.3469 | 3600 | 0.4695 | 0.5459 | 0.5114 | 0.5459 | 0.5400 | 0.5218 | 0.7801 |
| 0.4415 | 2.4774 | 3800 | 0.4607 | 0.5639 | 0.5358 | 0.5639 | 0.5457 | 0.5335 | 0.7843 |
| 0.4335 | 2.6078 | 4000 | 0.4649 | 0.5525 | 0.5283 | 0.5525 | 0.5354 | 0.5349 | 0.7830 |
| 0.4335 | 2.7382 | 4200 | 0.4676 | 0.5457 | 0.5225 | 0.5457 | 0.5370 | 0.5348 | 0.7854 |
| 0.4335 | 2.8687 | 4400 | 0.4581 | 0.5606 | 0.5272 | 0.5606 | 0.5482 | 0.5250 | 0.7854 |
| 0.4347 | 2.9991 | 4600 | 0.4612 | 0.5650 | 0.5336 | 0.5650 | 0.5425 | 0.5341 | 0.7853 |
| 0.4347 | 3.1291 | 4800 | 0.4654 | 0.5580 | 0.5302 | 0.5580 | 0.5410 | 0.5358 | 0.7856 |
| 0.4048 | 3.2596 | 5000 | 0.4659 | 0.5706 | 0.5452 | 0.5706 | 0.5478 | 0.5463 | 0.7873 |
| 0.4048 | 3.3900 | 5200 | 0.4627 | 0.5692 | 0.5346 | 0.5692 | 0.5538 | 0.5311 | 0.7859 |
| 0.4048 | 3.5204 | 5400 | 0.4733 | 0.5557 | 0.5371 | 0.5557 | 0.5354 | 0.5451 | 0.7858 |
| 0.3995 | 3.6509 | 5600 | 0.4755 | 0.5538 | 0.5267 | 0.5538 | 0.5426 | 0.5308 | 0.7857 |
| 0.3995 | 3.7813 | 5800 | 0.4759 | 0.5467 | 0.5238 | 0.5467 | 0.5383 | 0.5342 | 0.7860 |
| 0.4016 | 3.9117 | 6000 | 0.4698 | 0.5566 | 0.5302 | 0.5566 | 0.5392 | 0.5368 | 0.7859 |
| 0.4016 | 4.0417 | 6200 | 0.4786 | 0.5646 | 0.5389 | 0.5646 | 0.5463 | 0.5369 | 0.7830 |
| 0.4016 | 4.1722 | 6400 | 0.4840 | 0.5636 | 0.5342 | 0.5636 | 0.5409 | 0.5319 | 0.7814 |
| 0.3723 | 4.3026 | 6600 | 0.4760 | 0.5653 | 0.5431 | 0.5653 | 0.5435 | 0.5457 | 0.7855 |
| 0.3723 | 4.4330 | 6800 | 0.4821 | 0.5632 | 0.5340 | 0.5632 | 0.5460 | 0.5348 | 0.7829 |
| 0.3682 | 4.5635 | 7000 | 0.4882 | 0.5566 | 0.5333 | 0.5566 | 0.5431 | 0.5389 | 0.7826 |

Framework versions

  • Transformers 4.53.1
  • Pytorch 2.6.0+cu124
  • Datasets 2.14.4
  • Tokenizers 0.21.2