roberta-large

This model was trained from scratch on an unspecified dataset. It achieves the following results on the evaluation set (these match the final checkpoint, epoch 48, in the training results):

  • Loss: 0.0756
  • Precision: 0.9480
  • Recall: 0.9449
  • F1: 0.9464
  • Accuracy: 0.9905
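As a quick consistency check, the reported F1 can be recomputed from the reported precision and recall. The sketch below assumes F1 is the usual harmonic mean of the two; the card does not say which metric library produced the numbers, and the helper name `f1_score` is introduced here only for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported above for the evaluation set.
precision, recall = 0.9480, 0.9449
f1 = f1_score(precision, recall)
print(round(f1, 4))  # 0.9464
```

The zero guard matters for rows such as epoch 1 in the training results, where precision and recall are both 0.0.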

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: AdamW (fused PyTorch implementation, OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 48
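The hyperparameters above can be read as a linear warmup followed by a linear decay of the learning rate. The following is a minimal pure-Python sketch of that schedule, not the author's training script. It assumes 960 total optimizer steps (the last step in the training results), a warmup length of `warmup_ratio * total_steps`, and a decay to zero at the final step. The `training_kwargs` dictionary is an assumed mapping onto `transformers.TrainingArguments` keyword names and is shown for orientation only.

```python
# Hyperparameters as reported in the list above.
LEARNING_RATE = 5e-5
WARMUP_RATIO = 0.1
TOTAL_STEPS = 960  # 48 epochs x 20 steps per epoch, per the training results

# Assumed mapping onto transformers.TrainingArguments keyword names.
training_kwargs = {
    "learning_rate": LEARNING_RATE,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_ratio": WARMUP_RATIO,
    "num_train_epochs": 48,
}


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              total_steps: int = TOTAL_STEPS,
              warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at `step` under linear warmup, then linear decay to zero."""
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: ramp linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 48, 96, 528, 960):
    print(step, linear_lr(step))
```

Under these assumptions the peak learning rate of 5e-05 is reached at step 96, roughly the end of epoch 5, and the rate falls to zero at step 960.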

Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|---|---|---|
| No log | 1.0 | 20 | 0.8473 | 0.0 | 0.0 | 0.0 | 0.7693 |
| No log | 2.0 | 40 | 0.3125 | 0.5063 | 0.4538 | 0.4786 | 0.9131 |
| No log | 3.0 | 60 | 0.1283 | 0.8118 | 0.8460 | 0.8286 | 0.9699 |
| No log | 4.0 | 80 | 0.0849 | 0.8241 | 0.8655 | 0.8443 | 0.9791 |
| No log | 5.0 | 100 | 0.0820 | 0.8208 | 0.8833 | 0.8509 | 0.9768 |
| No log | 6.0 | 120 | 0.0784 | 0.8640 | 0.9060 | 0.8845 | 0.9814 |
| No log | 7.0 | 140 | 0.0699 | 0.9290 | 0.9125 | 0.9207 | 0.9862 |
| No log | 8.0 | 160 | 0.0668 | 0.8835 | 0.9222 | 0.9025 | 0.9853 |
| No log | 9.0 | 180 | 0.0492 | 0.9208 | 0.9417 | 0.9311 | 0.9893 |
| No log | 10.0 | 200 | 0.0773 | 0.9104 | 0.9222 | 0.9163 | 0.9859 |
| No log | 11.0 | 220 | 0.0753 | 0.8771 | 0.9368 | 0.9060 | 0.9828 |
| No log | 12.0 | 240 | 0.0710 | 0.9179 | 0.9238 | 0.9208 | 0.9874 |
| No log | 13.0 | 260 | 0.0679 | 0.9028 | 0.9335 | 0.9179 | 0.9859 |
| No log | 14.0 | 280 | 0.0751 | 0.9175 | 0.9368 | 0.9270 | 0.9882 |
| No log | 15.0 | 300 | 0.0661 | 0.9146 | 0.9368 | 0.9255 | 0.9883 |
| No log | 16.0 | 320 | 0.0672 | 0.9368 | 0.9368 | 0.9368 | 0.9895 |
| No log | 17.0 | 340 | 0.0601 | 0.9211 | 0.9465 | 0.9337 | 0.9899 |
| No log | 18.0 | 360 | 0.0693 | 0.9441 | 0.9303 | 0.9371 | 0.9883 |
| No log | 19.0 | 380 | 0.0681 | 0.9255 | 0.9465 | 0.9359 | 0.9884 |
| No log | 20.0 | 400 | 0.0790 | 0.9350 | 0.9319 | 0.9334 | 0.9881 |
| No log | 21.0 | 420 | 0.0671 | 0.9383 | 0.9368 | 0.9376 | 0.9885 |
| No log | 22.0 | 440 | 0.0657 | 0.9327 | 0.9433 | 0.9380 | 0.9893 |
| No log | 23.0 | 460 | 0.0684 | 0.9370 | 0.9400 | 0.9385 | 0.9892 |
| No log | 24.0 | 480 | 0.0669 | 0.9226 | 0.9465 | 0.9344 | 0.9886 |
| 0.117 | 25.0 | 500 | 0.0691 | 0.9329 | 0.9465 | 0.9397 | 0.9887 |
| 0.117 | 26.0 | 520 | 0.0746 | 0.9493 | 0.9400 | 0.9446 | 0.9899 |
| 0.117 | 27.0 | 540 | 0.0749 | 0.9542 | 0.9465 | 0.9504 | 0.9900 |
| 0.117 | 28.0 | 560 | 0.0730 | 0.9435 | 0.9465 | 0.9450 | 0.9895 |
| 0.117 | 29.0 | 580 | 0.0697 | 0.9653 | 0.9465 | 0.9558 | 0.9906 |
| 0.117 | 30.0 | 600 | 0.0803 | 0.9554 | 0.9368 | 0.9460 | 0.9900 |
| 0.117 | 31.0 | 620 | 0.0838 | 0.9507 | 0.9384 | 0.9445 | 0.9895 |
| 0.117 | 32.0 | 640 | 0.0851 | 0.9445 | 0.9384 | 0.9415 | 0.9898 |
| 0.117 | 33.0 | 660 | 0.0783 | 0.9403 | 0.9449 | 0.9426 | 0.9892 |
| 0.117 | 34.0 | 680 | 0.0808 | 0.9372 | 0.9433 | 0.9402 | 0.9891 |
| 0.117 | 35.0 | 700 | 0.0823 | 0.9448 | 0.9433 | 0.9440 | 0.9898 |
| 0.117 | 36.0 | 720 | 0.0779 | 0.9511 | 0.9465 | 0.9488 | 0.9906 |
| 0.117 | 37.0 | 740 | 0.0751 | 0.9543 | 0.9481 | 0.9512 | 0.9908 |
| 0.117 | 38.0 | 760 | 0.0690 | 0.9514 | 0.9514 | 0.9514 | 0.9906 |
| 0.117 | 39.0 | 780 | 0.0710 | 0.9511 | 0.9465 | 0.9488 | 0.9906 |
| 0.117 | 40.0 | 800 | 0.0714 | 0.9495 | 0.9449 | 0.9472 | 0.9906 |
| 0.117 | 41.0 | 820 | 0.0738 | 0.9525 | 0.9433 | 0.9479 | 0.9908 |
| 0.117 | 42.0 | 840 | 0.0740 | 0.9480 | 0.9449 | 0.9464 | 0.9906 |
| 0.117 | 43.0 | 860 | 0.0749 | 0.9480 | 0.9449 | 0.9464 | 0.9906 |
| 0.117 | 44.0 | 880 | 0.0756 | 0.9526 | 0.9449 | 0.9487 | 0.9909 |
| 0.117 | 45.0 | 900 | 0.0752 | 0.9511 | 0.9465 | 0.9488 | 0.9908 |
| 0.117 | 46.0 | 920 | 0.0754 | 0.9480 | 0.9449 | 0.9464 | 0.9905 |
| 0.117 | 47.0 | 940 | 0.0755 | 0.9480 | 0.9449 | 0.9464 | 0.9905 |
| 0.117 | 48.0 | 960 | 0.0756 | 0.9480 | 0.9449 | 0.9464 | 0.9905 |
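The final checkpoint (epoch 48) is not the best row in the training results by either validation loss or F1. The sketch below illustrates how a best checkpoint could be picked from such a log. The rows are a hand-copied subset of the results (epochs 9, 17, 29, 38, and 48), and the `EvalRow` type is hypothetical, introduced only for this example.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    epoch: int
    val_loss: float
    f1: float


# Subset of rows copied from the training results.
rows = [
    EvalRow(9, 0.0492, 0.9311),
    EvalRow(17, 0.0601, 0.9337),
    EvalRow(29, 0.0697, 0.9558),
    EvalRow(38, 0.0690, 0.9514),
    EvalRow(48, 0.0756, 0.9464),
]

best_by_f1 = max(rows, key=lambda r: r.f1)
best_by_loss = min(rows, key=lambda r: r.val_loss)
final = rows[-1]

print("best F1:", best_by_f1)
print("lowest validation loss:", best_by_loss)
print("F1 gap between best and final:", round(best_by_f1.f1 - final.f1, 4))
```

The card does not state whether best-checkpoint selection was used. If training is done with the `transformers` `Trainer`, setting `load_best_model_at_end=True` together with `metric_for_best_model="f1"` in `TrainingArguments` would be one way to keep the highest-F1 checkpoint rather than the last one.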

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.1.1
  • Tokenizers 0.22.1

Model size: 0.4B params (Safetensors, tensor type F32)