rlcc-palate-upsample_replacement-absa-min

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set (a sketch of how such metrics are computed follows the list):

  • Loss: 2.1294
  • Accuracy: 0.7878
  • F1 Macro: 0.5375
  • Precision Macro: 0.5410
  • Recall Macro: 0.5771
  • Total Tf: [323, 87, 1143, 87]
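The "Total Tf" metric is not documented. The four values sum to 1640 = 4 × 410, and the reported accuracy matches 323/410, so one plausible reading is micro-aggregated [TP, FP, TN, FN] counts over four classes on a 410-example evaluation set; treat that as an inference, not a documented definition. Below is a minimal sketch of how the reported metrics are typically computed, assuming scikit-learn; the labels are dummy stand-ins, since the evaluation data is not published.

```python
# Minimal sketch, assuming scikit-learn; y_true / y_pred are dummy stand-ins
# because the evaluation data for this model is not published.
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    multilabel_confusion_matrix,
    precision_score,
    recall_score,
)

y_true = [0, 1, 2, 1, 0, 3, 2, 2]  # hypothetical gold labels (4 classes)
y_pred = [0, 1, 1, 1, 0, 3, 2, 0]  # hypothetical model predictions

print("Accuracy:       ", accuracy_score(y_true, y_pred))
print("F1 Macro:       ", f1_score(y_true, y_pred, average="macro"))
print("Precision Macro:", precision_score(y_true, y_pred, average="macro"))
print("Recall Macro:   ", recall_score(y_true, y_pred, average="macro"))

# One possible reading of "Total Tf": per-class binary confusion matrices
# summed into [TP, FP, TN, FN]. This is an assumption, not documented behavior.
mcm = multilabel_confusion_matrix(y_true, y_pred)  # shape (n_classes, 2, 2)
tp = mcm[:, 1, 1].sum()
fp = mcm[:, 0, 1].sum()
tn = mcm[:, 0, 0].sum()
fn = mcm[:, 1, 0].sum()
print("Total Tf:", [tp, fp, tn, fn])
```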

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 36
  • num_epochs: 25
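
Below is a minimal sketch of reproducing this configuration with the standard transformers Trainer API. The base checkpoint ("distilbert-base-uncased"), the label count, and the toy dataset are stand-in assumptions, since neither the base model nor the training data is documented for this model.

```python
# Reproduction sketch; the base checkpoint, num_labels, and data below are
# stand-ins: the card does not document the base model or the datasets.
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base = "distilbert-base-uncased"  # stand-in only; the real base model is undocumented
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=4)  # 4 is assumed

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True)

# Tiny dummy data so the sketch runs end to end; the real ABSA data is not published.
data = Dataset.from_dict(
    {"text": ["smooth palate", "bitter finish", "balanced", "flat"], "labels": [0, 1, 2, 3]}
).map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="rlcc-palate-upsample_replacement-absa-min",
    learning_rate=2e-05,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",          # AdamW defaults: betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="linear",
    warmup_steps=36,
    num_train_epochs=25,
    eval_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=data,
    eval_dataset=data,
    processing_class=tokenizer,
)
trainer.train()
```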

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 Macro | Precision Macro | Recall Macro | Total Tf |
|---|---|---|---|---|---|---|---|---|
| 1.1114 | 1.0 | 37 | 1.0849 | 0.8073 | 0.4646 | 0.4699 | 0.4711 | [331, 79, 1151, 79] |
| 1.0656 | 2.0 | 74 | 1.1060 | 0.8 | 0.4744 | 0.4575 | 0.5394 | [328, 82, 1148, 82] |
| 0.9251 | 3.0 | 111 | 1.0360 | 0.8024 | 0.5730 | 0.5780 | 0.5942 | [329, 81, 1149, 81] |
| 0.8019 | 4.0 | 148 | 1.1099 | 0.8341 | 0.5423 | 0.5462 | 0.5925 | [342, 68, 1162, 68] |
| 0.7155 | 5.0 | 185 | 1.2333 | 0.7976 | 0.5602 | 0.5848 | 0.6127 | [327, 83, 1147, 83] |
| 0.5969 | 6.0 | 222 | 1.2154 | 0.8341 | 0.5266 | 0.4951 | 0.5735 | [342, 68, 1162, 68] |
| 0.5489 | 7.0 | 259 | 1.3377 | 0.8 | 0.5650 | 0.5661 | 0.6076 | [328, 82, 1148, 82] |
| 0.584 | 8.0 | 296 | 1.4198 | 0.7976 | 0.5680 | 0.6274 | 0.6648 | [327, 83, 1147, 83] |
| 0.557 | 9.0 | 333 | 1.4447 | 0.7902 | 0.5346 | 0.5320 | 0.5644 | [324, 86, 1144, 86] |
| 0.5376 | 10.0 | 370 | 1.4859 | 0.7927 | 0.5551 | 0.6208 | 0.6404 | [325, 85, 1145, 85] |
| 0.4886 | 11.0 | 407 | 1.5739 | 0.7951 | 0.5528 | 0.5561 | 0.5920 | [326, 84, 1146, 84] |
| 0.4906 | 12.0 | 444 | 1.6267 | 0.8146 | 0.5814 | 0.5777 | 0.6177 | [334, 76, 1154, 76] |
| 0.4341 | 13.0 | 481 | 1.6743 | 0.8098 | 0.5803 | 0.5850 | 0.6221 | [332, 78, 1152, 78] |
| 0.4137 | 14.0 | 518 | 1.7262 | 0.7976 | 0.5643 | 0.5734 | 0.6220 | [327, 83, 1147, 83] |
| 0.3311 | 15.0 | 555 | 1.8798 | 0.8 | 0.5580 | 0.5589 | 0.5891 | [328, 82, 1148, 82] |
| 0.3054 | 16.0 | 592 | 1.8805 | 0.7756 | 0.5239 | 0.5372 | 0.5906 | [318, 92, 1138, 92] |
| 0.2563 | 17.0 | 629 | 1.9434 | 0.7878 | 0.5476 | 0.5629 | 0.6221 | [323, 87, 1143, 87] |
| 0.2458 | 18.0 | 666 | 1.9002 | 0.7976 | 0.5597 | 0.5626 | 0.6003 | [327, 83, 1147, 83] |
| 0.2234 | 19.0 | 703 | 1.9109 | 0.7878 | 0.5505 | 0.5560 | 0.6124 | [323, 87, 1143, 87] |
| 0.2291 | 20.0 | 740 | 1.9709 | 0.7927 | 0.5507 | 0.5555 | 0.5922 | [325, 85, 1145, 85] |
| 0.2231 | 21.0 | 777 | 2.0283 | 0.8024 | 0.5639 | 0.5662 | 0.5996 | [329, 81, 1149, 81] |
| 0.191 | 22.0 | 814 | 2.0057 | 0.7854 | 0.5355 | 0.5444 | 0.5812 | [322, 88, 1142, 88] |
| 0.1827 | 23.0 | 851 | 2.0780 | 0.8049 | 0.5658 | 0.5653 | 0.5977 | [330, 80, 1150, 80] |
| 0.1929 | 24.0 | 888 | 2.1094 | 0.7927 | 0.5455 | 0.5475 | 0.5830 | [325, 85, 1145, 85] |
| 0.1572 | 25.0 | 925 | 2.1294 | 0.7878 | 0.5375 | 0.5410 | 0.5771 | [323, 87, 1143, 87] |
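
Validation loss bottoms out at epoch 3 (1.0360) and climbs steadily afterward while training loss keeps falling, the usual overfitting signature; the final checkpoint (epoch 25) is not the best one by validation loss. If retraining, best-checkpoint selection can be automated with standard Trainer options; the sketch below is an assumption about setup, not something this run is documented to have used.

```python
# Sketch only: these options are not documented for the original run.
from transformers import EarlyStoppingCallback, TrainingArguments

args = TrainingArguments(
    output_dir="rlcc-palate-upsample_replacement-absa-min",
    eval_strategy="epoch",
    save_strategy="epoch",             # must match eval_strategy for best-model tracking
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,           # lower eval loss is better
)
# Stop once eval loss fails to improve for 3 consecutive epochs:
callbacks = [EarlyStoppingCallback(early_stopping_patience=3)]
# Pass args and callbacks to a Trainer constructed as in the sketch above.
```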

Framework versions

  • Transformers 4.48.3
  • PyTorch 2.1.0+cu118
  • Tokenizers 0.21.0
Model artifacts

  • Format: Safetensors
  • Model size: 0.2B params
  • Tensor type: F32
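
For inference, a minimal loading sketch assuming a standard sequence-classification head; "<repo-id>" is a placeholder, since the full Hub repository id is not shown in this card.

```python
# Minimal inference sketch; "<repo-id>" is a placeholder for the Hub
# repository id, which is not shown in this card.
from transformers import pipeline

classifier = pipeline("text-classification", model="<repo-id>")
print(classifier("The palate is rich but the finish falls flat."))
```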