Based on microsoft/deberta-v3-base, fine-tuned on a synthetic dataset (the original 6 labels were mapped to 3 labels).
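A minimal inference sketch using the `transformers` pipeline API, assuming the checkpoint is published on the Hugging Face Hub; the repository name below is a placeholder, not the actual model id:

```python
from transformers import pipeline

# Placeholder repository id -- substitute the actual name of this checkpoint.
classifier = pipeline(
    "text-classification",
    model="your-username/deberta-v3-base-finetuned-3-labels",
)

# The classifier outputs one of the 3 collapsed labels (0, 1, 2).
print(classifier("Example input sentence to classify."))
```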
Performance on the test dataset:
| | precision | recall | f1-score | support |
|---|---|---|---|---|
| 0 | 0.98 | 0.99 | 0.98 | 94 |
| 1 | 0.96 | 0.96 | 0.96 | 28 |
| 2 | 1.00 | 0.98 | 0.99 | 66 |
| accuracy | | | 0.98 | 188 |
| macro avg | 0.98 | 0.98 | 0.98 | 188 |
| weighted avg | 0.98 | 0.98 | 0.98 | 188 |
Performance on a similar benchmark:
| | precision | recall | f1-score | support |
|---|---|---|---|---|
| 0 | 0.13 | 0.52 | 0.21 | 23 |
| 1 | 0.44 | 0.15 | 0.22 | 75 |
| 2 | 0.00 | 0.00 | 0.00 | 19 |
| accuracy | | | 0.20 | 117 |
| macro avg | 0.19 | 0.22 | 0.14 | 117 |
| weighted avg | 0.31 | 0.20 | 0.18 | 117 |
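The tables above follow the layout of scikit-learn's `classification_report`; a minimal sketch of how such a report can be produced, assuming gold and predicted labels are already available (the arrays below are illustrative only, not the actual evaluation data):

```python
from sklearn.metrics import classification_report

# Illustrative labels only -- the real evaluation uses the test / benchmark splits.
y_true = [0, 0, 1, 2, 2, 1]
y_pred = [0, 0, 1, 2, 1, 1]

# Prints per-class precision / recall / F1 plus accuracy, macro and weighted
# averages, in the same format as the tables above.
print(classification_report(y_true, y_pred, digits=2))
```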