Based on microsoft/deberta-v3-base, fine-tuned on a synthetic dataset with 6 labels.
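As a minimal usage sketch, the model can be loaded with the `transformers` text-classification pipeline. The model id below is a placeholder, not the actual Hub repo id of this checkpoint:

```python
from transformers import pipeline

# "<your-hub-model-id>" is a placeholder -- replace it with wherever
# this fine-tuned checkpoint is hosted on the Hugging Face Hub.
classifier = pipeline("text-classification", model="<your-hub-model-id>")

result = classifier("An example input sentence.")
print(result)  # e.g. [{'label': 'LABEL_3', 'score': 0.97}]
```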
Performance on the test dataset:

| label | precision | recall | f1-score | support |
|---|---|---|---|---|
| 0 | 0.56 | 0.73 | 0.63 | 26 |
| 1 | 0.70 | 1.00 | 0.82 | 28 |
| 2 | 0.68 | 0.53 | 0.60 | 32 |
| 3 | 0.97 | 1.00 | 0.99 | 33 |
| 4 | 1.00 | 0.97 | 0.98 | 33 |
| 5 | 0.52 | 0.33 | 0.41 | 36 |
| accuracy | | | 0.75 | 188 |
| macro avg | 0.74 | 0.76 | 0.74 | 188 |
| weighted avg | 0.74 | 0.75 | 0.74 | 188 |
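The tables above and below follow the layout of scikit-learn's `classification_report`. A minimal sketch of producing such a report, using toy placeholder labels rather than the actual test-split predictions, looks like this:

```python
from sklearn.metrics import classification_report

# Toy placeholder labels; in practice y_true and y_pred would be the
# gold labels and the model's predictions on the 188-example test split.
y_true = [0, 1, 2, 3, 4, 5, 0, 1]
y_pred = [0, 1, 2, 3, 4, 5, 1, 1]

print(classification_report(y_true, y_pred, digits=2))
```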
Performance on a similar benchmark:

| label | precision | recall | f1-score | support |
|---|---|---|---|---|
| 0 | 0.22 | 0.83 | 0.34 | 23 |
| 1 | 0.50 | 0.01 | 0.03 | 75 |
| 2 | 0.19 | 0.26 | 0.22 | 19 |
| accuracy | | | 0.21 | 117 |
| macro avg | 0.30 | 0.37 | 0.20 | 117 |
| weighted avg | 0.39 | 0.21 | 0.12 | 117 |