This model is based on svenbl80/deberta-v3-Base-finetuned-chatdoc-V5 and was further fine-tuned on a synthetic dataset. It performs poorly on a different benchmark from the same document:

                  precision    recall  f1-score   support

               0       0.19      0.22      0.20        23
               1       0.62      0.44      0.52        75
               2       0.00      0.00      0.00        19

        accuracy                           0.32       117
       macro avg       0.27      0.22      0.24       117
    weighted avg       0.44      0.32      0.37       117
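As a rough sanity check on the report above, the averaged rows can be recomputed from the per-class rows. The sketch below is illustrative only: the variable and function names are hypothetical, and because it starts from the already rounded per-class values, its results agree with the reported averages only to within about 0.01.

```python
# Per-class rows copied from the report: (precision, recall, f1-score, support).
# These are rounded values, so recomputed averages are approximate.
rows = {
    "0": (0.19, 0.22, 0.20, 23),
    "1": (0.62, 0.44, 0.52, 75),
    "2": (0.00, 0.00, 0.00, 19),
}

# Total number of evaluated examples (sum of the support column).
total = sum(r[3] for r in rows.values())


def macro_avg(i):
    """Unweighted mean of column i across classes."""
    return sum(r[i] for r in rows.values()) / len(rows)


def weighted_avg(i):
    """Mean of column i across classes, weighted by class support."""
    return sum(r[i] * r[3] for r in rows.values()) / total


# Columns 0, 1, 2 are precision, recall, f1-score.
macro = [macro_avg(i) for i in range(3)]
weighted = [weighted_avg(i) for i in range(3)]

print("support:", total)
print("macro avg:", [round(v, 3) for v in macro])
print("weighted avg:", [round(v, 3) for v in weighted])
```

The support-weighted recall is the same quantity as overall accuracy, which is why those two figures match in the report. Class 2 has zero precision and recall, meaning none of its 19 examples were predicted correctly, and that pulls the macro average well below the weighted one.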