przvl/PopEuroBERT-binary-210m

Text Classification

political-speech

przvl committed on Mar 21, 2025

Commit 5d17ddf · verified · 1 Parent(s): c8fe128

Update README.md

Files changed (1)

README.md +6 -6

@@ -117,23 +117,23 @@ Predicted class: Populist (Confidence: 0.90)
 ## Evaluation
-For transparency, we compare this model with its smaller variant ([PopEuroBERT-210m](https://huggingface.co/przvl/PopEuroBERT-binary-210m)), both trained and evaluated on the same dataset and splits.
 ### Test Set Performance (Threshold = 0.5)
 | Model              | Accuracy | Precision | Recall | F1 Score | Loss   |
 |--------------------|----------|-----------|--------|----------|--------|
-| **210M**           | 75.99%   | 73.78%    | 80.66% | 77.07%   | 0.4959 |
-| **610M (this)**    | 80.26%   | 78.42%    | 83.50% | 80.89%   | 0.4631 |
 ### Test Set Performance (Optimized Threshold)
 | Model              | Threshold | Accuracy | Precision | Recall | F1 Score |
 |--------------------|-----------|----------|-----------|--------|----------|
-| **210M**           | 0.56      | 76.00%   | 76.00%    | 76.00% | 76.00%   |
-| **610M (this)**    | 0.43      | 79.81%   | 76.63%    | 85.78% | 80.94%   |
-PopEuroBERT-610m consistently outperforms the 210m variant across all metrics. It especially improves recall and F1 score, suggesting better identification of populist speech. The decision threshold (0.43) was tuned for balanced performance.
 ## Limitations

 ## Evaluation
+For transparency, we compare this model with its larger variant ([PopEuroBERT-610m](https://huggingface.co/przvl/PopEuroBERT-binary-610m)), both trained and evaluated on the same dataset and splits.
 ### Test Set Performance (Threshold = 0.5)
 | Model              | Accuracy | Precision | Recall | F1 Score | Loss   |
 |--------------------|----------|-----------|--------|----------|--------|
+| **210M (this)**    | 75.99%   | 73.78%    | 80.66% | 77.07%   | 0.4959 |
+| **610M**           | 80.26%   | 78.42%    | 83.50% | 80.89%   | 0.4631 |
 ### Test Set Performance (Optimized Threshold)
 | Model              | Threshold | Accuracy | Precision | Recall | F1 Score |
 |--------------------|-----------|----------|-----------|--------|----------|
+| **210M (this)**    | 0.56      | 76.00%   | 76.00%    | 76.00% | 76.00%   |
+| **610M**           | 0.43      | 79.81%   | 76.63%    | 85.78% | 80.94%   |
+While PopEuroBERT-210m performs well on the populism classification task, its larger variant shows stronger overall performance, especially in F1 score and recall.
 ## Limitations
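
Note on the evaluation tables in this hunk: the sketch below shows how a decision threshold such as 0.5 or 0.56 could be applied to the model's populist-class probability, and how the F1 column follows from precision and recall. It is a minimal illustration, not the author's evaluation code. The helper names `classify` and `f1_score`, the `"Non-populist"` label, and the choice of `>=` at the threshold are assumptions made for this example.

```python
def classify(prob_populist: float, threshold: float = 0.5) -> str:
    """Map a populist-class probability to a label.

    Assumption: a probability at or above the threshold counts as
    "Populist"; the README does not state how ties are handled.
    """
    return "Populist" if prob_populist >= threshold else "Non-populist"


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units in and out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# The same probability can flip class when the threshold moves
# from the default 0.5 to the tuned 0.56 reported for the 210M model.
print(classify(0.55))                  # Populist
print(classify(0.55, threshold=0.56))  # Non-populist

# Reproduce the F1 column from the reported precision and recall (in %).
print(round(f1_score(73.78, 80.66), 2))  # 77.07, the 210M row at threshold 0.5
print(round(f1_score(76.00, 76.00), 2))  # 76.0, the 210M row at threshold 0.56
```

Recomputing F1 this way is a quick consistency check on the table. Small differences in the last digit can occur for other rows because the published precision and recall are themselves rounded.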