przvl committed
Commit 5d17ddf · verified · 1 Parent(s): c8fe128

Update README.md

Files changed (1)
  1. README.md +6 -6
README.md CHANGED
@@ -117,23 +117,23 @@ Predicted class: Populist (Confidence: 0.90)

  ## Evaluation

- For transparency, we compare this model with its smaller variant ([PopEuroBERT-210m](https://huggingface.co/przvl/PopEuroBERT-binary-210m)), both trained and evaluated on the same dataset and splits.
+ For transparency, we compare this model with its larger variant ([PopEuroBERT-610m](https://huggingface.co/przvl/PopEuroBERT-binary-610m)), both trained and evaluated on the same dataset and splits.

  ### Test Set Performance (Threshold = 0.5)

  | Model | Accuracy | Precision | Recall | F1 Score | Loss |
  |--------------------|----------|-----------|--------|----------|--------|
- | **210M** | 75.99% | 73.78% | 80.66% | 77.07% | 0.4959 |
- | **610M (this)** | 80.26% | 78.42% | 83.50% | 80.89% | 0.4631 |
+ | **210M (this)** | 75.99% | 73.78% | 80.66% | 77.07% | 0.4959 |
+ | **610M** | 80.26% | 78.42% | 83.50% | 80.89% | 0.4631 |

  ### Test Set Performance (Optimized Threshold)

  | Model | Threshold | Accuracy | Precision | Recall | F1 Score |
  |--------------------|-----------|----------|-----------|--------|----------|
- | **210M** | 0.56 | 76.00% | 76.00% | 76.00% | 76.00% |
- | **610M (this)** | 0.43 | 79.81% | 76.63% | 85.78% | 80.94% |
+ | **210M (this)** | 0.56 | 76.00% | 76.00% | 76.00% | 76.00% |
+ | **610M** | 0.43 | 79.81% | 76.63% | 85.78% | 80.94% |

- PopEuroBERT-610m consistently outperforms the 210m variant across all metrics. It especially improves recall and F1 score, suggesting better identification of populist speech. The decision threshold (0.43) was tuned for balanced performance.
+ While PopEuroBERT-210m performs well on the populism classification task, its larger variant shows stronger overall performance, especially in F1 score and recall.

  ## Limitations

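
The README reports metrics at a fixed threshold of 0.5 and at an "optimized" threshold, but the hunk does not say how the optimized values (0.56 and 0.43) were selected. The following is a minimal sketch of one common approach, assuming the threshold is chosen by sweeping candidate values and keeping the one with the highest F1 score. The function names `binary_metrics` and `best_threshold`, the candidate grid, and the toy data are all hypothetical and are not taken from the repository.

```python
from typing import Dict, Optional, Sequence


def binary_metrics(
    probs: Sequence[float], labels: Sequence[int], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for the positive class at a threshold."""
    # A probability at or above the threshold counts as a positive prediction.
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for y, yhat in zip(labels, preds) if y == 1 and yhat == 1)
    fp = sum(1 for y, yhat in zip(labels, preds) if y == 0 and yhat == 1)
    fn = sum(1 for y, yhat in zip(labels, preds) if y == 1 and yhat == 0)
    tn = sum(1 for y, yhat in zip(labels, preds) if y == 0 and yhat == 0)

    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(labels) if len(labels) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def best_threshold(
    probs: Sequence[float],
    labels: Sequence[int],
    candidates: Optional[Sequence[float]] = None,
) -> float:
    """Return the candidate threshold with the highest F1 (assumed criterion)."""
    if candidates is None:
        # Hypothetical grid: 0.01, 0.02, ..., 0.99.
        candidates = [i / 100 for i in range(1, 100)]
    # On ties, max() keeps the first (lowest) candidate.
    return max(candidates, key=lambda t: binary_metrics(probs, labels, t)["f1"])


# Toy data for illustration only; these are not model outputs.
toy_probs = [0.9, 0.45, 0.6, 0.2]
toy_labels = [1, 1, 0, 0]
tuned = best_threshold(toy_probs, toy_labels)
print(tuned, binary_metrics(toy_probs, toy_labels, tuned))
```

If a threshold is tuned this way, it would normally be selected on a validation split and then applied unchanged to the test set, so that the reported test metrics are not optimistically biased. Whether the README's thresholds were obtained that way is not stated.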