Update README.md
README.md
CHANGED
@@ -86,9 +86,10 @@ Predicted class: Populist (Confidence: 0.90)

 ## Training Data

-- **Dataset:** PopBERT
+- **Dataset:** [PopBERT](https://github.com/luerhard/PopBERT)
+- Sentence-level annotated German Bundestag speeches
+- `train/test: 7017/1758`
 - **Preprocessing:**
-- Removed duplicates.
 - Converted labels to binary format (`populist = 1`, `neutral = 0`).
 - Tokenized using **EuroBERT tokenizer** with a max length of `256` tokens.

@@ -108,9 +109,9 @@ Predicted class: Populist (Confidence: 0.90)
 | Weight Decay | `0.0` |
 | Gradient Accumulation | `2` |
 | Warmup Ratio | `0.1` |
-| Epochs | `
+| Epochs | `2` |
 | Batch Size | `16` |
-| Max Length | `
+| Max Length | `256` |

 - **Mixed Precision (fp16):** Used for efficiency on GPU.

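For readers who want to see what the updated hyperparameters imply for the training schedule, here is a minimal sketch in plain Python. It uses only values stated in the README change (7017 training examples, batch size `16`, gradient accumulation `2`, `2` epochs, warmup ratio `0.1`). The helper `schedule_summary` is hypothetical and not part of the model's training code. It assumes a single device and that a final partial batch still counts as an optimizer step, so a real trainer may round slightly differently.

```python
import math

# Values copied from the README change; nothing here is measured.
TRAIN_EXAMPLES = 7017
BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 2
EPOCHS = 2
WARMUP_RATIO = 0.1


def schedule_summary(n_examples, batch_size, grad_accum, epochs, warmup_ratio):
    """Estimate optimizer-step counts for a single-device run (hypothetical helper)."""
    # Gradients from `grad_accum` batches are summed before each optimizer step.
    effective_batch = batch_size * grad_accum
    # Assumption: a trailing partial batch still triggers an optimizer step.
    updates_per_epoch = math.ceil(n_examples / effective_batch)
    total_updates = updates_per_epoch * epochs
    # Number of optimizer steps spent ramping the learning rate up.
    warmup_updates = round(total_updates * warmup_ratio)
    return {
        "effective_batch": effective_batch,
        "updates_per_epoch": updates_per_epoch,
        "total_updates": total_updates,
        "warmup_updates": warmup_updates,
    }


summary = schedule_summary(
    TRAIN_EXAMPLES, BATCH_SIZE, GRAD_ACCUM_STEPS, EPOCHS, WARMUP_RATIO
)
print(summary)
```

Under these assumptions the effective batch size is 32, which gives about 220 optimizer steps per epoch, 440 in total, and 44 warmup steps. Treat these as estimates of the schedule, not as logged values from the actual run.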