Update README.md
README.md
This model is based on FacebookAI/xlm-roberta-large and was trained in a two-step …
### Training Data
The model was trained on two datasets, both derived from the data in partypress/partypress-multilingual. The first dataset was weakly labeled using GPT-4o: the [prompt](https://huggingface.co/Sami92/XLM-R-Large-PartyPress/blob/main/FinalPromptPartyPress.txt) contained the label descriptions taken from [Erfort et al. (2023)](https://journals.sagepub.com/doi/10.1177/20531680231183512). This weakly labeled dataset contains 32,060 press releases.
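The weak-labeling step can be sketched roughly as follows, assuming the OpenAI Python client. The function names and message layout here are illustrative assumptions, not taken from this model card; the actual prompt text lives in FinalPromptPartyPress.txt.

```python
# Illustrative sketch of weak labeling with GPT-4o. The helper names and
# the system/user message split are assumptions, not the card's method.

def build_messages(label_descriptions: str, press_release: str) -> list[dict]:
    """Combine the category descriptions (the prompt) with one press release."""
    return [
        {"role": "system", "content": label_descriptions},
        {"role": "user", "content": press_release},
    ]

def weakly_label(label_descriptions: str, press_release: str,
                 model: str = "gpt-4o") -> str:
    """Ask GPT-4o for an issue category; requires an OPENAI_API_KEY."""
    from openai import OpenAI  # imported lazily so build_messages works without it
    client = OpenAI()
    resp = client.chat.completions.create(
        model=model,
        messages=build_messages(label_descriptions, press_release),
        temperature=0,  # deterministic labels for a reproducible dataset
    )
    return resp.choices[0].message.content
```

Running `weakly_label` over each press release would yield one GPT-4o label per example, giving the 32,060-example weakly labeled set.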
The second dataset is the human-annotated dataset used to train partypress/partypress-multilingual. For training, only the single-coded examples were used (24,117). Evaluation was performed on the examples annotated by two human coders each (3,121).
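The train/eval split described above can be sketched as below; the record layout and the `n_coders` field are hypothetical, since the card does not state the actual column names of partypress/partypress-multilingual.

```python
# Hypothetical split of the human-annotated data: single-coded examples go to
# training, examples coded by two annotators go to evaluation. The "n_coders"
# field name is an assumption, not the dataset's real schema.

def split_by_coders(examples: list[dict]) -> tuple[list[dict], list[dict]]:
    train = [ex for ex in examples if ex["n_coders"] == 1]
    evaluation = [ex for ex in examples if ex["n_coders"] >= 2]
    return train, evaluation
```

Applied to the full annotated corpus, this kind of filter would reproduce the 24,117/3,121 train/evaluation split reported here.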