The distilbert_fearspeech_classifier is a fine-tuned DistilBERT model aimed at identifying fear speech in German-language Telegram posts.
Language(s) (NLP): German

License: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License
**Model Sources**

Paper: ["You are doomed!" Crisis-specific and Dynamic Use of Fear Speech in Protest and Extremist Radical Social Movements](https://doi.org/10.51685/jqd.2024.icwsm.8)
**Direct Uses**

The model is used directly to classify Telegram posts into fear speech (FS) and non-fear speech (no FS) categories. This is particularly useful for researchers studying online radicalization and the dynamics of fear speech in social media.
**Downstream Use**

The model can be fine-tuned for specific tasks related to hate speech detection, communication studies, and social media analysis.
**Out-of-Scope Use**

Users should be aware of the risks, biases, and limitations of the model. Further research and contextual understanding are recommended before using the model for critical decision-making.

**How to Get Started with the Model**
**Use the following code to get started with the model:**

The repository id below is a placeholder; substitute the model's actual Hugging Face Hub id.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Placeholder repository id; replace with the model's actual Hub id.
model_name = "distilbert_fearspeech_classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

text = "..."  # a German-language Telegram post
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
predictions = logits.argmax(dim=-1)  # class index: FS vs. no FS (label order not specified here)
print(predictions)
```
**Training Details**

**Training Data**

The model was trained on a dataset of manually annotated Telegram posts from radical and extremist groups. The dataset includes posts related to six crisis-specific topics: COVID-19, Conspiracy Narratives, Russian Invasion of Ukraine (RioU), Energy Crisis, Inflation, and Migration.
**Training Procedure**

**Preprocessing**

Data cleaning involved removing emojis, numbers, and hyperlinks. Posts shorter than ten characters or longer than 1000 characters were excluded.
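The cleaning steps above can be sketched as follows; the exact regular expressions and emoji ranges are illustrative assumptions, not taken from the paper:

```python
import re
from typing import Optional

MIN_LEN, MAX_LEN = 10, 1000  # character bounds described above

def clean_post(text: str) -> Optional[str]:
    """Apply the described cleaning; return None if the post is filtered out."""
    text = re.sub(r"https?://\S+|www\.\S+", "", text)  # strip hyperlinks
    text = re.sub(r"\d+", "", text)                    # strip numbers
    # Strip emojis (approximated by common Unicode symbol/emoji ranges).
    text = re.sub(r"[\U0001F000-\U0001FAFF\u2600-\u27BF]", "", text)
    text = " ".join(text.split())                      # normalize whitespace
    if len(text) < MIN_LEN or len(text) > MAX_LEN:
        return None
    return text
```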
The evaluation metrics include precision, recall, and F1-score, which are essential for understanding the model's performance in classifying fear speech.
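For a binary FS / no FS labelling, these metrics reduce to counts of true/false positives and negatives; a minimal reference implementation:

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive (FS) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```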
**Results**

| Metric    | Validation Set | Test Set |
|-----------|----------------|----------|
| Precision | 0.82           | 0.79     |
| Recall    | 0.82           | 0.79     |
| F1-Score  | 0.82           | 0.79     |
**Summary**

The model demonstrated robust performance, with balanced precision and recall metrics above 0.76 on both the validation and test sets.