The distilbert_fearspeech_classifier is a fine-tuned DistilBERT model aimed at identifying fear speech in German-language Telegram posts.
Language(s) (NLP): German

License: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License
**Model Sources**

Paper: ["You are doomed!" Crisis-specific and Dynamic Use of Fear Speech in Protest and Extremist Radical Social Movements](https://doi.org/10.51685/jqd.2024.icwsm.8)
**Direct Uses**

The model is used directly to classify Telegram posts into fear speech (FS) and non-fear speech (no FS) categories. This is particularly useful for researchers studying online radicalization and the dynamics of fear speech in social media.
**Downstream Use**

The model can be fine-tuned for specific tasks related to hate speech detection, communication studies, and social media analysis.
**Out-of-Scope Use**

Users should be aware of the risks, biases, and limitations of the model. Further research and contextual understanding are recommended before using the model for critical decision-making.

**How to Get Started with the Model**
**Use the following code to get started with the model:**

The repository id below is a placeholder; substitute the model's actual Hugging Face Hub id.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Placeholder repository id; replace with the model's actual Hub id.
model_name = "distilbert_fearspeech_classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

text = "..."  # a German-language Telegram post
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
predictions = logits.argmax(dim=-1)  # class index: FS vs. no FS (label order not specified here)
print(predictions)
```
**Training Details**

**Training Data**

The model was trained on a dataset of manually annotated Telegram posts from radical and extremist groups. The dataset includes posts related to six crisis-specific topics: COVID-19, Conspiracy Narratives, Russian Invasion of Ukraine (RioU), Energy Crisis, Inflation, and Migration.
**Training Procedure**

**Preprocessing**

Data cleaning involved removing emojis, numbers, and hyperlinks. Posts shorter than ten characters or longer than 1000 characters were excluded.
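The cleaning steps above can be sketched as follows; the exact regular expressions and emoji ranges are illustrative assumptions, not taken from the paper:

```python
import re
from typing import Optional

MIN_LEN, MAX_LEN = 10, 1000  # character bounds described above

def clean_post(text: str) -> Optional[str]:
    """Apply the described cleaning; return None if the post is filtered out."""
    text = re.sub(r"https?://\S+|www\.\S+", "", text)  # strip hyperlinks
    text = re.sub(r"\d+", "", text)                    # strip numbers
    # Strip emojis (approximated by common Unicode symbol/emoji ranges).
    text = re.sub(r"[\U0001F000-\U0001FAFF\u2600-\u27BF]", "", text)
    text = " ".join(text.split())                      # normalize whitespace
    if len(text) < MIN_LEN or len(text) > MAX_LEN:
        return None
    return text
```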
The evaluation metrics include precision, recall, and F1-score, which are essential for understanding the model's performance in classifying fear speech.
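For a binary FS / no FS labelling, these metrics reduce to counts of true/false positives and negatives; a minimal reference implementation:

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive (FS) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```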
**Results**

| Metric    | Validation Set | Test Set |
|-----------|----------------|----------|
| Precision | 0.82           | 0.79     |
| Recall    | 0.82           | 0.79     |
| F1-Score  | 0.82           | 0.79     |
**Summary**

The model demonstrated robust performance, with balanced precision and recall metrics above 0.76 on both the validation and test sets.