archich committed · Commit 035c615 · verified · 1 Parent(s): d8668f5

Add model card

Files changed (1)
  1. README.md +105 -0
README.md ADDED
@@ -0,0 +1,105 @@
+ ---
+ language:
+ - en
+ - hi
+ license: mit
+ tags:
+ - text-classification
+ - hate-speech-detection
+ - xlm-roberta
+ - multilingual
+ datasets:
+ - hasoc2019
+ metrics:
+ - accuracy
+ - f1
+ pipeline_tag: text-classification
+ widget:
+ - text: "I love everyone in this community!"
+   example_title: "Positive Example"
+ - text: "This person is terrible and should be banned"
+   example_title: "Negative Example"
+ ---
+
+ # Hate Speech Detector (XLM-RoBERTa)
+
+ Multilingual hate speech detection model fine-tuned on the HASOC 2019 dataset.
+
+ ## Model Description
+
+ This model detects hate speech in English and Hindi text, using XLM-RoBERTa base as the backbone.
+
+ **Languages:** English, Hindi
+ **Task:** Binary Text Classification (Hate Speech / Not Hate Speech)
+ **Base Model:** xlm-roberta-base
+
+ ## Intended Uses
+
+ - Content moderation
+ - Social media monitoring
+ - Research purposes
+
+ ## How to Use
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ import torch
+
+ # Load model and tokenizer
+ tokenizer = AutoTokenizer.from_pretrained("archich/hate-speech-detector")
+ model = AutoModelForSequenceClassification.from_pretrained("archich/hate-speech-detector")
+
+ # Example text
+ text = "Your text here"
+
+ # Tokenize (inputs longer than 256 tokens are truncated)
+ inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=256)
+
+ # Predict without tracking gradients
+ with torch.no_grad():
+     outputs = model(**inputs)
+     probs = torch.softmax(outputs.logits, dim=1)
+     prediction = torch.argmax(probs, dim=1).item()
+
+ # Index order follows the label mapping in this card
+ labels = ["NOT_HATE_SPEECH", "HATE_SPEECH"]
+ print(f"Prediction: {labels[prediction]} ({probs[0][prediction].item():.2%} confidence)")
+ ```
+
+ ## Training Data
+
+ Trained on the HASOC 2019 (Hate Speech and Offensive Content Identification) dataset, which contains:
+ - Hindi posts from social media
+ - English posts from social media
+
+ ## Label Mapping
+
+ - `0`: NOT_HATE_SPEECH - Normal, non-offensive content
+ - `1`: HATE_SPEECH - Hateful or offensive content (HOF)
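
The mapping above can be applied to raw logits without any framework. The following is a minimal sketch, assuming the index order matches the `labels` list in the usage example; `ID2LABEL` and `decode_logits` are hypothetical names introduced for illustration, and the logits in the demo are made up.

```python
import math

# Assumed index order, taken from the label mapping in this card.
ID2LABEL = {0: "NOT_HATE_SPEECH", 1: "HATE_SPEECH"}


def decode_logits(logits):
    """Turn a pair of raw logits into (label, probability) via softmax."""
    # Subtract the max before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Pick the index with the highest probability (ties go to the lower index).
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


if __name__ == "__main__":
    label, prob = decode_logits([0.2, 1.7])  # made-up logits for illustration
    print(f"{label} ({prob:.2%})")
```

Keeping the mapping in one dictionary avoids scattering the `0`/`1` convention across calling code.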
+
+ ## Limitations & Ethical Considerations
+
+ ⚠️ **Important Notice:**
+
+ - This model is intended to **assist** human moderators, not replace them
+ - It may reflect biases present in the training data
+ - Context and cultural nuance matter; manual review is recommended
+ - False positives are possible
+ - It should not be the sole decision-maker for content removal
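
One way to keep a human in the loop is to route predictions by confidence instead of acting on them directly. This is only a sketch under stated assumptions: the thresholds, the `Decision` type, and the `triage` function are hypothetical and do not come from this model card, and suitable cut-offs would need to be tuned on your own data.

```python
from dataclasses import dataclass

# Hypothetical thresholds chosen for illustration; tune them on your own data.
REVIEW_THRESHOLD = 0.50
PRIORITY_THRESHOLD = 0.90


@dataclass
class Decision:
    action: str  # "allow", "review", or "priority_review"
    hate_probability: float


def triage(hate_probability: float) -> Decision:
    """Route a post based on the model's HATE_SPEECH probability.

    No branch removes content automatically: flagged posts go to a human.
    """
    if not 0.0 <= hate_probability <= 1.0:
        raise ValueError("hate_probability must be in [0, 1]")
    if hate_probability >= PRIORITY_THRESHOLD:
        return Decision("priority_review", hate_probability)
    if hate_probability >= REVIEW_THRESHOLD:
        return Decision("review", hate_probability)
    return Decision("allow", hate_probability)


if __name__ == "__main__":
    for p in (0.10, 0.60, 0.95):
        print(p, triage(p).action)
```

The design choice here is that the model only orders the review queue; the removal decision stays with a moderator.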
+
+ ## Performance
+
+ Training details and metrics are available in the model files.
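
The card metadata lists accuracy and F1 as metrics but reports no values. For readers who want to evaluate the model on their own labeled data, the sketch below shows how both are computed for binary labels. It assumes `HATE_SPEECH` (`1`) is the positive class, which the card does not state, and the function name `accuracy_and_f1` and the sample labels are hypothetical.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Compute accuracy and binary F1 for two equal-length label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, f1


if __name__ == "__main__":
    # Made-up labels for illustration only; not results for this model.
    acc, f1 = accuracy_and_f1([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
    print(f"accuracy={acc:.2f} f1={f1:.2f}")
```

Reporting F1 alongside accuracy matters when the two classes are imbalanced, since accuracy alone can look high for a model that rarely predicts the positive class.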
+
+ ## Citation
+
+ If you use this model, please cite:
+
+ ```
+ @misc{hate-speech-detector,
+   author = {archich},
+   title = {Multilingual Hate Speech Detector},
+   year = {2024},
+   publisher = {HuggingFace},
+   howpublished = {\url{https://huggingface.co/archich/hate-speech-detector}}
+ }
+ ```