archich/hate-speech-detector

archich committed on Nov 11, 2025

Commit 035c615 · verified · 1 Parent(s): d8668f5

Add model card

Files changed (1)

README.md +105 -0

README.md ADDED

	@@ -0,0 +1,105 @@

+---
+language:
+- en
+- hi
+license: mit
+tags:
+- text-classification
+- hate-speech-detection
+- xlm-roberta
+- multilingual
+datasets:
+- hasoc2019
+metrics:
+- accuracy
+- f1
+pipeline_tag: text-classification
+widget:
+- text: "I love everyone in this community!"
+  example_title: "Positive Example"
+- text: "This person is terrible and should be banned"
+  example_title: "Negative Example"
+---
+# Hate Speech Detector (XLM-RoBERTa)
+Multilingual hate speech detection model fine-tuned on the HASOC 2019 dataset.
+## Model Description
+This model detects hate speech in English and Hindi text using XLM-RoBERTa base as the backbone.
+**Languages:** English, Hindi
+**Task:** Binary Text Classification (Hate Speech / Not Hate Speech)
+**Base Model:** xlm-roberta-base
+## Intended Uses
+- Content moderation
+- Social media monitoring
+- Research purposes
+## How to Use
+```python
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+import torch
+# Load model and tokenizer
+tokenizer = AutoTokenizer.from_pretrained("archich/hate-speech-detector")
+model = AutoModelForSequenceClassification.from_pretrained("archich/hate-speech-detector")
+# Example text
+text = "Your text here"
+# Tokenize
+inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=256)
+# Predict
+with torch.no_grad():
+    outputs = model(**inputs)
+    probs = torch.softmax(outputs.logits, dim=1)
+    prediction = torch.argmax(probs, dim=1).item()
+# Index order follows the Label Mapping section: 0 = NOT_HATE_SPEECH, 1 = HATE_SPEECH
+labels = ["NOT_HATE_SPEECH", "HATE_SPEECH"]
+print(f"Prediction: {labels[prediction]} ({probs[0][prediction].item():.2%} confidence)")
+```
+## Training Data
+Trained on the HASOC 2019 (Hate Speech and Offensive Content Identification) dataset, which contains:
+- Hindi posts from social media
+- English posts from social media
+## Label Mapping
+- `0`: NOT_HATE_SPEECH - Normal, non-offensive content
+- `1`: HATE_SPEECH - Hateful or offensive content (HOF)
+## Limitations & Ethical Considerations
+⚠️ **Important Notice:**
+- This model is intended to **assist** human moderators, not replace them
+- It may reflect biases present in the training data
+- Context and cultural nuance matter, so manual review is recommended
+- False positives and false negatives are possible
+- It should not be the sole decision-maker for content removal
+## Performance
+Training details and metrics are available in the model files.
+## Citation
+If you use this model, please cite:
+```bibtex
+@misc{hate-speech-detector,
+  author = {archich},
+  title = {Multilingual Hate Speech Detector},
+  year = {2024},
+  publisher = {HuggingFace},
+  howpublished = {\url{https://huggingface.co/archich/hate-speech-detector}}
+}
+```
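
Note on the README above: its Limitations section says the model should assist human moderators and should not be the sole decision-maker for content removal. One way to apply that to the output of the usage snippet is a small triage step that flags predictions for manual review. The sketch below is not part of the commit. The `triage` function, the `Decision` type, and the `0.90` threshold are hypothetical names and values chosen for illustration, and the label order is assumed to match the README's Label Mapping section.

```python
from dataclasses import dataclass
from typing import Sequence

# Assumed index order, taken from the README's Label Mapping section.
LABELS = ["NOT_HATE_SPEECH", "HATE_SPEECH"]


@dataclass
class Decision:
    label: str
    confidence: float
    needs_human_review: bool


def triage(probs: Sequence[float], review_threshold: float = 0.90) -> Decision:
    """Turn two class probabilities into a label plus a review flag.

    `review_threshold` is an illustrative value, not one tuned on HASOC 2019.
    """
    if len(probs) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} probabilities, got {len(probs)}")

    # Pick the most probable class (same result as argmax over the softmax output).
    idx = max(range(len(probs)), key=lambda i: probs[i])
    label = LABELS[idx]
    confidence = float(probs[idx])

    # Send every HATE_SPEECH prediction and every low-confidence
    # prediction to a human moderator instead of acting automatically.
    needs_review = label == "HATE_SPEECH" or confidence < review_threshold
    return Decision(label, confidence, needs_review)
```

With the README's snippet, this could be called as `triage(probs[0].tolist())`, since `probs` there has shape `(1, 2)`. The threshold should be chosen from validation data rather than left at the placeholder value.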