---
language:
- en
- hi
license: mit
tags:
- text-classification
- hate-speech-detection
- xlm-roberta
- multilingual
datasets:
- hasoc2019
metrics:
- accuracy
- f1
pipeline_tag: text-classification
widget:
- text: I love everyone in this community!
  example_title: Positive Example
- text: This person is terrible and should be banned
  example_title: Negative Example
---

# Hate Speech Detector (XLM-RoBERTa)

A multilingual hate speech detection model fine-tuned on the HASOC 2019 dataset.

## Model Description

This model detects hate speech in English and Hindi text, using XLM-RoBERTa base as its backbone.

**Languages:** English, Hindi

**Task:** Binary Text Classification (Hate Speech / Not Hate Speech)

**Base Model:** xlm-roberta-base

## Intended Uses

- Content moderation
- Social media monitoring
- Research purposes

## How to Use

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("archich/hate-speech-detector")
model = AutoModelForSequenceClassification.from_pretrained("archich/hate-speech-detector")

# Example text
text = "Your text here"

# Tokenize
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=256)

# Predict
with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)
    prediction = torch.argmax(probs, dim=1).item()

labels = ["NOT_HATE_SPEECH", "HATE_SPEECH"]
print(f"Prediction: {labels[prediction]} ({probs[0][prediction].item():.2%} confidence)")
```

## Training Data

Trained on the HASOC 2019 (Hate Speech and Offensive Content Identification) dataset, which contains:

- Hindi posts from social media
- English posts from social media

## Label Mapping

- `0`: NOT_HATE_SPEECH - Normal, non-offensive content
- `1`: HATE_SPEECH - Hateful or offensive content (HOF)
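
The ids above are positions in the model's output logits: softmax the logits, take the argmax, and look the index up in the label list. A minimal, self-contained sketch of that mapping (the logit values here are dummy stand-ins for real model output, not actual predictions):

```python
import math

labels = ["NOT_HATE_SPEECH", "HATE_SPEECH"]

def logits_to_label(logits):
    """Softmax the raw logits, then pick the highest-probability label."""
    exps = [math.exp(x - max(logits)) for x in logits]  # subtract max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    idx = max(range(len(probs)), key=probs.__getitem__)  # argmax
    return labels[idx], probs[idx]

# Dummy logits standing in for model(**inputs).logits[0]
label, confidence = logits_to_label([2.1, -1.3])
print(label, f"{confidence:.2%}")
```

In practice you would pass `outputs.logits[0].tolist()` from the code above instead of dummy values.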

## Limitations & Ethical Considerations

⚠️ **Important Notice:**

- This model is intended to **assist** human moderators, not replace them
- It may reflect biases present in the training data
- Context and cultural nuances matter; manual review is recommended
- False positives (and false negatives) are possible
- It should not be the sole decision-maker for content removal
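
One common way to keep a human in the loop is to act automatically only on high-confidence predictions and queue everything else for manual review. A minimal sketch of such routing (the 0.9 threshold and the action names are illustrative assumptions, not part of this model):

```python
REVIEW_THRESHOLD = 0.9  # illustrative cutoff; tune on validation data

def route_prediction(label: str, confidence: float) -> str:
    """Auto-flag only confident HATE_SPEECH calls; send borderline ones to humans."""
    if label == "HATE_SPEECH" and confidence >= REVIEW_THRESHOLD:
        return "flag_for_removal"
    if label == "HATE_SPEECH":
        return "manual_review"  # predicted hateful, but not confidently
    return "allow"

print(route_prediction("HATE_SPEECH", 0.97))      # flag_for_removal
print(route_prediction("HATE_SPEECH", 0.62))      # manual_review
print(route_prediction("NOT_HATE_SPEECH", 0.99))  # allow
```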

## Performance

Training details and metrics are available in the model files.

## Citation

If you use this model, please cite:

```bibtex
@misc{hate-speech-detector,
  author = {archich},
  title = {Multilingual Hate Speech Detector},
  year = {2024},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/archich/hate-speech-detector}}
}
```