---
title: Multilingual Hate Speech Detector
emoji: 🛡️
colorFrom: red
colorTo: blue
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
short_description: Hate speech detector
models:
- xlm-roberta-base
datasets:
- hate-speech
---
# 🛡️ Multilingual Hate Speech Detector
**An XLM-RoBERTa-based classifier that detects hate speech in English and Serbian text and explains each prediction with word-level highlighting**
## 🔬 Key Innovations
### 1. **Contextual Analysis** 🌈
- **Word-level importance highlighting** using transformer attention weights
- Visual explanation showing which words most influenced the classification decision
- Color-coded highlighting: 🔴 Red (high influence) → 🟠 Orange → 🟡 Yellow → ⚪ Gray (low influence)
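
The color scale above can be sketched as a mapping from per-word importance scores to four buckets. This is a minimal illustration and not the app's actual code: the thresholds in `COLOR_BUCKETS`, the max-normalization step, and the function names are all assumptions, and the scores are assumed to be already aggregated to one value per word.

```python
from typing import List, Tuple

# Hypothetical cutoffs on scores normalized to [0, 1];
# the app's real thresholds are not documented in this README.
COLOR_BUCKETS = [(0.75, "red"), (0.50, "orange"), (0.25, "yellow")]


def normalize(scores: List[float]) -> List[float]:
    """Scale scores to [0, 1] by dividing by the largest score."""
    top = max(scores, default=0.0)
    return [s / top if top > 0 else 0.0 for s in scores]


def color_for(score: float) -> str:
    """Return the first bucket whose threshold the score reaches, else gray."""
    for threshold, color in COLOR_BUCKETS:
        if score >= threshold:
            return color
    return "gray"


def highlight(words: List[str], scores: List[float]) -> List[Tuple[str, str]]:
    """Pair each word with the color of its normalized importance score."""
    if len(words) != len(scores):
        raise ValueError("words and scores must have the same length")
    return [(w, color_for(s)) for w, s in zip(words, normalize(scores))]
```

Calling `highlight(words, scores)` returns `(word, color)` pairs that a UI layer could render as colored spans.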
### 2. **Confidence Visualization** 📊
- Interactive Plotly charts showing model confidence across **all 8 categories**
- Real-time confidence distribution analysis
- Color-coded bars distinguishing hate speech categories from appropriate content
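
As a rough sketch of where the per-category confidences could come from, the snippet below applies a softmax to eight raw classifier scores. The category names come from this README, but their order, the use of softmax, and the function names are assumptions; the deployed model's label mapping may differ.

```python
import math
from typing import Dict, List, Tuple

# Names from this README; the index order is an assumption.
CATEGORIES = [
    "Race", "Sexual Orientation", "Gender", "Physical Appearance",
    "Religion", "Class", "Disability", "Appropriate",
]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_by_category(logits: List[float]) -> Dict[str, float]:
    """Map each category name to its softmax probability."""
    if len(logits) != len(CATEGORIES):
        raise ValueError(f"expected {len(CATEGORIES)} logits")
    return dict(zip(CATEGORIES, softmax(logits)))


def top_prediction(logits: List[float]) -> Tuple[str, float]:
    """Return the most likely category and its confidence."""
    conf = confidence_by_category(logits)
    label = max(conf, key=conf.get)
    return label, conf[label]
```

The dictionary returned by `confidence_by_category` has one entry per bar in the chart, and `top_prediction` gives the headline label and score.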
### 3. **Interactive Feedback System** 💬
- User rating system (1-5 stars) for continuous model improvement
- Feedback collection for enhancing accuracy
- Community-driven model refinement
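
One possible way to record the 1-5 star ratings is to validate each entry and append it to a JSON Lines file. This README does not describe the storage format, so the `Feedback` fields, the `save_feedback` helper, and the file layout below are hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class Feedback:
    """One user rating for one analysis (hypothetical schema)."""
    text: str
    predicted_label: str
    rating: int
    comment: str = ""

    def __post_init__(self) -> None:
        # The README specifies a 1-5 star scale.
        if not isinstance(self.rating, int) or not 1 <= self.rating <= 5:
            raise ValueError("rating must be an integer from 1 to 5")


def save_feedback(entry: Feedback, path: Path) -> None:
    """Append the entry as one JSON object per line, with a UTC timestamp."""
    record = asdict(entry)
    record["timestamp"] = datetime.now(timezone.utc).isoformat()
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, ensure_ascii=False) + "\n")
```

An append-only log keeps writes simple and preserves Serbian text as-is through `ensure_ascii=False`.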
## 📋 Hate Speech Categories
The system assigns each text to one of 8 categories: seven hate speech categories plus **Appropriate** for non-hateful content.
- **Race**: Racial discrimination and slurs
- **Sexual Orientation**: Homophobic content, LGBTQ+ discrimination
- **Gender**: Sexist content, misogyny, gender-based harassment
- **Physical Appearance**: Body shaming, lookism, appearance-based harassment
- **Religion**: Religious discrimination, Islamophobia, antisemitism
- **Class**: Classist content, economic discrimination
- **Disability**: Ableist content, discrimination against disabled people
- **Appropriate**: Non-hateful, normal conversation
## 🌍 Multilingual Support
- **English**: Comprehensive hate speech detection
- **Serbian**: Native Serbian language support with Cyrillic and Latin scripts
- **Cross-lingual**: XLM-RoBERTa architecture enables robust multilingual understanding
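
Since Serbian input may arrive in either script, a small helper that reports which script a text uses can be handy for logging or evaluation. The sketch below is illustrative only: the README does not say the app performs any script detection, and `detect_script` is a hypothetical name.

```python
import unicodedata


def detect_script(text: str) -> str:
    """Classify text as 'cyrillic', 'latin', 'mixed', or 'unknown'.

    Only alphabetic characters are counted; digits, spaces, and
    punctuation are ignored.
    """
    cyrillic = latin = 0
    for ch in text:
        if not ch.isalpha():
            continue
        name = unicodedata.name(ch, "")
        if name.startswith("CYRILLIC"):
            cyrillic += 1
        elif name.startswith("LATIN"):
            latin += 1
    if cyrillic and latin:
        return "mixed"
    if cyrillic:
        return "cyrillic"
    if latin:
        return "latin"
    return "unknown"
```

For example, `detect_script("Ovaj film je bio odličan")` reports Latin script, while the same sentence written as "Овај филм је био одличан" reports Cyrillic.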
## 🔧 Technical Architecture
- **Base Model**: XLM-RoBERTa (Cross-lingual Language Model)
- **Training**: Fine-tuned on multilingual hate speech datasets
- **Attention Mechanism**: Transformer attention weights for explainable AI
- **Real-time Processing**: Optimized for instant classification
- **GPU Acceleration**: CUDA support for faster inference
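
A common pattern for optional GPU acceleration is to choose the device once at startup and fall back to the CPU when CUDA is unavailable. The helper below is a sketch under that assumption; `pick_device` is a hypothetical name, and the fallback for a missing PyTorch install is there only so the snippet runs anywhere.

```python
def pick_device() -> str:
    """Return 'cuda' when PyTorch can see a CUDA GPU, otherwise 'cpu'."""
    try:
        import torch  # imported lazily so the sketch runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Running inference on: {pick_device()}")
```

The returned string can be passed to `model.to(...)` and to the input tensors so that both live on the same device.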
## 🚀 How to Use
1. **Input Text**: Enter any text in English or Serbian
2. **Analyze**: Click "Analyze Text" for instant classification
3. **Review Results**: See category prediction with confidence score
4. **Examine Context**: Check word-level highlighting to understand the decision
5. **View Confidence**: Analyze the confidence distribution chart
6. **Provide Feedback**: Rate the analysis to help improve the model
## 🎯 Example Analyses
### Appropriate Content
```
"I really enjoyed that movie last night! Great acting and storyline."
→ ✅ Appropriate (95% confidence)
```
### Hate Speech Detection
```
"You people are all the same, always causing problems everywhere."
→ ⚠️ Race (87% confidence)
```
### Serbian Language
```
"Ovaj film je bio odličan, preporučujem svima!"
→ ✅ Appropriate (92% confidence)
```
## ⚑ Performance
- **Predictions**: Each prediction includes a confidence score and a word-level explanation
- **Speed**: Real-time processing (< 2 seconds per analysis)
- **Languages**: English and Serbian with cross-lingual capabilities
- **Explainability**: Visual attention analysis for transparent decisions
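
To check the "< 2 seconds per analysis" figure on your own hardware, a call can be wrapped in a simple timer. The snippet is a sketch: `timed` and `dummy_classify` are hypothetical names, and the stand-in classifier should be replaced with the real prediction function.

```python
import time
from typing import Any, Callable, Tuple

LATENCY_BUDGET_S = 2.0  # target stated in this README


def timed(fn: Callable[[str], Any], text: str) -> Tuple[Any, float]:
    """Run fn(text) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(text)
    return result, time.perf_counter() - start


def dummy_classify(text: str) -> str:
    """Stand-in for the real model call; always returns 'Appropriate'."""
    return "Appropriate"


if __name__ == "__main__":
    label, elapsed = timed(dummy_classify, "I really enjoyed that movie!")
    status = "within" if elapsed < LATENCY_BUDGET_S else "over"
    print(f"{label} in {elapsed:.4f}s ({status} budget)")
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock suited to measuring short intervals.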
## 🛠️ Local Development
```bash
# Clone the repository
git clone <repository-url>
cd hate-speech-detector
# Install dependencies
pip install -r requirements.txt
# Run the application
python app.py
```
## 📝 Research & Education
This AI system is designed for:
- **Research purposes**: Understanding hate speech patterns
- **Educational use**: Learning about AI explainability
- **Content moderation**: Assisting human moderators
- **Linguistic analysis**: Cross-lingual hate speech research
## ⚠️ Important Notes
- Results should be interpreted carefully
- Human judgment should always be applied for critical decisions
- The system is designed to assist, not replace, human moderation
- User feedback is collected to support continued improvement of the model
## 🀝 Contributing
We welcome feedback and contributions! Please use the interactive feedback system within the application to help improve model accuracy.
## 📄 License
MIT License - See LICENSE file for details
---
**⚑ Powered by**: Transformer Neural Networks | **🌍 Languages**: English, Serbian | **🎯 Focus**: Explainable AI