Multilingual Hate Speech Detection — GloVe + BiLSTM (v2)
- **Task:** Binary text classification (Hate / Non-Hate)
- **Languages:** English, Hindi, Hinglish (Hindi-English code-mixed)
- **Architecture:** Bidirectional LSTM with frozen GloVe embeddings
- **Strategy:** Hinglish → Hindi → English → Full (50 epochs per phase, 200 total)
Table of Contents
- What This Experiment Does
- The Dataset
- Model Architecture
- Training Strategy
- Phase 1 — Hinglish
- Phase 2 — Hindi
- Phase 3 — English
- Phase 4 — Full Dataset
- Full Results Table
- How to Use
1. What This Experiment Does
This is v2 of the SASC sequential transfer learning experiment.
v1 ran all 6 permutations of [English, Hindi, Hinglish] with 8 epochs per phase. v2 focuses on a single strategy — Hinglish → Hindi → English → Full — but trains for 50 epochs per phase (200 total). The key new addition: after every phase the model is evaluated on all three individual language test sets AND the full test set, giving a complete 4×4 cross-evaluation matrix showing how knowledge transfers across languages.
2. The Dataset
Dataset: tuklu/nprism
**Splits**

| Split | Samples |
|---|---|
| Train | 17,704 |
| Validation | 2,950 |
| Test | 8,852 |
| Total | 29,506 |

**Language distribution**

| Language | Count | % |
|---|---|---|
| English | 14,994 | 50.8% |
| Hindi | 9,738 | 33.0% |
| Hinglish | 4,774 | 16.2% |

**Label distribution**

| Label | Count | % |
|---|---|---|
| Non-Hate (0) | 15,799 | 53.5% |
| Hate (1) | 13,707 | 46.5% |
The dataset is dominated by English (50.8%). GloVe embeddings are also English-centric, which directly explains why the English phase produces the sharpest accuracy jump regardless of training order.
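The three breakdowns above can be cross-checked against each other; a quick sanity check (all numbers taken from the tables) shows that the split, language, and label counts sum to the same total:

```python
# Counts copied from the dataset tables above.
splits = {"train": 17_704, "validation": 2_950, "test": 8_852}
languages = {"english": 14_994, "hindi": 9_738, "hinglish": 4_774}
labels = {"non_hate": 15_799, "hate": 13_707}

# All three breakdowns partition the same 29,506 samples.
totals = {sum(splits.values()), sum(languages.values()), sum(labels.values())}
assert totals == {29_506}

# Language percentages, matching the table to one decimal place.
pct = {k: round(100 * v / 29_506, 1) for k, v in languages.items()}
# pct == {"english": 50.8, "hindi": 33.0, "hinglish": 16.2}
```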
3. Model Architecture
```text
Input: Text sequence (max 100 tokens)
        ↓
GloVe Embedding Layer (vocab: 50,000 × 300d) — FROZEN
        ↓
Bidirectional LSTM (128 units)
        → reads sentence left-to-right AND right-to-left
        ↓
Dropout (0.5)
        ↓
Dense Layer (64 neurons, ReLU)
        ↓
Output Layer (1 neuron, Sigmoid)
        → > 0.5 = Hate Speech | ≤ 0.5 = Non-Hate
```
- Optimizer: Adam
- Loss: Binary Cross-Entropy
- Max sequence length: 100 tokens
- Vocab size: 50,000
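The stack above maps directly onto a Keras `Sequential` model. Here is a minimal sketch, assuming the shapes stated above; a random placeholder matrix stands in for the real GloVe vectors, which you would load from a GloVe file in practice:

```python
import numpy as np
import tensorflow as tf

VOCAB_SIZE, EMB_DIM, MAX_LEN = 50_000, 300, 100

# Placeholder standing in for the real 50,000 x 300 GloVe matrix.
embedding_matrix = np.random.normal(size=(VOCAB_SIZE, EMB_DIM)).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(MAX_LEN,), dtype="int32"),
    tf.keras.layers.Embedding(VOCAB_SIZE, EMB_DIM, trainable=False),  # frozen
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128)),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
# Load the (placeholder) GloVe vectors into the frozen embedding layer.
model.layers[0].set_weights([embedding_matrix])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```

Freezing the embeddings keeps the GloVe geometry intact: only the LSTM and dense layers learn during all four phases.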
4. Training Strategy
| Phase | Training Data | Epochs | Batch Size | Samples |
|---|---|---|---|---|
| 1 — Hinglish | Hinglish subset | 50 | 32 | ~2,908 |
| 2 — Hindi | Hindi subset | 50 | 32 | ~5,940 |
| 3 — English | English subset | 50 | 32 | ~8,856 |
| 4 — Full | All shuffled | 50 | 64 | 17,704 |
The same model carries its weights through all 4 phases — no resets between languages. After each phase the model is evaluated against all three language-specific test sets and the full test set.
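The weight carry-over can be illustrated with a runnable toy loop. In this sketch a tiny dense model on synthetic data stands in for the BiLSTM and the nprism language subsets; the phase schedule mirrors the table above with epochs shrunk for the demo:

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)

# Tiny stand-in model; the real experiment uses the BiLSTM described above.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

def make_phase(n):
    """Synthetic (features, labels) standing in for a language subset."""
    return (rng.normal(size=(n, 4)).astype("float32"),
            rng.integers(0, 2, size=n).astype("float32"))

# (name, epochs, batch_size) mirrors the strategy table; epochs reduced here.
phases = [("hinglish", 1, 32), ("hindi", 1, 32), ("english", 1, 32), ("full", 1, 64)]

results = {}
for name, epochs, batch_size in phases:
    X, y = make_phase(256)
    # No re-initialisation between phases: fit() resumes from current weights.
    model.fit(X, y, epochs=epochs, batch_size=batch_size, verbose=0)
    # After each phase, evaluate (here on the phase's own data for brevity;
    # the real run evaluates on all four test sets).
    results[name] = model.evaluate(X, y, verbose=0)
```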
5. Phase 1 — Hinglish
Training on Hinglish only (2,908 samples, 50 epochs). The model starts cold. Hinglish is code-mixed and GloVe has limited coverage — the model learns from sequential patterns rather than word semantics.
Evaluation after Phase 1
| Eval On | Accuracy | Balanced Acc | Precision | Recall | Specificity | F1 | ROC-AUC |
|---|---|---|---|---|---|---|---|
| Hinglish | 0.6688 | 0.6378 | 0.6058 | 0.4848 | 0.7908 | 0.5386 | 0.6579 |
| Hindi | 0.4493 | 0.5000 | 0.4493 | 1.0000 | 0.0000 | 0.6200 | 0.5234 |
| English | 0.5171 | 0.5125 | 0.5738 | 0.0916 | 0.9334 | 0.1580 | 0.5620 |
| Full | 0.5190 | 0.5133 | 0.4803 | 0.4331 | 0.5935 | 0.4555 | 0.5243 |
The Hindi result (Recall=1.0, Specificity=0.0) shows the model predicts everything as hate on Hindi — it has no Hindi-specific knowledge yet. English performance is near-random. Hinglish F1=0.539 shows the model has learned something useful from its own language.
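All of the table's metrics derive from a binary confusion matrix. A small helper (hypothetical, not from the repo) reproduces the degenerate Hindi case, where an all-hate predictor scores Recall=1.0, Specificity=0.0, and Balanced Acc=0.5:

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Confusion-matrix metrics matching the evaluation tables."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    recall = tp / (tp + fn) if tp + fn else 0.0        # sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "balanced_acc": (recall + specificity) / 2,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }

# A model that predicts "hate" for everything (the post-Phase-1 Hindi case):
m = binary_metrics([1, 0, 1, 0, 0], [1] * 5)
# m["recall"] == 1.0, m["specificity"] == 0.0, m["balanced_acc"] == 0.5
```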
6. Phase 2 — Hindi
Training on Hindi (5,940 samples, 50 epochs). GloVe has limited Hindi coverage so the model must rely on contextual patterns. The struggle here is deliberate — it builds language-agnostic hate detection features.
Evaluation after Phase 2
| Eval On | Accuracy | Balanced Acc | Precision | Recall | Specificity | F1 | ROC-AUC |
|---|---|---|---|---|---|---|---|
| Hinglish | 0.5409 | 0.4885 | 0.3761 | 0.2299 | 0.7470 | 0.2854 | 0.4771 |
| Hindi | 0.5834 | 0.5730 | 0.5420 | 0.4705 | 0.6756 | 0.5037 | 0.5949 |
| English | 0.4711 | 0.4744 | 0.4789 | 0.7878 | 0.1611 | 0.5957 | 0.4292 |
| Full | 0.5190 | 0.5251 | 0.4859 | 0.6111 | 0.4390 | 0.5414 | 0.5255 |
Hindi F1 improves to 0.504. Hinglish drops — the model has partially overwritten Hinglish-specific patterns. English recall spikes (high false positives) showing the model is now biased toward predicting hate. This is the expected "catastrophic interference" that the Full phase resolves.
7. Phase 3 — English
Training on English (8,856 samples, 50 epochs). This is the turning point. GloVe embeddings align well with English — the model jumps sharply and the English-phase knowledge partially generalises back to the other languages.
Evaluation after Phase 3
| Eval On | Accuracy | Balanced Acc | Precision | Recall | Specificity | F1 | ROC-AUC |
|---|---|---|---|---|---|---|---|
| Hinglish | 0.4115 | 0.4938 | 0.3955 | 0.9002 | 0.0875 | 0.5495 | 0.4572 |
| Hindi | 0.5424 | 0.5399 | 0.4912 | 0.5150 | 0.5648 | 0.5028 | 0.5377 |
| English | 0.7721 | 0.7726 | 0.7453 | 0.8190 | 0.7262 | 0.7804 | 0.8458 |
| Full | 0.6395 | 0.6458 | 0.5901 | 0.7337 | 0.5578 | 0.6541 | 0.6913 |
English F1 leaps to 0.780 — the model now performs strongly on its native language. Full AUC reaches 0.691. Hinglish specificity collapses again (high recall, low precision) — the model over-predicts hate on unseen languages after English fine-tuning.
8. Phase 4 — Full Dataset
Training on the full shuffled dataset (17,704 samples, 50 epochs). This consolidation phase exposes the model to all three languages simultaneously, balancing out the per-language biases accumulated during sequential training.
Evaluation after Phase 4 (Final Model)
| Eval On | Accuracy | Balanced Acc | Precision | Recall | Specificity | F1 | ROC-AUC |
|---|---|---|---|---|---|---|---|
| Hinglish | 0.6326 | 0.6101 | 0.5426 | 0.4991 | 0.7210 | 0.5200 | 0.6161 |
| Hindi | 0.5748 | 0.5676 | 0.5286 | 0.4958 | 0.6393 | 0.5117 | 0.5941 |
| English | 0.7747 | 0.7746 | 0.7747 | 0.7678 | 0.7815 | 0.7712 | 0.8476 |
| Full | 0.6866 | 0.6839 | 0.6687 | 0.6449 | 0.7228 | 0.6566 | 0.7556 |
The Full phase restores balance across all languages. Hinglish specificity recovers to 0.721 (from 0.088 after English phase). Full-dataset AUC reaches 0.756 — the best of all phases. English performance is preserved at F1=0.771 while Hinglish and Hindi both improve substantially from their post-English-phase collapse.
9. Full Results Table
Complete 16-row cross-evaluation (Phase × Eval Language):
| Phase | Eval On | Accuracy | Balanced Acc | Precision | Recall | Specificity | F1 | ROC-AUC |
|---|---|---|---|---|---|---|---|---|
| Hinglish | Hinglish | 0.6688 | 0.6378 | 0.6058 | 0.4848 | 0.7908 | 0.5386 | 0.6579 |
| Hinglish | Hindi | 0.4493 | 0.5000 | 0.4493 | 1.0000 | 0.0000 | 0.6200 | 0.5234 |
| Hinglish | English | 0.5171 | 0.5125 | 0.5738 | 0.0916 | 0.9334 | 0.1580 | 0.5620 |
| Hinglish | Full | 0.5190 | 0.5133 | 0.4803 | 0.4331 | 0.5935 | 0.4555 | 0.5243 |
| Hindi | Hinglish | 0.5409 | 0.4885 | 0.3761 | 0.2299 | 0.7470 | 0.2854 | 0.4771 |
| Hindi | Hindi | 0.5834 | 0.5730 | 0.5420 | 0.4705 | 0.6756 | 0.5037 | 0.5949 |
| Hindi | English | 0.4711 | 0.4744 | 0.4789 | 0.7878 | 0.1611 | 0.5957 | 0.4292 |
| Hindi | Full | 0.5190 | 0.5251 | 0.4859 | 0.6111 | 0.4390 | 0.5414 | 0.5255 |
| English | Hinglish | 0.4115 | 0.4938 | 0.3955 | 0.9002 | 0.0875 | 0.5495 | 0.4572 |
| English | Hindi | 0.5424 | 0.5399 | 0.4912 | 0.5150 | 0.5648 | 0.5028 | 0.5377 |
| English | English | 0.7721 | 0.7726 | 0.7453 | 0.8190 | 0.7262 | 0.7804 | 0.8458 |
| English | Full | 0.6395 | 0.6458 | 0.5901 | 0.7337 | 0.5578 | 0.6541 | 0.6913 |
| Full | Hinglish | 0.6326 | 0.6101 | 0.5426 | 0.4991 | 0.7210 | 0.5200 | 0.6161 |
| Full | Hindi | 0.5748 | 0.5676 | 0.5286 | 0.4958 | 0.6393 | 0.5117 | 0.5941 |
| Full | English | 0.7747 | 0.7746 | 0.7747 | 0.7678 | 0.7815 | 0.7712 | 0.8476 |
| Full | Full | 0.6866 | 0.6839 | 0.6687 | 0.6449 | 0.7228 | 0.6566 | 0.7556 |
Key Observations
- English phase is the sharpest turning point — English F1 jumps from 0.596 (after Hindi) to 0.780 in one phase, driven by GloVe's English-centric embeddings.
- Starting from Hinglish forces generalisation from noise — the model reaches Hinglish F1=0.539 after only its own phase, a stronger start than Hinglish gets in most v1 orderings.
- Catastrophic interference is visible — Hinglish specificity drops from 0.791 → 0.747 → 0.088 as the model progressively shifts language bias. The Full phase restores it to 0.721.
- Final Full phase AUC = 0.756 matches the best v1 strategies despite a harder starting language, confirming the robustness of the Hinglish-first approach with deeper training.
- Hindi remains the hardest (F1=0.512 at final) — consistent with GloVe's limited Hindi vocabulary coverage.
10. How to Use
```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.text import tokenizer_from_json
from tensorflow.keras.preprocessing.sequence import pad_sequences
from huggingface_hub import hf_hub_download

# Load tokenizer (from the v1 repo, which uses the same dataset/split)
tokenizer_path = hf_hub_download(repo_id="tuklu/SASC", filename="tokenizer.json")
with open(tokenizer_path) as f:
    tokenizer = tokenizer_from_json(f.read())

# Load the final (post-Phase-4) model
model_path = hf_hub_download(repo_id="tuklu/SASCv2", filename="model.h5")
model = tf.keras.models.load_model(model_path)

# Predict
texts = ["I hate all of them", "Have a great day!"]
sequences = tokenizer.texts_to_sequences(texts)
padded = pad_sequences(sequences, maxlen=100)
probs = model.predict(padded).flatten()
for text, prob in zip(texts, probs):
    label = "Hate Speech" if prob > 0.5 else "Non-Hate"
    print(f"{label} ({prob:.3f}): {text}")
```
Explainability — SHAP Analysis
We applied SHAP (SHapley Additive exPlanations) to the final trained model to understand which words drive hate speech predictions. A GradientExplainer runs on the BiLSTM sub-model (embedding layer bypassed — embeddings pre-computed as floats), with 200 background training samples, evaluated on all 4 test sets.
Full methodology, all strategy comparisons, and detailed word tables: SHAP_REPORT.md
Top SHAP Words — Final Model
| Eval | Top Hate Words | Top Non-Hate Words |
|---|---|---|
| English | nas, fags, sicko, sabotage, advocating | grow, barrel, homosexual, pak, join |
| Hindi | वादा, वैज्ञानिकों, ऐ, उतारा, गला | जीतेगा, घोंटने, जिहादी, आपत्तिजनक |
| Hinglish | arey, bahir, punish, papa, interior | online, member, mam, messages, asha |
| Full | blamed, criticized, syntax, grown, sine | underneath, smack, online, hole, clue |
Key Takeaways

- Hindi SHAP values are 10× smaller than English/Hinglish ones: GloVe has near-zero Hindi coverage, so the model relies on positional patterns rather than word semantics.
- Accusatory framing dominates full-dataset hate markers (`blamed`, `criticized`, `advocating`): the 50-epoch Full phase learns that hate speech in this corpus often targets victims through blame and accusation rather than direct slurs.
- "online" is the most consistent non-hate signal, marking informational/conversational context across all three languages.
- Hinglish markers are semantically coherent (`arey` = hey/exclamation in abusive context, `punish`, `interior`) despite code-mixing: v2's 50 epochs of Hinglish-first training produced stronger Hinglish feature learning than v1.
- Spurious correlations remain (`syntax`, `sine`), an inherent limitation of non-contextual GloVe embeddings; a BERT-based model would resolve these.
Related
- v1 (all 6 strategies, 8 epochs each): tuklu/SASC
- Dataset: tuklu/nprism
Citation
```bibtex
@misc{sasc2026,
  title={Multilingual Hate Speech Detection via Sequential Transfer Learning (v2)},
  author={tuklu},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/tuklu/SASCv2}
}
```
Evaluation results (self-reported, on nprism)

- F1 Score (Full Phase — Full Test): 0.657
- Accuracy (Full Phase — Full Test): 0.687
- ROC-AUC (Full Phase — Full Test): 0.756