---
language:
- en
- hi
tags:
- hate-speech
- text-classification
- bilstm
- glove
- multilingual
- transfer-learning
- hinglish
- sequential-learning
datasets:
- tuklu/nprism
license: mit
model-index:
- name: hate-speech-multilingual-bilstm-v2
results:
- task:
type: text-classification
name: Hate Speech Detection
dataset:
name: nprism
type: tuklu/nprism
metrics:
- type: f1
value: 0.6566
name: F1 Score (Full Phase — Full Test)
- type: accuracy
value: 0.6866
name: Accuracy (Full Phase — Full Test)
- type: roc_auc
value: 0.7556
name: ROC-AUC (Full Phase — Full Test)
---
# Multilingual Hate Speech Detection — GloVe + BiLSTM (v2)
**Task:** Binary text classification (Hate / Non-Hate)
**Languages:** English, Hindi, Hinglish (Hindi-English code-mixed)
**Architecture:** Bidirectional LSTM with frozen GloVe embeddings
**Strategy:** Hinglish → Hindi → English → Full (50 epochs per phase, 200 total)
---
## Table of Contents
1. [What This Experiment Does](#1-what-this-experiment-does)
2. [The Dataset](#2-the-dataset)
3. [Model Architecture](#3-model-architecture)
4. [Training Strategy](#4-training-strategy)
5. [Phase 1 — Hinglish](#5-phase-1--hinglish)
6. [Phase 2 — Hindi](#6-phase-2--hindi)
7. [Phase 3 — English](#7-phase-3--english)
8. [Phase 4 — Full Dataset](#8-phase-4--full-dataset)
9. [Full Results Table](#9-full-results-table)
10. [How to Use](#10-how-to-use)
---
## 1. What This Experiment Does
This is **v2** of the SASC sequential transfer learning experiment.
v1 ran all 6 permutations of [English, Hindi, Hinglish] with **8 epochs** per phase. v2 focuses on a single strategy — `Hinglish → Hindi → English → Full` — but trains for **50 epochs per phase (200 total)**. The key new addition: after every phase the model is evaluated on **all three individual language test sets AND the full test set**, giving a complete 4×4 cross-evaluation matrix showing how knowledge transfers across languages.
---
## 2. The Dataset
Dataset: [tuklu/nprism](https://huggingface.co/datasets/tuklu/nprism)
| Split | Samples |
|---|---|
| Train | 17,704 |
| Validation | 2,950 |
| Test | 8,852 |
| **Total** | **29,506** |

| Language | Count | % |
|---|---|---|
| English | 14,994 | 50.8% |
| Hindi | 9,738 | 33.0% |
| Hinglish | 4,774 | 16.2% |

| Label | Count | % |
|---|---|---|
| Non-Hate (0) | 15,799 | 53.5% |
| Hate (1) | 13,707 | 46.5% |

The dataset is dominated by English (50.8%). GloVe embeddings are also English-centric, which directly explains why the English phase produces the sharpest accuracy jump regardless of training order.
---
## 3. Model Architecture
```
Input: Text sequence (max 100 tokens)
↓
GloVe Embedding Layer (vocab: 50,000 × 300d) — FROZEN
↓
Bidirectional LSTM (128 units)
→ reads sentence left-to-right AND right-to-left
↓
Dropout (0.5)
↓
Dense Layer (64 neurons, ReLU)
↓
Output Layer (1 neuron, Sigmoid)
→ > 0.5 = Hate Speech | ≤ 0.5 = Non-Hate
```
- **Optimizer:** Adam
- **Loss:** Binary Cross-Entropy
- **Max sequence length:** 100 tokens
- **Vocab size:** 50,000
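The stack above can be sketched in Keras. This is a hypothetical reconstruction from the spec, not the training script itself; in particular, the real model loads the 50,000 × 300 GloVe matrix into the frozen embedding layer, which is left randomly initialised here:

```python
import tensorflow as tf

VOCAB_SIZE, EMBED_DIM, MAX_LEN = 50_000, 300, 100

model = tf.keras.Sequential([
    tf.keras.Input(shape=(MAX_LEN,)),
    # Frozen embedding layer; the real model loads GloVe vectors here
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM, trainable=False),
    # Reads each sequence left-to-right AND right-to-left, 128 units each way
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128)),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # P(hate)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```

Freezing the embeddings keeps the GloVe geometry intact, so all task-specific learning happens in the BiLSTM and dense layers.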
---
## 4. Training Strategy
| Phase | Training Data | Epochs | Batch Size | Samples |
|---|---|---|---|---|
| 1 — Hinglish | Hinglish subset | 50 | 32 | ~2,908 |
| 2 — Hindi | Hindi subset | 50 | 32 | ~5,940 |
| 3 — English | English subset | 50 | 32 | ~8,856 |
| 4 — Full | All shuffled | 50 | 64 | 17,704 |
The **same model** carries its weights through all 4 phases — no resets between languages. After each phase the model is evaluated against all three language-specific test sets and the full test set.
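In pseudocode, the carry-over schedule and the resulting 4×4 cross-evaluation matrix look like this (`train_phase` and `evaluate` are hypothetical stand-ins for `model.fit` / `model.evaluate`; the real runs use 50 epochs per phase):

```python
# One model object, no weight resets between phases.
PHASES = [("hinglish", 32), ("hindi", 32), ("english", 32), ("full", 64)]
EVAL_SETS = ["hinglish", "hindi", "english", "full"]

def train_phase(model, subset, epochs, batch_size):
    # Stand-in for 50 epochs of model.fit() on the language subset
    model["seen"].append(subset)
    return model

def evaluate(model, eval_set):
    # Stand-in for model.evaluate() on one language-specific test set
    return {"phase_history": list(model["seen"]), "eval_on": eval_set}

model = {"seen": []}  # stand-in for the Keras model
results = {}
for subset, batch_size in PHASES:
    model = train_phase(model, subset, epochs=50, batch_size=batch_size)
    for eval_set in EVAL_SETS:  # evaluate on ALL test sets after every phase
        results[(subset, eval_set)] = evaluate(model, eval_set)

# 4 phases x 4 eval sets = the 16 rows of the table in section 9
assert len(results) == 16
```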
---
## 5. Phase 1 — Hinglish
**Training on Hinglish only** (2,908 samples, 50 epochs). The model starts cold. Hinglish is code-mixed and GloVe has limited coverage — the model learns from sequential patterns rather than word semantics.

### Evaluation after Phase 1
| Eval On | Accuracy | Balanced Acc | Precision | Recall | Specificity | F1 | ROC-AUC |
|---|---|---|---|---|---|---|---|
| Hinglish | 0.6688 | 0.6378 | 0.6058 | 0.4848 | 0.7908 | 0.5386 | 0.6579 |
| Hindi | 0.4493 | 0.5000 | 0.4493 | 1.0000 | 0.0000 | 0.6200 | 0.5234 |
| English | 0.5171 | 0.5125 | 0.5738 | 0.0916 | 0.9334 | 0.1580 | 0.5620 |
| Full | 0.5190 | 0.5133 | 0.4803 | 0.4331 | 0.5935 | 0.4555 | 0.5243 |
The Hindi result (Recall=1.0, Specificity=0.0) shows the model predicts **everything as hate** on Hindi — it has no Hindi-specific knowledge yet. English performance is near-random. Hinglish F1=0.539 shows the model has learned something useful from its own language.
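The degenerate Hindi row follows directly from the metric definitions: a model that predicts every sample as hate has recall 1.0, specificity 0.0, and both precision and accuracy equal to the hate prevalence of the test set (0.4493 for Hindi). A quick sanity check:

```python
# All-positive predictor on the Hindi test set (hate prevalence = 0.4493)
prevalence = 0.4493

recall = 1.0                # every hate sample is caught
specificity = 0.0           # every non-hate sample is misclassified
precision = prevalence      # only prevalence of the positives are truly hate
accuracy = prevalence       # correct only on the hate class
balanced_acc = (recall + specificity) / 2
f1 = 2 * precision * recall / (precision + recall)

# f1 ~ 0.62 and balanced_acc = 0.5, matching the Hindi row above
```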
*(Per-test-set diagnostic plots for this phase: confusion matrix, ROC, precision-recall, and F1 vs. threshold.)*
---
## 6. Phase 2 — Hindi
**Training on Hindi** (5,940 samples, 50 epochs). GloVe has limited Hindi coverage so the model must rely on contextual patterns. The struggle here is deliberate — it builds language-agnostic hate detection features.

### Evaluation after Phase 2
| Eval On | Accuracy | Balanced Acc | Precision | Recall | Specificity | F1 | ROC-AUC |
|---|---|---|---|---|---|---|---|
| Hinglish | 0.5409 | 0.4885 | 0.3761 | 0.2299 | 0.7470 | 0.2854 | 0.4771 |
| Hindi | 0.5834 | 0.5730 | 0.5420 | 0.4705 | 0.6756 | 0.5037 | 0.5949 |
| English | 0.4711 | 0.4744 | 0.4789 | 0.7878 | 0.1611 | 0.5957 | 0.4292 |
| Full | 0.5190 | 0.5251 | 0.4859 | 0.6111 | 0.4390 | 0.5414 | 0.5255 |
Hindi F1 improves to 0.504. Hinglish drops — the model has partially overwritten Hinglish-specific patterns. English recall spikes (high false positives) showing the model is now biased toward predicting hate. This is the expected "catastrophic interference" that the Full phase resolves.
*(Per-test-set diagnostic plots for this phase: confusion matrix, ROC, precision-recall, and F1 vs. threshold.)*
---
## 7. Phase 3 — English
**Training on English** (8,856 samples, 50 epochs). This is the turning point. GloVe embeddings align well with English — the model jumps sharply and the English-phase knowledge partially generalises back to the other languages.

### Evaluation after Phase 3
| Eval On | Accuracy | Balanced Acc | Precision | Recall | Specificity | F1 | ROC-AUC |
|---|---|---|---|---|---|---|---|
| Hinglish | 0.4115 | 0.4938 | 0.3955 | 0.9002 | 0.0875 | 0.5495 | 0.4572 |
| Hindi | 0.5424 | 0.5399 | 0.4912 | 0.5150 | 0.5648 | 0.5028 | 0.5377 |
| **English** | **0.7721** | **0.7726** | **0.7453** | **0.8190** | **0.7262** | **0.7804** | **0.8458** |
| Full | 0.6395 | 0.6458 | 0.5901 | 0.7337 | 0.5578 | 0.6541 | 0.6913 |
English F1 leaps to 0.780 — the model now performs strongly on its native language. Full AUC reaches 0.691. Hinglish specificity collapses again (high recall, low precision) — the model over-predicts hate on unseen languages after English fine-tuning.
*(Per-test-set diagnostic plots for this phase: confusion matrix, ROC, precision-recall, and F1 vs. threshold.)*
---
## 8. Phase 4 — Full Dataset
**Training on the full shuffled dataset** (17,704 samples, 50 epochs). This consolidation phase exposes the model to all three languages simultaneously, balancing out the per-language biases accumulated during sequential training.

### Evaluation after Phase 4 (Final Model)
| Eval On | Accuracy | Balanced Acc | Precision | Recall | Specificity | F1 | ROC-AUC |
|---|---|---|---|---|---|---|---|
| Hinglish | 0.6326 | 0.6101 | 0.5426 | 0.4991 | 0.7210 | 0.5200 | 0.6161 |
| Hindi | 0.5748 | 0.5676 | 0.5286 | 0.4958 | 0.6393 | 0.5117 | 0.5941 |
| **English** | **0.7747** | **0.7746** | **0.7747** | **0.7678** | **0.7815** | **0.7712** | **0.8476** |
| **Full** | **0.6866** | **0.6839** | **0.6687** | **0.6449** | **0.7228** | **0.6566** | **0.7556** |
The Full phase restores balance across all languages. Hinglish specificity recovers to 0.721 (from 0.088 after English phase). Full-dataset AUC reaches **0.756** — the best of all phases. English performance is preserved at F1=0.771 while Hinglish and Hindi both improve substantially from their post-English-phase collapse.
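The final full-test row is internally consistent with the standard definitions, which can be verified from precision, recall, and specificity alone (values copied from the table above):

```python
# Final Full-phase, full-test metrics from the table above
precision, recall, specificity = 0.6687, 0.6449, 0.7228

f1 = 2 * precision * recall / (precision + recall)
balanced_acc = (recall + specificity) / 2

assert abs(f1 - 0.6566) < 5e-4            # F1 reported in the Full/Full row
assert abs(balanced_acc - 0.6839) < 5e-4  # Balanced Acc in the Full/Full row
```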
*(Per-test-set diagnostic plots for this phase: confusion matrix, ROC, precision-recall, and F1 vs. threshold.)*
---
## 9. Full Results Table
Complete 16-row cross-evaluation (Phase × Eval Language):
| Phase | Eval On | Accuracy | Balanced Acc | Precision | Recall | Specificity | F1 | ROC-AUC |
|---|---|---|---|---|---|---|---|---|
| hinglish | hinglish | 0.6688 | 0.6378 | 0.6058 | 0.4848 | 0.7908 | 0.5386 | 0.6579 |
| hinglish | hindi | 0.4493 | 0.5000 | 0.4493 | 1.0000 | 0.0000 | 0.6200 | 0.5234 |
| hinglish | english | 0.5171 | 0.5125 | 0.5738 | 0.0916 | 0.9334 | 0.1580 | 0.5620 |
| hinglish | full | 0.5190 | 0.5133 | 0.4803 | 0.4331 | 0.5935 | 0.4555 | 0.5243 |
| hindi | hinglish | 0.5409 | 0.4885 | 0.3761 | 0.2299 | 0.7470 | 0.2854 | 0.4771 |
| hindi | hindi | 0.5834 | 0.5730 | 0.5420 | 0.4705 | 0.6756 | 0.5037 | 0.5949 |
| hindi | english | 0.4711 | 0.4744 | 0.4789 | 0.7878 | 0.1611 | 0.5957 | 0.4292 |
| hindi | full | 0.5190 | 0.5251 | 0.4859 | 0.6111 | 0.4390 | 0.5414 | 0.5255 |
| english | hinglish | 0.4115 | 0.4938 | 0.3955 | 0.9002 | 0.0875 | 0.5495 | 0.4572 |
| english | hindi | 0.5424 | 0.5399 | 0.4912 | 0.5150 | 0.5648 | 0.5028 | 0.5377 |
| english | english | 0.7721 | 0.7726 | 0.7453 | 0.8190 | 0.7262 | 0.7804 | 0.8458 |
| english | full | 0.6395 | 0.6458 | 0.5901 | 0.7337 | 0.5578 | 0.6541 | 0.6913 |
| **Full** | **hinglish** | **0.6326** | **0.6101** | **0.5426** | **0.4991** | **0.7210** | **0.5200** | **0.6161** |
| **Full** | **hindi** | **0.5748** | **0.5676** | **0.5286** | **0.4958** | **0.6393** | **0.5117** | **0.5941** |
| **Full** | **english** | **0.7747** | **0.7746** | **0.7747** | **0.7678** | **0.7815** | **0.7712** | **0.8476** |
| **Full** | **full** | **0.6866** | **0.6839** | **0.6687** | **0.6449** | **0.7228** | **0.6566** | **0.7556** |
### Key Observations
- **English phase is the sharpest turning point** — English F1 jumps from 0.596 (after Hindi) to 0.780 in one phase, driven by GloVe's English-centric embeddings.
- **Starting from Hinglish** forces generalisation from noise — the model reaches Hinglish F1=0.539 after only its own phase, a stronger start than Hinglish gets in most v1 orderings.
- **Catastrophic interference is visible** — Hinglish specificity drops from 0.791 → 0.747 → 0.088 as the model progressively shifts language bias. The Full phase restores it to 0.721.
- **Final Full phase AUC = 0.756** matches the best v1 strategies despite a harder starting language, confirming the robustness of the Hinglish-first approach with deeper training.
- **Hindi remains the hardest** (F1=0.512 at final) — consistent with GloVe's limited Hindi vocabulary coverage.
---
## 10. How to Use
```python
import tensorflow as tf
from tensorflow.keras.preprocessing.text import tokenizer_from_json
from tensorflow.keras.preprocessing.sequence import pad_sequences
from huggingface_hub import hf_hub_download

# Load tokenizer (from v1 repo — same dataset/split)
tokenizer_path = hf_hub_download(repo_id="tuklu/SASC", filename="tokenizer.json")
with open(tokenizer_path) as f:
    tokenizer = tokenizer_from_json(f.read())

# Load model
model_path = hf_hub_download(repo_id="tuklu/SASCv2", filename="model.h5")
model = tf.keras.models.load_model(model_path)

# Predict
texts = ["I hate all of them", "Have a great day!"]
sequences = tokenizer.texts_to_sequences(texts)
padded = pad_sequences(sequences, maxlen=100)  # same max length as training
probs = model.predict(padded).flatten()
for text, prob in zip(texts, probs):
    label = "Hate Speech" if prob > 0.5 else "Non-Hate"
    print(f"{label} ({prob:.3f}): {text}")
```
---
## Explainability — SHAP Analysis
We applied **SHAP (SHapley Additive exPlanations)** to the final trained model to understand which words drive hate speech predictions. A `GradientExplainer` runs on the BiLSTM sub-model (embedding layer bypassed — embeddings pre-computed as floats), with 200 background training samples, evaluated on all 4 test sets.
> Full methodology, all strategy comparisons, and detailed word tables: **[SHAP_REPORT.md](SHAP_REPORT.md)**
### Top SHAP Words — Final Model
| Eval | Top Hate Words | Top Non-Hate Words |
|---|---|---|
| English | nas, fags, sicko, sabotage, advocating | grow, barrel, homosexual, pak, join |
| Hindi | वादा, वैज्ञानिकों, ऐ, उतारा, गला | जीतेगा, घोंटने, जिहादी, आपत्तिजनक |
| Hinglish | arey, bahir, punish, papa, interior | online, member, mam, messages, asha |
| Full | blamed, criticized, syntax, grown, sine | underneath, smack, online, hole, clue |




### Key Takeaways
- **Hindi SHAP values are 10× smaller** than English/Hinglish — GloVe has near-zero Hindi coverage; model relies on positional patterns, not word semantics
- **Accusatory framing dominates full-dataset hate markers** (`blamed`, `criticized`, `advocating`) — the 50-epoch Full phase learns that hate speech in this corpus often targets victims through blame/accusation rather than direct slurs
- **"online"** is the most consistent non-hate signal — informational/conversational context across all three languages
- **Hinglish markers are semantically coherent** (`arey` = hey/exclamation in abusive context, `punish`, `interior`) despite code-mixing — v2's 50 epochs on Hinglish-first produced stronger Hinglish feature learning than v1
- **Spurious correlations remain** (`syntax`, `sine`) — inherent limitation of non-contextual GloVe; a BERT-based model would resolve these
---
## Related
- **v1 (all 6 strategies, 8 epochs each):** [tuklu/SASC](https://huggingface.co/tuklu/SASC)
- **Dataset:** [tuklu/nprism](https://huggingface.co/datasets/tuklu/nprism)
---
## Citation
```
@misc{sasc2026,
  title={Multilingual Hate Speech Detection via Sequential Transfer Learning (v2)},
  author={tuklu},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/tuklu/SASCv2}
}
```