Update README with inline figures and correct paths

README.md (CHANGED)

2. [The Dataset](#2-the-dataset)
3. [Model Architecture](#3-model-architecture)
4. [Training Strategy](#4-training-strategy)
5. [Phase 1 — Hinglish](#5-phase-1--hinglish)
6. [Phase 2 — Hindi](#6-phase-2--hindi)
7. [Phase 3 — English](#7-phase-3--english)
8. [Phase 4 — Full Dataset](#8-phase-4--full-dataset)
9. [Full Results Table](#9-full-results-table)
10. [How to Use](#10-how-to-use)

---

This is **v2** of the SASC sequential transfer learning experiment.

v1 ran all 6 permutations of [English, Hindi, Hinglish] with **8 epochs** per phase. v2 focuses on a single strategy — `Hinglish → Hindi → English → Full` — but trains for **50 epochs per phase (200 total)**. The key addition: after every phase the model is evaluated on **all three individual language test sets AND the full test set**, giving a complete 4×4 cross-evaluation matrix showing how knowledge transfers across languages.

---

## 2. The Dataset

Dataset: [tuklu/nprism](https://huggingface.co/datasets/tuklu/nprism)

| Class | Count | Share |
|---|---|---|
| Non-Hate (0) | 15,799 | 53.5% |
| Hate (1) | 13,707 | 46.5% |

![Class Distribution](figures/hinglish_to_hindi_to_english/class_distribution.png)

The dataset is dominated by English (50.8%). GloVe embeddings are also English-centric, which directly explains why the English phase produces the sharpest accuracy jump regardless of training order.

---

## 3. Model Architecture

```
Input: Text sequence (max 100 tokens)
        ↓
GloVe Embedding Layer (vocab: 50,000 × 300d) — FROZEN
        ↓
Bidirectional LSTM (128 units)
        → reads sentence left-to-right AND right-to-left
        ↓
Dropout (0.5)
        ↓
Dense Layer (64 neurons, ReLU)
        ↓
Output Layer (1 neuron, Sigmoid)
        → > 0.5 = Hate Speech | ≤ 0.5 = Non-Hate
```

- **Optimizer:** Adam
- **Loss:** Binary Cross-Entropy
- **Max sequence length:** 100 tokens
- **Vocab size:** 50,000
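
For readers who want the diagram as code, a minimal Keras sketch of the same stack. The zero-filled `embedding_matrix` is a placeholder to keep the sketch runnable; the real pipeline fills each row with the pretrained 300-d GloVe vector for the corresponding tokenizer index.

```python
import numpy as np
import tensorflow as tf

VOCAB_SIZE, EMBED_DIM, MAX_LEN = 50_000, 300, 100

# Placeholder GloVe matrix (zeros here; see note above).
embedding_matrix = np.zeros((VOCAB_SIZE, EMBED_DIM), dtype="float32")

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(MAX_LEN,)),
    tf.keras.layers.Embedding(
        VOCAB_SIZE, EMBED_DIM,
        embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
        trainable=False,  # FROZEN, as in the diagram
    ),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128)),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```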

---

## 4. Training Strategy

| Phase | Training Data | Epochs | Batch Size | Samples |
|---|---|---|---|---|
| 1 — Hinglish | Hinglish subset | 50 | 32 | ~2,908 |
| 2 — Hindi | Hindi subset | 50 | 32 | ~5,940 |
| 3 — English | English subset | 50 | 32 | ~8,856 |
| 4 — Full | All shuffled | 50 | 64 | 17,704 |

The **same model** carries its weights through all 4 phases — no resets between languages. After each phase the model is evaluated against all three language-specific test sets and the full test set.
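
The schedule above is just four consecutive `fit` calls on one model object, each followed by a sweep over the four test sets. A runnable toy sketch of that loop — `toy_split`, the random data, and the deliberately tiny stand-in model are illustrative, not the real pipeline:

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)

def toy_split(n):
    """Hypothetical stand-in for one language split: (padded sequences, labels)."""
    return rng.integers(0, 50_000, (n, 100)), rng.integers(0, 2, n).astype("float32")

# Deliberately tiny stand-in model so the schedule itself runs quickly here.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(100,)),
    tf.keras.layers.Embedding(50_000, 8),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

phases = [("hinglish", toy_split(64), 32), ("hindi", toy_split(64), 32),
          ("english", toy_split(64), 32), ("Full", toy_split(128), 64)]
test_sets = {name: toy_split(32) for name in ("hinglish", "hindi", "english", "full")}

results = {}
for phase_name, (X, y), batch_size in phases:
    # One model object throughout: weights carry over, never reset.
    model.fit(X, y, epochs=1, batch_size=batch_size, verbose=0)  # 50 epochs in the real run
    for eval_name, (X_t, y_t) in test_sets.items():
        loss, acc = model.evaluate(X_t, y_t, verbose=0)
        results[(phase_name, eval_name)] = acc

print(len(results))  # -> 16 cells of the 4x4 cross-evaluation matrix
```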

---

## 5. Phase 1 — Hinglish

**Training on Hinglish only** (2,908 samples, 50 epochs). The model starts cold. Hinglish is code-mixed and GloVe has limited coverage — the model learns from sequential patterns rather than word semantics.

![Phase 1 Training Curves](figures/hinglish_to_hindi_to_english/Phase_1_hinglish_curves.png)

### Evaluation after Phase 1

| Eval On | Accuracy | Balanced Acc | Precision | Recall | Specificity | F1 | ROC-AUC |
|---|---|---|---|---|---|---|---|
| Hinglish | 0.6688 | 0.6378 | 0.6058 | 0.4848 | 0.7908 | 0.5386 | 0.6579 |
| Hindi | 0.4493 | 0.5000 | 0.4493 | 1.0000 | 0.0000 | 0.6200 | 0.5234 |
| English | 0.5171 | 0.5125 | 0.5738 | 0.0916 | 0.9334 | 0.1580 | 0.5620 |
| Full | 0.5190 | 0.5133 | 0.4803 | 0.4331 | 0.5935 | 0.4555 | 0.5243 |

The Hindi result (Recall=1.0, Specificity=0.0) shows the model predicts **everything as hate** on Hindi — it has no Hindi-specific knowledge yet. English performance is near-random. Hinglish F1=0.539 indicates the model has learned something useful from its own language.

| Eval On | Confusion Matrix | ROC | Precision-Recall | F1 vs Threshold |
|---|---|---|---|---|
| Hinglish | ![CM](figures/hinglish_to_hindi_to_english/Phase_1_hinglish_eval_hinglish_cm.png) | ![ROC](figures/hinglish_to_hindi_to_english/Phase_1_hinglish_eval_hinglish_roc.png) | ![PR](figures/hinglish_to_hindi_to_english/Phase_1_hinglish_eval_hinglish_pr.png) | ![F1](figures/hinglish_to_hindi_to_english/Phase_1_hinglish_eval_hinglish_f1.png) |
| Hindi | ![CM](figures/hinglish_to_hindi_to_english/Phase_1_hinglish_eval_hindi_cm.png) | ![ROC](figures/hinglish_to_hindi_to_english/Phase_1_hinglish_eval_hindi_roc.png) | ![PR](figures/hinglish_to_hindi_to_english/Phase_1_hinglish_eval_hindi_pr.png) | ![F1](figures/hinglish_to_hindi_to_english/Phase_1_hinglish_eval_hindi_f1.png) |
| English | ![CM](figures/hinglish_to_hindi_to_english/Phase_1_hinglish_eval_english_cm.png) | ![ROC](figures/hinglish_to_hindi_to_english/Phase_1_hinglish_eval_english_roc.png) | ![PR](figures/hinglish_to_hindi_to_english/Phase_1_hinglish_eval_english_pr.png) | ![F1](figures/hinglish_to_hindi_to_english/Phase_1_hinglish_eval_english_f1.png) |
| Full | ![CM](figures/hinglish_to_hindi_to_english/Phase_1_hinglish_eval_full_cm.png) | ![ROC](figures/hinglish_to_hindi_to_english/Phase_1_hinglish_eval_full_roc.png) | ![PR](figures/hinglish_to_hindi_to_english/Phase_1_hinglish_eval_full_pr.png) | ![F1](figures/hinglish_to_hindi_to_english/Phase_1_hinglish_eval_full_f1.png) |

---

## 6. Phase 2 — Hindi

**Training on Hindi** (5,940 samples, 50 epochs). GloVe has limited Hindi coverage, so the model must rely on contextual patterns. The struggle here is deliberate — it builds language-agnostic hate-detection features.

![Phase 2 Training Curves](figures/hinglish_to_hindi_to_english/Phase_2_hindi_curves.png)

### Evaluation after Phase 2

| Eval On | Accuracy | Balanced Acc | Precision | Recall | Specificity | F1 | ROC-AUC |
|---|---|---|---|---|---|---|---|
| Hinglish | 0.5409 | 0.4885 | 0.3761 | 0.2299 | 0.7470 | 0.2854 | 0.4771 |
| Hindi | 0.5834 | 0.5730 | 0.5420 | 0.4705 | 0.6756 | 0.5037 | 0.5949 |
| English | 0.4711 | 0.4744 | 0.4789 | 0.7878 | 0.1611 | 0.5957 | 0.4292 |
| Full | 0.5190 | 0.5251 | 0.4859 | 0.6111 | 0.4390 | 0.5414 | 0.5255 |

Hindi F1 improves to 0.504. Hinglish drops — the model has partially overwritten Hinglish-specific patterns. English recall spikes (high false positives), showing the model is now biased toward predicting hate. This is the expected "catastrophic interference" that the Full phase resolves.

| Eval On | Confusion Matrix | ROC | Precision-Recall | F1 vs Threshold |
|---|---|---|---|---|
| Hinglish | ![CM](figures/hinglish_to_hindi_to_english/Phase_2_hindi_eval_hinglish_cm.png) | ![ROC](figures/hinglish_to_hindi_to_english/Phase_2_hindi_eval_hinglish_roc.png) | ![PR](figures/hinglish_to_hindi_to_english/Phase_2_hindi_eval_hinglish_pr.png) | ![F1](figures/hinglish_to_hindi_to_english/Phase_2_hindi_eval_hinglish_f1.png) |
| Hindi | ![CM](figures/hinglish_to_hindi_to_english/Phase_2_hindi_eval_hindi_cm.png) | ![ROC](figures/hinglish_to_hindi_to_english/Phase_2_hindi_eval_hindi_roc.png) | ![PR](figures/hinglish_to_hindi_to_english/Phase_2_hindi_eval_hindi_pr.png) | ![F1](figures/hinglish_to_hindi_to_english/Phase_2_hindi_eval_hindi_f1.png) |
| English | ![CM](figures/hinglish_to_hindi_to_english/Phase_2_hindi_eval_english_cm.png) | ![ROC](figures/hinglish_to_hindi_to_english/Phase_2_hindi_eval_english_roc.png) | ![PR](figures/hinglish_to_hindi_to_english/Phase_2_hindi_eval_english_pr.png) | ![F1](figures/hinglish_to_hindi_to_english/Phase_2_hindi_eval_english_f1.png) |
| Full | ![CM](figures/hinglish_to_hindi_to_english/Phase_2_hindi_eval_full_cm.png) | ![ROC](figures/hinglish_to_hindi_to_english/Phase_2_hindi_eval_full_roc.png) | ![PR](figures/hinglish_to_hindi_to_english/Phase_2_hindi_eval_full_pr.png) | ![F1](figures/hinglish_to_hindi_to_english/Phase_2_hindi_eval_full_f1.png) |

---

## 7. Phase 3 — English

**Training on English** (8,856 samples, 50 epochs). This is the turning point. GloVe embeddings align well with English — the model jumps sharply and the English-phase knowledge partially generalises back to the other languages.

![Phase 3 Training Curves](figures/hinglish_to_hindi_to_english/Phase_3_english_curves.png)

### Evaluation after Phase 3

| Eval On | Accuracy | Balanced Acc | Precision | Recall | Specificity | F1 | ROC-AUC |
|---|---|---|---|---|---|---|---|
| Hinglish | 0.4115 | 0.4938 | 0.3955 | 0.9002 | 0.0875 | 0.5495 | 0.4572 |
| Hindi | 0.5424 | 0.5399 | 0.4912 | 0.5150 | 0.5648 | 0.5028 | 0.5377 |
| **English** | **0.7721** | **0.7726** | **0.7453** | **0.8190** | **0.7262** | **0.7804** | **0.8458** |
| Full | 0.6395 | 0.6458 | 0.5901 | 0.7337 | 0.5578 | 0.6541 | 0.6913 |

English F1 leaps to 0.780 — the model now performs strongly on its native language. Full AUC reaches 0.691. Hinglish specificity collapses (high recall, low precision) — the model over-predicts hate on unseen languages after English fine-tuning.

| Eval On | Confusion Matrix | ROC | Precision-Recall | F1 vs Threshold |
|---|---|---|---|---|
| Hinglish | ![CM](figures/hinglish_to_hindi_to_english/Phase_3_english_eval_hinglish_cm.png) | ![ROC](figures/hinglish_to_hindi_to_english/Phase_3_english_eval_hinglish_roc.png) | ![PR](figures/hinglish_to_hindi_to_english/Phase_3_english_eval_hinglish_pr.png) | ![F1](figures/hinglish_to_hindi_to_english/Phase_3_english_eval_hinglish_f1.png) |
| Hindi | ![CM](figures/hinglish_to_hindi_to_english/Phase_3_english_eval_hindi_cm.png) | ![ROC](figures/hinglish_to_hindi_to_english/Phase_3_english_eval_hindi_roc.png) | ![PR](figures/hinglish_to_hindi_to_english/Phase_3_english_eval_hindi_pr.png) | ![F1](figures/hinglish_to_hindi_to_english/Phase_3_english_eval_hindi_f1.png) |
| English | ![CM](figures/hinglish_to_hindi_to_english/Phase_3_english_eval_english_cm.png) | ![ROC](figures/hinglish_to_hindi_to_english/Phase_3_english_eval_english_roc.png) | ![PR](figures/hinglish_to_hindi_to_english/Phase_3_english_eval_english_pr.png) | ![F1](figures/hinglish_to_hindi_to_english/Phase_3_english_eval_english_f1.png) |
| Full | ![CM](figures/hinglish_to_hindi_to_english/Phase_3_english_eval_full_cm.png) | ![ROC](figures/hinglish_to_hindi_to_english/Phase_3_english_eval_full_roc.png) | ![PR](figures/hinglish_to_hindi_to_english/Phase_3_english_eval_full_pr.png) | ![F1](figures/hinglish_to_hindi_to_english/Phase_3_english_eval_full_f1.png) |

---

## 8. Phase 4 — Full Dataset

**Training on the full shuffled dataset** (17,704 samples, 50 epochs). This consolidation phase exposes the model to all three languages simultaneously, balancing out the per-language biases accumulated during sequential training.

![Phase 4 Training Curves](figures/hinglish_to_hindi_to_english/Phase_4_Full_curves.png)

### Evaluation after Phase 4 (Final Model)

| Eval On | Accuracy | Balanced Acc | Precision | Recall | Specificity | F1 | ROC-AUC |
|---|---|---|---|---|---|---|---|
| Hinglish | 0.6326 | 0.6101 | 0.5426 | 0.4991 | 0.7210 | 0.5200 | 0.6161 |
| Hindi | 0.5748 | 0.5676 | 0.5286 | 0.4958 | 0.6393 | 0.5117 | 0.5941 |
| **English** | **0.7747** | **0.7746** | **0.7747** | **0.7678** | **0.7815** | **0.7712** | **0.8476** |
| **Full** | **0.6866** | **0.6839** | **0.6687** | **0.6449** | **0.7228** | **0.6566** | **0.7556** |

The Full phase restores balance across all languages. Hinglish specificity recovers to 0.721 (from 0.088 after the English phase). Full-dataset AUC reaches **0.756** — the best of all phases. English performance is preserved at F1=0.771 while Hinglish and Hindi both improve substantially from their post-English-phase collapse.

| Eval On | Confusion Matrix | ROC | Precision-Recall | F1 vs Threshold |
|---|---|---|---|---|
| Hinglish | ![CM](figures/hinglish_to_hindi_to_english/Phase_4_Full_eval_hinglish_cm.png) | ![ROC](figures/hinglish_to_hindi_to_english/Phase_4_Full_eval_hinglish_roc.png) | ![PR](figures/hinglish_to_hindi_to_english/Phase_4_Full_eval_hinglish_pr.png) | ![F1](figures/hinglish_to_hindi_to_english/Phase_4_Full_eval_hinglish_f1.png) |
| Hindi | ![CM](figures/hinglish_to_hindi_to_english/Phase_4_Full_eval_hindi_cm.png) | ![ROC](figures/hinglish_to_hindi_to_english/Phase_4_Full_eval_hindi_roc.png) | ![PR](figures/hinglish_to_hindi_to_english/Phase_4_Full_eval_hindi_pr.png) | ![F1](figures/hinglish_to_hindi_to_english/Phase_4_Full_eval_hindi_f1.png) |
| English | ![CM](figures/hinglish_to_hindi_to_english/Phase_4_Full_eval_english_cm.png) | ![ROC](figures/hinglish_to_hindi_to_english/Phase_4_Full_eval_english_roc.png) | ![PR](figures/hinglish_to_hindi_to_english/Phase_4_Full_eval_english_pr.png) | ![F1](figures/hinglish_to_hindi_to_english/Phase_4_Full_eval_english_f1.png) |
| Full | ![CM](figures/hinglish_to_hindi_to_english/Phase_4_Full_eval_full_cm.png) | ![ROC](figures/hinglish_to_hindi_to_english/Phase_4_Full_eval_full_roc.png) | ![PR](figures/hinglish_to_hindi_to_english/Phase_4_Full_eval_full_pr.png) | ![F1](figures/hinglish_to_hindi_to_english/Phase_4_Full_eval_full_f1.png) |

---

## 9. Full Results Table

Complete 16-row cross-evaluation (Phase × Eval Language):

| Phase | Eval On | Accuracy | Balanced Acc | Precision | Recall | Specificity | F1 | ROC-AUC |
|---|---|---|---|---|---|---|---|---|
| hinglish | hinglish | 0.6688 | 0.6378 | 0.6058 | 0.4848 | 0.7908 | 0.5386 | 0.6579 |
| hinglish | hindi | 0.4493 | 0.5000 | 0.4493 | 1.0000 | 0.0000 | 0.6200 | 0.5234 |
| hinglish | english | 0.5171 | 0.5125 | 0.5738 | 0.0916 | 0.9334 | 0.1580 | 0.5620 |
| hinglish | full | 0.5190 | 0.5133 | 0.4803 | 0.4331 | 0.5935 | 0.4555 | 0.5243 |
| hindi | hinglish | 0.5409 | 0.4885 | 0.3761 | 0.2299 | 0.7470 | 0.2854 | 0.4771 |
| hindi | hindi | 0.5834 | 0.5730 | 0.5420 | 0.4705 | 0.6756 | 0.5037 | 0.5949 |
| hindi | english | 0.4711 | 0.4744 | 0.4789 | 0.7878 | 0.1611 | 0.5957 | 0.4292 |
| hindi | full | 0.5190 | 0.5251 | 0.4859 | 0.6111 | 0.4390 | 0.5414 | 0.5255 |
| english | hinglish | 0.4115 | 0.4938 | 0.3955 | 0.9002 | 0.0875 | 0.5495 | 0.4572 |
| english | hindi | 0.5424 | 0.5399 | 0.4912 | 0.5150 | 0.5648 | 0.5028 | 0.5377 |
| english | english | 0.7721 | 0.7726 | 0.7453 | 0.8190 | 0.7262 | 0.7804 | 0.8458 |
| english | full | 0.6395 | 0.6458 | 0.5901 | 0.7337 | 0.5578 | 0.6541 | 0.6913 |
| **Full** | **hinglish** | **0.6326** | **0.6101** | **0.5426** | **0.4991** | **0.7210** | **0.5200** | **0.6161** |
| **Full** | **hindi** | **0.5748** | **0.5676** | **0.5286** | **0.4958** | **0.6393** | **0.5117** | **0.5941** |
| **Full** | **english** | **0.7747** | **0.7746** | **0.7747** | **0.7678** | **0.7815** | **0.7712** | **0.8476** |
| **Full** | **full** | **0.6866** | **0.6839** | **0.6687** | **0.6449** | **0.7228** | **0.6566** | **0.7556** |
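
The matrix is the main artifact of v2, so it can be handy in array form. A sketch with the F1 column transcribed from the table above (rows = training phase, columns = eval set):

```python
import numpy as np

phases = ["hinglish", "hindi", "english", "Full"]
evals = ["hinglish", "hindi", "english", "full"]

# F1 values copied from the 16-row table above.
f1 = np.array([
    [0.5386, 0.6200, 0.1580, 0.4555],  # after Phase 1 (Hinglish)
    [0.2854, 0.5037, 0.5957, 0.5414],  # after Phase 2 (Hindi)
    [0.5495, 0.5028, 0.7804, 0.6541],  # after Phase 3 (English)
    [0.5200, 0.5117, 0.7712, 0.6566],  # after Phase 4 (Full)
])

# Which phase maximises F1 on each test set?
best = {e: phases[i] for e, i in zip(evals, f1.argmax(axis=0))}
print(best)
```

Note the Hindi column: its best F1 comes from the degenerate Phase 1 predictor (Recall 1.0, Specificity 0.0), which is why the tables report balanced accuracy and specificity rather than F1 alone.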

### Key Observations

- **English phase is the sharpest turning point** — English F1 jumps from 0.596 (after Hindi) to 0.780 in one phase, driven by GloVe's English-centric embeddings.
- **Starting from Hinglish** forces generalisation from noise — the model reaches Hinglish F1=0.539 after only its own phase, a stronger start than Hinglish gets in most v1 orderings.
- **Catastrophic interference is visible** — Hinglish specificity drops from 0.791 → 0.747 → 0.088 as the model progressively shifts language bias. The Full phase restores it to 0.721.
- **Final Full phase AUC = 0.756** matches the best v1 strategies despite a harder starting language, confirming the robustness of the Hinglish-first approach with deeper training.
- **Hindi remains the hardest** (F1=0.512 at final) — consistent with GloVe's limited Hindi vocabulary coverage.

---

## 10. How to Use

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.text import tokenizer_from_json
from tensorflow.keras.preprocessing.sequence import pad_sequences
from huggingface_hub import hf_hub_download

# Load tokenizer (from the v1 repo — same dataset/split)
tokenizer_path = hf_hub_download(repo_id="tuklu/SASC", filename="tokenizer.json")
with open(tokenizer_path) as f:
    tokenizer = tokenizer_from_json(f.read())

# Load model
model_path = hf_hub_download(repo_id="tuklu/SASCv2", filename="model.h5")
model = tf.keras.models.load_model(model_path)

# Predict
texts = ["I hate all of them", "Have a great day!"]
sequences = tokenizer.texts_to_sequences(texts)
padded = pad_sequences(sequences, maxlen=100)
probs = model.predict(padded).flatten()

for text, prob in zip(texts, probs):
    label = "Hate Speech" if prob > 0.5 else "Non-Hate"
    print(f"{label} ({prob:.3f}): {text}")
```

---

## Related

- **v1 (all 6 strategies, 8 epochs each):** [tuklu/SASC](https://huggingface.co/tuklu/SASC)
- **Dataset:** [tuklu/nprism](https://huggingface.co/datasets/tuklu/nprism)

---

## Citation

```bibtex
@misc{sasc2026,
  title={Multilingual Hate Speech Detection via Sequential Transfer Learning (v2)},
  author={tuklu},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/tuklu/SASCv2}
}
```