NepaliBERT — Nepali Hate Content Classification
Fine-tuned NepaliBERT for multi-class hate content classification of Nepali social media text. The model is specifically optimized for Devanagari script Nepali and handles mixed-script inputs through a comprehensive preprocessing pipeline.
Model Description
This model was developed as part of a Bachelor of Computer Engineering final project at Khwopa College of Engineering, Tribhuvan University (February 2026). It classifies Nepali social media comments into four categories targeting different types of offensive content.
- Base model: Rajan/NepaliBERT (110M parameters, 12 transformer layers, pre-trained on a large Nepali corpus using masked language modelling)
- Task: Multi-class text classification (4 classes)
- Languages: Nepali (Devanagari primary), Romanized Nepali, code-mixed
Compared to XLM-RoBERTa Large (our other model): NepaliBERT's Nepali-specific pre-training gives it stronger Devanagari understanding and the best OR (Offensive-Racist) class F1 (0.4833) among all evaluated models. However, it has limited exposure to Romanized Nepali and English, making XLM-RoBERTa more robust on heavily code-mixed inputs.
Labels
| ID | Label | Description |
|---|---|---|
| 0 | NON_OFFENSIVE | Text containing no offensive content |
| 1 | OTHER_OFFENSIVE | General offensive content not targeting specific groups |
| 2 | OFFENSIVE_RACIST | Content targeting individuals/groups based on ethnicity, race, or caste |
| 3 | OFFENSIVE_SEXIST | Content targeting individuals based on gender |
Usage
```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="UDHOV/nepalibert-nepali-hate-classification"
)

# Devanagari input
classifier("यो राम्रो छ")

# Romanized Nepali: ideally transliterated to Devanagari first
# (see Preprocessing Pipeline below)
classifier("yo ramro cha")
```
Or manually:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("UDHOV/nepalibert-nepali-hate-classification")
model = AutoModelForSequenceClassification.from_pretrained("UDHOV/nepalibert-nepali-hate-classification")

text = "तिमी देखी घृणा लाग्छ"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

with torch.no_grad():
    logits = model(**inputs).logits

predicted_class = logits.argmax().item()
print(model.config.id2label[predicted_class])
```
Preprocessing Pipeline
The model was trained on text processed through a 5-stage pipeline:
- Script Detection — Unicode-based confidence scoring to classify input as Devanagari, Romanized Nepali, or English
- Script Unification — Romanized Nepali transliterated to Devanagari via ITRANS; English translated to Nepali via Deep Translator API
- Emoji Processing — 180+ emojis semantically mapped to Nepali equivalents; unknown emojis preserved; 18-dimensional emoji feature vector extracted
- Text Cleaning — URL removal, @mention removal, hashtag handling, whitespace normalization
- Feature Extraction — Script metadata, emoji features, and text statistics merged with cleaned text
Note: NepaliBERT's WordPiece tokenizer is optimized for Devanagari. For best results, pre-process Romanized or English inputs through the transliteration/translation pipeline before passing to this model.
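The script-detection stage can be illustrated with a simple character-ratio heuristic. This is a sketch only: the Unicode ranges are standard, but the 0.8/0.2 thresholds and the `detect_script` function name are assumptions, not the pipeline's published confidence-scoring logic.

```python
import re

# Devanagari block U+0900-U+097F vs. basic Latin letters.
DEVANAGARI = re.compile(r"[\u0900-\u097F]")
LATIN = re.compile(r"[A-Za-z]")

def detect_script(text: str) -> str:
    """Classify text as 'devanagari', 'roman', or 'mixed' by character counts."""
    dev = len(DEVANAGARI.findall(text))
    lat = len(LATIN.findall(text))
    if dev + lat == 0:
        return "unknown"
    ratio = dev / (dev + lat)
    if ratio >= 0.8:        # illustrative threshold
        return "devanagari"
    if ratio <= 0.2:        # illustrative threshold
        return "roman"
    return "mixed"
```

Inputs routed to "roman" or "mixed" would then pass through the ITRANS transliteration / translation stage before tokenization.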
Training Data
- Source: Niraula et al. (2021) — Offensive Language Detection in Nepali Social Media (ACL Anthology)
- Platform: Facebook and YouTube comments
- Total samples: 7,625
| Split | NO | OO | OR | OS | Total |
|---|---|---|---|---|---|
| Train | 3,206 (57.7%) | 1,759 (31.6%) | 376 (6.8%) | 214 (3.8%) | 5,555 |
| Validation | 356 (57.4%) | 195 (31.5%) | 42 (6.8%) | 27 (4.4%) | 620 |
| Test | 896 (61.8%) | 486 (33.5%) | 49 (3.4%) | 19 (1.3%) | 1,450 |
Class imbalance: NO vs OS imbalance ratio = 14.98×. Addressed via class-weighted cross-entropy loss with weights capped in the range [0.5, 3.0] to prevent extreme gradient updates from the severely under-represented OS class.
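The capped class weights can be sketched as follows. The inverse-frequency formula is an assumption (the card states only the [0.5, 3.0] cap); the counts are the training-split numbers from the table above.

```python
# Inverse-frequency class weights, capped to [0.5, 3.0] as described above.
# The weighting formula itself is an assumption; only the cap is documented.
counts = {"NO": 3206, "OO": 1759, "OR": 376, "OS": 214}
total = sum(counts.values())        # 5,555 training samples
n_classes = len(counts)

weights = {
    label: min(max(total / (n_classes * c), 0.5), 3.0)
    for label, c in counts.items()
}
print(weights)
```

Note that both minority classes hit the 3.0 cap (the raw OS weight would be roughly 6.5), which is exactly the "extreme gradient update" the cap is meant to prevent.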
Training Configuration
| Hyperparameter | Value |
|---|---|
| Optimizer | AdamW |
| Learning rate | 2e-5 (discriminative LR strategy) |
| Weight decay | 0.01 |
| Warmup steps | 10% of total steps |
| LR schedule | Linear decay |
| Batch size | 16 (grad accum × 2 = effective 32) |
| Max epochs | 5 |
| Early stopping patience | 2 epochs |
| Max sequence length | 128 tokens |
| Dropout (classification head) | 0.3 |
| Label smoothing | 0.05 |
| Class weight capping | [0.5, 3.0] |
| Gradient clipping | 1.0 |
| Loss | Class-weighted cross-entropy |
Training took approximately 3,759 seconds (~62.7 minutes) on a single GPU.
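Most of the table maps directly onto Hugging Face `TrainingArguments`; the sketch below shows that mapping under stated assumptions (`output_dir` and the metric name are placeholders, and the discriminative layer-wise LRs, 0.3 head dropout, and class-weighted loss require a custom `Trainer`/model setup not shown here).

```python
from transformers import EarlyStoppingCallback, TrainingArguments

# Sketch of the hyperparameter table as TrainingArguments; not the authors'
# actual training script.
args = TrainingArguments(
    output_dir="nepalibert-hate",           # placeholder
    learning_rate=2e-5,
    weight_decay=0.01,
    warmup_ratio=0.1,                       # 10% of total steps
    lr_scheduler_type="linear",
    per_device_train_batch_size=16,
    gradient_accumulation_steps=2,          # effective batch size 32
    num_train_epochs=5,
    max_grad_norm=1.0,                      # gradient clipping
    label_smoothing_factor=0.05,
    load_best_model_at_end=True,
    metric_for_best_model="eval_macro_f1",  # supplied by a compute_metrics fn
)
early_stop = EarlyStoppingCallback(early_stopping_patience=2)
```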
Evaluation Results
Test Set Performance
| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| NON_OFFENSIVE | 0.7805 | 0.7701 | 0.7753 | 896 |
| OTHER_OFFENSIVE | 0.6102 | 0.5926 | 0.6013 | 486 |
| OFFENSIVE_RACIST | 0.4085 | 0.5918 | 0.4833 | 49 |
| OFFENSIVE_SEXIST | 0.1739 | 0.2105 | 0.1905 | 19 |
| Macro Avg | 0.4933 | 0.5413 | 0.5126 | 1,450 |
| Weighted Avg | 0.7029 | 0.6972 | 0.6994 | 1,450 |
| Accuracy | | | 0.6972 | 1,450 |
Validation Set Performance (Best Checkpoint)
| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| NON_OFFENSIVE | 0.7961 | 0.8118 | 0.8039 | 356 |
| OTHER_OFFENSIVE | 0.6609 | 0.5897 | 0.6233 | 195 |
| OFFENSIVE_RACIST | 0.6727 | 0.8810 | 0.7629 | 42 |
| OFFENSIVE_SEXIST | 0.8214 | 0.8519 | 0.8364 | 27 |
| Macro Avg | 0.7378 | 0.7836 | 0.7566 | 620 |
| Accuracy | | | 0.7484 | 620 |
NepaliBERT achieved the highest validation macro F1 (0.7566) among all evaluated models, outperforming even XLM-RoBERTa Large (0.7392 val macro F1). The validation-to-test gap is primarily explained by distributional shift in the OR and OS minority classes, not overfitting (train-val loss gap = 0.066).
Primary metric: Macro F1-score. Accuracy is misleading given class imbalance; macro F1 weights all classes equally, making it the appropriate metric for evaluating minority hate class performance.
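The gap between the two averaging schemes can be verified directly from the test-set table above: the macro average gives OR and OS the same weight as the majority classes, while the support-weighted average is dominated by NO and OO.

```python
# Per-class test F1 and support, taken from the table above.
f1      = {"NO": 0.7753, "OO": 0.6013, "OR": 0.4833, "OS": 0.1905}
support = {"NO": 896,    "OO": 486,    "OR": 49,     "OS": 19}

macro_f1 = sum(f1.values()) / len(f1)
weighted_f1 = sum(f1[c] * support[c] for c in f1) / sum(support.values())

print(round(macro_f1, 4))     # ≈ 0.5126, dragged down by OR/OS
print(round(weighted_f1, 4))  # ≈ 0.6994, dominated by NO/OO
```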
Training Dynamics
Training proceeded over approximately 1,000 gradient steps in three phases:
- Phase 1 (steps 0–300): Rapid co-descent of train and validation loss (1.50 → 1.00), faster than XLM-RoBERTa due to Nepali-specific pre-training. Validation F1 rises from 0.26 to 0.47.
- Phase 2 (steps 300–600): Training loss continues declining (~0.90); validation loss stabilizes around 1.00–1.02. Validation F1 improves to 0.65 as OO and OR class discrimination refines.
- Phase 3 (steps 600–1000): Validation F1 peaks near 0.75 at step 700, then settles at 0.72. Post-step-600 divergence between F1 and accuracy reflects a trade-off between majority class accuracy and minority class precision.
The final train-validation loss gap of 0.066 confirms minimal overfitting; poor OS test performance is a data distribution issue rather than model overfitting.
Comparison with Other Models
| Approach | Model | Accuracy | Macro F1 |
|---|---|---|---|
| Classical ML | Logistic Regression (TF-IDF) | 0.7538 | 0.5701 |
| Classical ML | SVM | 0.7552 | 0.5502 |
| Deep Learning | GRU + Word2Vec | — | 0.3307 (test) |
| Transformer | XLM-RoBERTa Large | 0.7034 | 0.5465 |
| Transformer | NepaliBERT (this model) | 0.6972 | 0.5126 |
Per-Class F1 Comparison (Test Set)
| Model | Macro F1 | NO | OO | OR | OS |
|---|---|---|---|---|---|
| Logistic Regression | 0.5701 | 0.8225 | 0.6722 | 0.5000 | 0.2857 |
| SVM | 0.5502 | 0.8288 | 0.6659 | 0.4660 | 0.2400 |
| XLM-RoBERTa Large | 0.5465 | 0.7825 | 0.6306 | 0.3731 | 0.4000 |
| NepaliBERT (this model) | 0.5126 | 0.7753 | 0.6013 | 0.4833 | 0.1905 |
Key finding: NepaliBERT achieves the best OR class F1 (0.4833) among all models, outperforming XLM-RoBERTa Large (0.3731), confirming that Nepali domain pre-training provides a meaningful advantage for ethnicity/caste-related hate content. XLM-RoBERTa Large outperforms NepaliBERT on the OS class (0.4000 vs 0.1905).
Limitations
- Romanized Nepali coverage: NepaliBERT's pre-training corpus is predominantly Devanagari, limiting its ability to handle Romanized Nepali without prior transliteration. The OR test set contains 59.2% Romanized script vs 46.1% in training, contributing to the validation-to-test gap.
- OS class collapse: With only 19 OS test samples, high length mismatch (train avg 13.1 words vs test avg 19.9 words), and narrow training vocabulary, OS results (F1 = 0.1905) should be interpreted with significant caution.
- Optimal checkpoint sensitivity: NepaliBERT shows a more pronounced F1 peak-and-drop than XLM-RoBERTa, making it more sensitive to early stopping checkpoint selection.
- Preprocessing dependency: Performance on Romanized or English inputs degrades without prior transliteration/translation through the preprocessing pipeline.
- Language scope: Optimized specifically for Nepali. Not evaluated on other South Asian languages.
Intended Use
- Automated hate content moderation on Nepali social media platforms, especially where content is primarily in Devanagari script
- Research on Nepali-specific NLP and low-resource hate speech detection
- Comparative study of language-specific vs multilingual transformer models
- Explainable AI integration — this model was evaluated with LIME, SHAP, and Captum-based Integrated Gradients for token-level attribution
Out-of-scope uses: This model should not be used as the sole decision-making system for content removal without human review. OS class predictions carry particularly high uncertainty due to extremely limited test support.
Explainability
The deployment system integrates three complementary XAI methods for token-level explanation of predictions:
- LIME — Local surrogate model via word masking perturbations
- SHAP — Shapley value attribution (KernelSHAP)
- Integrated Gradients (Captum) — Gradient-based attribution along input-to-baseline path
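The word-masking idea behind the LIME setup can be illustrated with a toy perturbation loop: drop each word and measure how much the classifier's score changes. The `toy_score` keyword stand-in below is purely illustrative; with the deployed model, the score would be the predicted class probability.

```python
# Toy illustration of LIME-style word-masking attribution; toy_score is a
# stand-in keyword classifier, not the real model.
def toy_score(text: str) -> float:
    return 1.0 if "घृणा" in text else 0.0  # "घृणा" = "hatred"

def word_importance(text: str) -> dict:
    """Importance of each word = score drop when that word is removed."""
    words = text.split()
    base = toy_score(text)
    return {
        w: base - toy_score(" ".join(words[:i] + words[i + 1:]))
        for i, w in enumerate(words)
    }

print(word_importance("तिमी देखी घृणा लाग्छ"))
```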
Citation
If you use this model, please cite the original dataset:
```bibtex
@inproceedings{niraula2021offensive,
  title     = {Offensive Language Detection in Nepali Social Media},
  author    = {Niraula, Nobal B. and Dulal, Saurav and Koirala, Diwa},
  booktitle = {Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021)},
  pages     = {67--75},
  year      = {2021}
}
```
And the base model:
```bibtex
@article{thapa2024nepali,
  title   = {Development of Pre-trained Transformer-based Models for the Nepali Language},
  author  = {Thapa, Prashant and Sharma, Prajwal and Kharel, Aman},
  journal = {Transactions on Asian and Low-Resource Language Information Processing},
  year    = {2024}
}
```
Authors
Uddav Rajbhandari
Department of Computer and Electronics Engineering, Khwopa College of Engineering, Tribhuvan University, Nepal (2026)