UDHOV/nepalibert-nepali-hate-classification
@@ -1,199 +1,273 @@
 ---
-library_name: transformers
-tags: []
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]

 ---
+language:
+  - ne
+tags:
+  - text-classification
+  - hate-speech-detection
+  - offensive-language
+  - nepali
+  - devanagari
+  - low-resource-nlp
+  - bert
+  - nepali-bert
+datasets:
+  - niraula2021nepali
+metrics:
+  - f1
+  - accuracy
+license: mit
+pipeline_tag: text-classification
+base_model: Rajan/NepaliBERT
+paper: https://aclanthology.org/2021.woah-1.7
 ---
+# NepaliBERT — Nepali Hate Content Classification
+Fine-tuned [NepaliBERT](https://huggingface.co/Rajan/NepaliBERT) for four-class hate content classification of Nepali social media text. The model targets Devanagari-script Nepali; Romanized and English inputs are meant to be transliterated or translated by the preprocessing pipeline before inference.
+## Model Description
+This model was developed as part of a Bachelor of Computer Engineering final project at Khwopa College of Engineering, Tribhuvan University (February 2026). It classifies Nepali social media comments into four categories: non-offensive, plus three types of offensive content.
+**Base model:** `Rajan/NepaliBERT` (110M parameters, 12 transformer layers, pre-trained on a large Nepali corpus using masked language modelling)
+**Task:** Multi-class text classification (4 classes)
+**Languages:** Nepali (Devanagari primary), Romanized Nepali, code-mixed
+> **Compared to XLM-RoBERTa Large (our other model):** NepaliBERT's Nepali-specific pre-training gives it stronger Devanagari understanding and the best OR (Offensive-Racist) class F1 among the transformer models evaluated (0.4833 vs 0.3731); the TF-IDF logistic regression baseline scores slightly higher on this class (0.5000). NepaliBERT has limited exposure to Romanized Nepali and English, making XLM-RoBERTa more robust on heavily code-mixed inputs.
+---
+## Labels
+| ID | Label | Description |
+|----|-------|-------------|
+| 0 | `NON_OFFENSIVE` | Text containing no offensive content |
+| 1 | `OTHER_OFFENSIVE` | General offensive content not targeting specific groups |
+| 2 | `OFFENSIVE_RACIST` | Content targeting individuals/groups based on ethnicity, race, or caste |
+| 3 | `OFFENSIVE_SEXIST` | Content targeting individuals based on gender |
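The label table above can be mirrored as a small lookup, for example to translate the short codes used in later tables (NO, OO, OR, OS) into label names. This is a sketch that assumes the hosted model config uses exactly the IDs and names listed in the table; `SHORT_CODES` and `describe` are illustrative helpers, not part of the model.

```python
# ID-to-label mapping copied from the Labels table.
ID2LABEL = {
    0: "NON_OFFENSIVE",
    1: "OTHER_OFFENSIVE",
    2: "OFFENSIVE_RACIST",
    3: "OFFENSIVE_SEXIST",
}
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}

# Hypothetical helper: short codes as used in the data and results tables.
SHORT_CODES = {"NO": 0, "OO": 1, "OR": 2, "OS": 3}


def describe(code: str) -> str:
    """Return the full label name for a short class code."""
    return ID2LABEL[SHORT_CODES[code]]


print(describe("OR"))
```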
+---
+## Usage
+```python
+from transformers import pipeline
+classifier = pipeline(
+    "text-classification",
+    model="UDHOV/nepalibert-nepali-hate-classification"
+)
+# Devanagari input; each call returns a list with one {"label", "score"} dict
+print(classifier("यो राम्रो छ"))
+# Romanized Nepali runs as-is, but transliterating it to Devanagari first is recommended
+print(classifier("yo ramro cha"))
+```
+Or manually:
+```python
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+import torch
+tokenizer = AutoTokenizer.from_pretrained("UDHOV/nepalibert-nepali-hate-classification")
+model = AutoModelForSequenceClassification.from_pretrained("UDHOV/nepalibert-nepali-hate-classification")
+model.eval()  # make sure dropout is disabled for inference
+text = "तिमी देखी घृणा लाग्छ"
+inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
+with torch.no_grad():
+    logits = model(**inputs).logits
+# Take the highest-scoring class along the label dimension (single input)
+predicted_class = logits.argmax(dim=-1).item()
+print(model.config.id2label[predicted_class])
+```
+---
+## Preprocessing Pipeline
+The model was trained on text processed through a 5-stage pipeline:
+1. **Script Detection** — Unicode-based confidence scoring to classify input as Devanagari, Romanized Nepali, or English
+2. **Script Unification** — Romanized Nepali transliterated to Devanagari via ITRANS; English translated to Nepali via Deep Translator API
+3. **Emoji Processing** — 180+ emojis semantically mapped to Nepali equivalents; unknown emojis preserved; 18-dimensional emoji feature vector extracted
+4. **Text Cleaning** — URL removal, @mention removal, hashtag handling, whitespace normalization
+5. **Feature Extraction** — Script metadata, emoji features, and text statistics merged with cleaned text
+> **Note:** NepaliBERT's WordPiece tokenizer is optimized for Devanagari. For best results, pre-process Romanized or English inputs through the transliteration/translation pipeline before passing to this model.
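Stages 1 and 4 of the pipeline above can be approximated in a few lines. The following is a minimal sketch, not the project's implementation: it assumes script detection by the share of Devanagari (U+0900 to U+097F) versus Latin letters, and it assumes "hashtag handling" means dropping the `#` while keeping the word. The names `detect_script` and `clean_text` and the 0.5 threshold are hypothetical.

```python
import re

DEVANAGARI = re.compile(r"[\u0900-\u097F]")
LATIN = re.compile(r"[A-Za-z]")
URL = re.compile(r"https?://\S+|www\.\S+")
MENTION = re.compile(r"@\w+")


def detect_script(text: str, threshold: float = 0.5) -> str:
    """Label text by the share of Devanagari vs. Latin letters (assumed rule)."""
    deva = len(DEVANAGARI.findall(text))
    latin = len(LATIN.findall(text))
    total = deva + latin
    if total == 0:
        return "unknown"
    if deva / total >= threshold:
        return "devanagari"
    # Telling Romanized Nepali apart from English needs a lexicon or a
    # language-ID model; this sketch does not attempt it.
    return "latin"


def clean_text(text: str) -> str:
    """Remove URLs and @mentions, drop '#' from hashtags, collapse whitespace."""
    text = URL.sub(" ", text)
    text = MENTION.sub(" ", text)
    text = text.replace("#", " ")  # assumption: keep the hashtag word itself
    return re.sub(r"\s+", " ", text).strip()


print(detect_script("यो राम्रो छ"))
print(detect_script("yo ramro cha"))
print(clean_text("@ram  यो राम्रो छ https://example.com #nepal"))
```

Text labelled `latin` would then go to the transliteration or translation step; that step depends on external tools and is left out here.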
+---
+## Training Data
+- **Source:** Niraula et al. (2021) — *Offensive Language Detection in Nepali Social Media* ([ACL Anthology](https://aclanthology.org/2021.woah-1.7))
+- **Platform:** Facebook and YouTube comments
+- **Total samples:** 7,625
+| Split | NO (non-offensive) | OO (other offensive) | OR (offensive racist) | OS (offensive sexist) | Total |
+|-------|----|----|----|----|-------|
+| Train | 3,206 (57.7%) | 1,759 (31.7%) | 376 (6.8%) | 214 (3.9%) | 5,555 |
+| Validation | 356 (57.4%) | 195 (31.5%) | 42 (6.8%) | 27 (4.4%) | 620 |
+| Test | 896 (61.8%) | 486 (33.5%) | 49 (3.4%) | 19 (1.3%) | 1,450 |
+**Class imbalance:** The training split has 14.98× as many NO samples as OS samples (3,206 vs 214). This is addressed with a class-weighted cross-entropy loss whose weights are clipped to the range [0.5, 3.0], which prevents extreme gradient updates from the severely under-represented OS class.
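To make the weight capping concrete, the sketch below computes clipped class weights from the training counts in the table. The card does not state how the raw weights were derived, so the "balanced" inverse-frequency formula `total / (n_classes * count)` is an assumption, and `capped_class_weights` is a hypothetical helper.

```python
# Training-split counts from the table above.
train_counts = {"NO": 3206, "OO": 1759, "OR": 376, "OS": 214}


def capped_class_weights(counts, low=0.5, high=3.0):
    """Inverse-frequency weights clipped to [low, high]."""
    total = sum(counts.values())
    n_classes = len(counts)
    weights = {}
    for label, count in counts.items():
        raw = total / (n_classes * count)  # assumed "balanced" formula
        weights[label] = min(max(raw, low), high)
    return weights


weights = capped_class_weights(train_counts)
imbalance = train_counts["NO"] / train_counts["OS"]
print({label: round(w, 4) for label, w in weights.items()})
print(round(imbalance, 2))
```

Under this assumed formula the raw OR and OS weights (about 3.69 and 6.49) are both clipped to 3.0, and the NO weight (about 0.43) is raised to 0.5. Weights like these could be passed as a tensor to `torch.nn.CrossEntropyLoss(weight=..., label_smoothing=0.05)`.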
+---
+## Training Configuration
+| Hyperparameter | Value |
+|----------------|-------|
+| Optimizer | AdamW |
+| Learning rate | 2e-5 (discriminative LR strategy) |
+| Weight decay | 0.01 |
+| Warmup steps | 10% of total steps |
+| LR schedule | Linear decay |
+| Batch size | 16, with 2 gradient accumulation steps (effective batch size 32) |
+| Max epochs | 5 |
+| Early stopping patience | 2 epochs |
+| Max sequence length | 128 tokens |
+| Dropout (classification head) | 0.3 |
+| Label smoothing | 0.05 |
+| Class weight capping | [0.5, 3.0] |
+| Gradient clipping | 1.0 |
+| Loss | Class-weighted cross-entropy |
+Training took approximately 3,759 seconds (~62.7 minutes) on a single GPU.
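The warmup and linear-decay rows of the table describe a learning-rate curve that can be sketched as a plain function. This is an illustration with the same shape as the schedule produced by `get_linear_schedule_with_warmup` in `transformers`, not the project's training code; the total of 1,000 steps is taken from the approximate figure in the Training Dynamics section, and the discriminative per-layer rates are not modelled.

```python
def linear_warmup_decay(step, total_steps, warmup_frac=0.10, peak_lr=2e-5):
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_frac)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


total = 1000  # approximate step count, assumed for illustration
for step in (0, 50, 100, 550, 1000):
    print(step, linear_warmup_decay(step, total))
```

With these settings the rate climbs to its 2e-5 peak at step 100, is back at half the peak by step 550, and reaches zero at the final step.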
+---
+## Evaluation Results
+### Test Set Performance
+| Class | Precision | Recall | F1-Score | Support |
+|-------|-----------|--------|----------|---------|
+| NON_OFFENSIVE | 0.7805 | 0.7701 | 0.7753 | 896 |
+| OTHER_OFFENSIVE | 0.6102 | 0.5926 | 0.6013 | 486 |
+| OFFENSIVE_RACIST | 0.4085 | 0.5918 | **0.4833** | 49 |
+| OFFENSIVE_SEXIST | 0.1739 | 0.2105 | 0.1905 | 19 |
+| **Macro Avg** | **0.4933** | **0.5413** | **0.5126** | 1,450 |
+| **Weighted Avg** | 0.7029 | 0.6972 | 0.6994 | 1,450 |
+| **Accuracy** | | | **0.6972** | 1,450 |
+### Validation Set Performance (Best Checkpoint)
+| Class | Precision | Recall | F1-Score | Support |
+|-------|-----------|--------|----------|---------|
+| NON_OFFENSIVE | 0.7961 | 0.8118 | 0.8039 | 356 |
+| OTHER_OFFENSIVE | 0.6609 | 0.5897 | 0.6233 | 195 |
+| OFFENSIVE_RACIST | 0.6727 | 0.8810 | 0.7629 | 42 |
+| OFFENSIVE_SEXIST | 0.8214 | 0.8519 | 0.8364 | 27 |
+| **Macro Avg** | **0.7378** | **0.7836** | **0.7566** | 620 |
+| **Accuracy** | | | **0.7484** | 620 |
+> NepaliBERT achieved the **highest validation macro F1 (0.7566)** among all evaluated models, outperforming even XLM-RoBERTa Large (0.7392 val macro F1). The validation-to-test gap is primarily explained by distributional shift in the OR and OS minority classes, not overfitting (train-val loss gap = 0.066).
+> **Primary metric:** Macro F1-score. Accuracy is misleading given class imbalance; macro F1 weights all classes equally, making it the appropriate metric for evaluating minority hate class performance.
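The difference between macro and weighted averaging can be checked directly from the per-class test F1 scores and supports reported above. The snippet only re-does the arithmetic from the table; it does not re-run the evaluation.

```python
# Per-class test F1 and support, copied from the Test Set Performance table.
test_f1 = {"NO": 0.7753, "OO": 0.6013, "OR": 0.4833, "OS": 0.1905}
support = {"NO": 896, "OO": 486, "OR": 49, "OS": 19}

# Macro F1: every class counts equally, regardless of size.
macro_f1 = sum(test_f1.values()) / len(test_f1)

# Weighted F1: each class counts in proportion to its support.
weighted_f1 = sum(test_f1[c] * support[c] for c in test_f1) / sum(support.values())

print(f"macro F1    = {macro_f1:.4f}")
print(f"weighted F1 = {weighted_f1:.4f}")
```

The two minority classes make up 68 of 1,450 test samples, so they barely move the weighted average, while in the macro average they carry half of the total weight.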
+---
+## Training Dynamics
+Training proceeded over approximately 1,000 gradient steps in three phases:
+- **Phase 1 (steps 0–300):** Rapid co-descent of train and validation loss (1.50 → 1.00), faster than XLM-RoBERTa due to Nepali-specific pre-training. Validation F1 rises from 0.26 to 0.47.
+- **Phase 2 (steps 300–600):** Training loss continues declining (~0.90); validation loss stabilizes around 1.00–1.02. Validation F1 improves to 0.65 as OO and OR class discrimination refines.
+- **Phase 3 (steps 600–1000):** Validation F1 peaks near 0.75 at step 700, then settles at 0.72. Post-step-600 divergence between F1 and accuracy reflects a trade-off between majority class accuracy and minority class precision.
+The final train-validation loss gap of 0.066 indicates little overfitting, so the poor OS test performance more likely stems from the train-test distribution differences described under Limitations than from overfitting.
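Because validation F1 peaks and then drops, the checkpoint that gets kept depends on the early-stopping rule. The sketch below shows patience-based selection with the patience of 2 from the training configuration. It assumes the monitored quantity is validation macro F1 evaluated once per epoch, and the score list is hypothetical, shaped loosely like the curve described above rather than taken from training logs.

```python
def select_best_epoch(val_scores, patience=2):
    """Return (epoch, score) of the best validation score under early stopping."""
    best_epoch, best_score, waited = -1, float("-inf"), 0
    for epoch, score in enumerate(val_scores):
        if score > best_score:
            best_epoch, best_score, waited = epoch, score, 0
        else:
            waited += 1
            if waited >= patience:
                break  # no improvement for `patience` evaluations
    return best_epoch, best_score


# Hypothetical per-epoch validation macro F1 values (illustrative only).
scores = [0.47, 0.65, 0.75, 0.72, 0.72]
print(select_best_epoch(scores))
```

A lower patience stops sooner after the peak, while a higher one spends more epochs past it; in both cases the reported checkpoint is the best one seen before stopping.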
+---
+## Comparison with Other Models
+| Approach | Model | Accuracy | Macro F1 |
+|----------|-------|----------|----------|
+| Classical ML | Logistic Regression (TF-IDF) | 0.7538 | 0.5701 |
+| Classical ML | SVM | 0.7552 | 0.5502 |
+| Deep Learning | GRU + Word2Vec | — | 0.3307 (test) |
+| Transformer | XLM-RoBERTa Large | 0.7034 | 0.5465 |
+| **Transformer** | **NepaliBERT (this model)** | **0.6972** | **0.5126** |
+### Per-Class F1 Comparison (Test Set)
+| Model | Macro F1 | NO | OO | OR | OS |
+|-------|----------|----|----|----|----|
+| Logistic Regression | 0.5701 | 0.8225 | 0.6722 | **0.5000** | 0.2857 |
+| SVM | 0.5502 | 0.8288 | 0.6659 | 0.4660 | 0.2400 |
+| XLM-RoBERTa Large | 0.5465 | 0.7825 | 0.6306 | 0.3731 | **0.4000** |
+| **NepaliBERT (this model)** | 0.5126 | 0.7753 | 0.6013 | 0.4833 | 0.1905 |
+> **Key finding:** NepaliBERT achieves the best OR class F1 among the transformer models (0.4833 vs 0.3731 for XLM-RoBERTa Large), which suggests that Nepali-specific pre-training helps with ethnicity- and caste-related hate content; the logistic regression baseline is slightly higher still (0.5000), and with only 49 OR test samples these differences are tentative. XLM-RoBERTa Large outperforms NepaliBERT on the OS class (0.4000 vs 0.1905).
+---
+## Limitations
+- **Romanized Nepali coverage:** NepaliBERT's pre-training corpus is predominantly Devanagari, limiting its ability to handle Romanized Nepali without prior transliteration. The OR test set contains 59.2% Romanized script vs 46.1% in training, contributing to the validation-to-test gap.
+- **OS class collapse:** With only 19 OS test samples, high length mismatch (train avg 13.1 words vs test avg 19.9 words), and narrow training vocabulary, OS results (F1 = 0.1905) should be interpreted with significant caution.
+- **Optimal checkpoint sensitivity:** NepaliBERT shows a more pronounced F1 peak-and-drop than XLM-RoBERTa, making it more sensitive to early stopping checkpoint selection.
+- **Preprocessing dependency:** Performance on Romanized or English inputs degrades without prior transliteration/translation through the preprocessing pipeline.
+- **Language scope:** Optimized specifically for Nepali. Not evaluated on other South Asian languages.
+---
+## Intended Use
+- Automated hate content moderation on Nepali social media platforms, especially where content is primarily in Devanagari script
+- Research on Nepali-specific NLP and low-resource hate speech detection
+- Comparative study of language-specific vs multilingual transformer models
+- Explainable AI integration — this model was evaluated with LIME, SHAP, and Captum-based Integrated Gradients for token-level attribution
+**Out-of-scope uses:** This model should not be used as the sole decision-making system for content removal without human review. OS class predictions carry particularly high uncertainty due to extremely limited test support.
+---
+## Explainability
+The deployment system integrates three complementary XAI methods for token-level explanation of predictions:
+- **LIME** — Local surrogate model via word masking perturbations
+- **SHAP** — Shapley value attribution (KernelSHAP)
+- **Integrated Gradients (Captum)** — Gradient-based attribution along input-to-baseline path
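As a rough illustration of the perturbation idea behind the LIME entry above, the sketch below scores each word by how much a prediction drops when that word is removed (leave-one-out occlusion). This is simpler than LIME, which fits a local surrogate model over many random maskings, and it is not the deployment system's code. `toy_score` is a hypothetical stand-in for the classifier's probability of an offensive class.

```python
from typing import Callable, List, Tuple


def occlusion_attribution(
    text: str, score_fn: Callable[[str], float]
) -> List[Tuple[str, float]]:
    """Score each word by the drop in score_fn when that word is removed."""
    words = text.split()
    base = score_fn(text)
    attributions = []
    for i, word in enumerate(words):
        masked = " ".join(words[:i] + words[i + 1:])
        attributions.append((word, base - score_fn(masked)))
    return attributions


def toy_score(text: str) -> float:
    """Hypothetical scorer: high when the word for 'hatred' is present."""
    return 0.9 if "घृणा" in text.split() else 0.1


for word, delta in occlusion_attribution("तिमी देखी घृणा लाग्छ", toy_score):
    print(word, round(delta, 2))
```

In practice `score_fn` would wrap the fine-tuned model and return the softmax probability of the predicted class, at the cost of one forward pass per word.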
+---
+## Citation
+If you use this model, please cite the original dataset:
+```bibtex
+@inproceedings{niraula2021offensive,
+  title={Offensive Language Detection in Nepali Social Media},
+  author={Niraula, Nobal B. and Dulal, Saurav and Koirala, Diwa},
+  booktitle={Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021)},
+  pages={67--75},
+  year={2021}
+}
+```
+And the base model:
+```bibtex
+@article{thapa2024nepali,
+  title={Development of Pre-trained Transformer-based Models for the Nepali Language},
+  author={Thapa, Prashant and Sharma, Prajwal and Kharel, Aman},
+  journal={Transactions on Asian and Low-Resource Language Information Processing},
+  year={2024}
+}
+```
+---
+## Authors
+**Uddav Rajbhandari**
+Department of Computer and Electronics Engineering
+Khwopa College of Engineering, Tribhuvan University, Nepal (2026)