BanglaBERT Hate Speech Detection - Production V3

Production V3 multi-label hate speech detection model for Bangla text.

Model Architecture

  • Base Model: Bangla BERT Base (sagorsarker/bangla-bert-base)
  • Architecture: Advanced Dual-Head with Label-Aware Attention
  • Task: Multi-label classification
  • Labels: 7 categories (vulgar, hate, religious, threat, troll, insult, safe)

Features

  • Label-Aware Attention: Each label has a dedicated attention mechanism over token representations
  • Multi-Scale Feature Extraction: 3 convolutional scales (kernel 3, 5, 7)
  • Label Co-occurrence Module: Captures inter-label relationships
  • Per-Label Threshold Optimization: Individual thresholds for each label
  • Safe Label Exclusivity: Intelligent conflict resolution
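The exact head implementation ships inside full_model.pt; purely as an illustration, a label-aware attention layer of the kind described above could look like the sketch below (hidden size, scaling, and module names are assumptions, not the actual code):

```python
import torch
import torch.nn as nn

class LabelAwareAttention(nn.Module):
    """One learned query per label attends over token embeddings,
    producing a per-label pooled representation (illustrative only)."""
    def __init__(self, hidden_size: int, num_labels: int):
        super().__init__()
        self.label_queries = nn.Parameter(torch.randn(num_labels, hidden_size))
        self.scale = hidden_size ** 0.5

    def forward(self, token_embeds, attention_mask):
        # token_embeds: (batch, seq, hidden); attention_mask: (batch, seq)
        scores = torch.einsum('bsh,lh->bls', token_embeds, self.label_queries) / self.scale
        scores = scores.masked_fill(attention_mask.unsqueeze(1) == 0, float('-inf'))
        weights = torch.softmax(scores, dim=-1)                     # (batch, labels, seq)
        return torch.einsum('bls,bsh->blh', weights, token_embeds)  # (batch, labels, hidden)

# Shape check with a dummy encoder output (batch=2, seq=16, hidden=768, 7 labels)
x = torch.randn(2, 16, 768)
mask = torch.ones(2, 16, dtype=torch.long)
pooled = LabelAwareAttention(768, 7)(x, mask)
```

Each of the 7 pooled vectors would then feed its label's classification head, which is what makes the attention "label-aware".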

Usage

import torch
from transformers import AutoTokenizer
from huggingface_hub import hf_hub_download
import pickle

# Load tokenizer directly from the Hub
repo_id = "tamim65/banglabert-hate-speech-prod-v3"
tokenizer = AutoTokenizer.from_pretrained(repo_id)

# Download and load the full model. The architecture is custom, so the whole
# module is pickled; weights_only=False is required on recent PyTorch versions
# (only load pickled checkpoints from sources you trust).
model_path = hf_hub_download(repo_id, "full_model.pt")
model = torch.load(model_path, map_location='cpu', weights_only=False)
model.eval()

# Load label metadata and per-label thresholds
with open(hf_hub_download(repo_id, "metadata.pkl"), 'rb') as f:
    metadata = pickle.load(f)
with open(hf_hub_download(repo_id, "optimal_thresholds.pkl"), 'rb') as f:
    optimal_thresholds = pickle.load(f)

# Predict
def predict(text):
    inputs = tokenizer(text, return_tensors='pt', padding=True, truncation=True, max_length=128)
    with torch.no_grad():
        logits = model(**inputs)  # custom head returns raw logits, shape (1, num_labels)
        probs = torch.sigmoid(logits).cpu().numpy()[0]
    
    # Apply thresholds
    predictions = {}
    for i, label in enumerate(metadata['label_names']):
        predictions[label] = {
            'probability': float(probs[i]),
            'predicted': bool(probs[i] >= optimal_thresholds[i])
        }
    
    return predictions

# Example (Bangla for "Your comment is very nice")
text = "আপনার মন্তব্য খুব সুন্দর"
result = predict(text)
print(result)
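The "Safe Label Exclusivity" feature resolves conflicts between the safe label and the harmful labels inside the model repo; a plausible post-processing rule over predict()'s output dictionary might look like this (the keep-the-higher-probability rule is an assumption for illustration, not the shipped logic):

```python
def resolve_safe_conflict(predictions, safe_label='safe'):
    """If 'safe' fires together with harmful labels, keep only the
    higher-probability side (hypothetical rule, for illustration)."""
    harmful = {k: v for k, v in predictions.items()
               if k != safe_label and v['predicted']}
    safe = predictions.get(safe_label)
    if safe and safe['predicted'] and harmful:
        if safe['probability'] >= max(v['probability'] for v in harmful.values()):
            for k in harmful:                       # safe wins: clear harmful flags
                predictions[k]['predicted'] = False
        else:                                       # harmful wins: clear safe flag
            predictions[safe_label]['predicted'] = False
    return predictions

# Toy conflict: both 'safe' and 'insult' predicted; 'safe' has the higher score
preds = {
    'safe':   {'probability': 0.91, 'predicted': True},
    'insult': {'probability': 0.62, 'predicted': True},
}
resolved = resolve_safe_conflict(preds)
```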

Performance

  • Individually optimized decision threshold for each label
  • Handles texts that trigger multiple labels at once
  • Safe-label exclusivity reduces false positives from conflicting safe/harmful predictions
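The per-label thresholds stored in optimal_thresholds.pkl were presumably tuned against held-out labels; a standard way to derive such values is a per-label F1 sweep over a threshold grid (illustrative, not necessarily the exact procedure used for this model):

```python
import numpy as np

def best_threshold(y_true, y_prob, grid=None):
    """Return the threshold in `grid` that maximizes F1 for one label."""
    if grid is None:
        grid = np.linspace(0.05, 0.95, 91)  # 0.05, 0.06, ..., 0.95
    best_t, best_f1 = 0.5, -1.0
    for t in grid:
        pred = (y_prob >= t).astype(int)
        tp = int(((pred == 1) & (y_true == 1)).sum())
        fp = int(((pred == 1) & (y_true == 0)).sum())
        fn = int(((pred == 0) & (y_true == 1)).sum())
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best_f1:
            best_t, best_f1 = float(t), f1
    return best_t, best_f1

# Toy example: one label's validation probabilities and gold targets
y_true = np.array([1, 1, 0, 0, 1, 0])
y_prob = np.array([0.8, 0.55, 0.4, 0.2, 0.6, 0.45])
t, f1 = best_threshold(y_true, y_prob)
```

Running the sweep independently per label is what yields the per-label threshold vector the usage code above indexes with optimal_thresholds[i].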

Training Details

  • Optimizer: AdamW
  • Learning Rate: 2e-5
  • Batch Size: 16
  • Epochs: 10
  • Loss: Binary Cross Entropy with Logits
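The optimizer and loss above combine in the usual multi-label setup; a minimal sketch with a toy linear head standing in for the full BanglaBERT-based model (feature dimension and batch contents are illustrative):

```python
import torch
import torch.nn as nn

# Toy linear head in place of the full architecture; 768 features, 7 labels
head = nn.Linear(768, 7)
optimizer = torch.optim.AdamW(head.parameters(), lr=2e-5)
criterion = nn.BCEWithLogitsLoss()  # independent sigmoid + BCE per label

features = torch.randn(16, 768)                 # stand-in for encoder outputs
targets = torch.randint(0, 2, (16, 7)).float()  # multi-hot label matrix

logits = head(features)
loss = criterion(logits, targets)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

BCEWithLogitsLoss treats each of the 7 labels as an independent binary decision, which is what allows a single text to be, say, both vulgar and insulting.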

Files

  • full_model.pt: Complete model with custom architecture
  • model.safetensors: Model weights (safetensors format)
  • config.json: Model configuration
  • tokenizer.json, vocab.txt: Tokenizer files
  • metadata.pkl: Label names and metadata
  • optimal_thresholds.pkl: Per-label threshold values

Citation

If you use this model, please cite:

@misc{banglabert-hate-speech-v3,
  author = {Tamim},
  title = {BanglaBERT Hate Speech Detection - Production V3},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/tamim65/banglabert-hate-speech-prod-v3}}
}

License

MIT License

Model size: 0.1B parameters (F32, safetensors format)