BanglaBERT Hate Speech Detection - Production V3

Production V3 multi-label hate speech detection model for Bangla text.

Model Architecture

  • Base Model: Bangla BERT Base (sagorsarker/bangla-bert-base)
  • Architecture: Advanced Dual-Head with Label-Aware Attention
  • Task: Multi-label classification
  • Labels: 7 categories (vulgar, hate, religious, threat, troll, insult, safe)

Features

  • Label-Aware Attention: Each label has a dedicated attention mechanism over token representations
  • Multi-Scale Feature Extraction: 3 convolutional scales (kernel 3, 5, 7)
  • Label Co-occurrence Module: Captures inter-label relationships
  • Per-Label Threshold Optimization: Individual thresholds for each label
  • Safe Label Exclusivity: Intelligent conflict resolution
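The exact head implementation ships inside full_model.pt; purely as an illustration, a label-aware attention layer of the kind described above could look like the sketch below (hidden size, scaling, and module names are assumptions, not the actual code):

```python
import torch
import torch.nn as nn

class LabelAwareAttention(nn.Module):
    """One learned query per label attends over token embeddings,
    producing a per-label pooled representation (illustrative only)."""
    def __init__(self, hidden_size: int, num_labels: int):
        super().__init__()
        self.label_queries = nn.Parameter(torch.randn(num_labels, hidden_size))
        self.scale = hidden_size ** 0.5

    def forward(self, token_embeds, attention_mask):
        # token_embeds: (batch, seq, hidden); attention_mask: (batch, seq)
        scores = torch.einsum('bsh,lh->bls', token_embeds, self.label_queries) / self.scale
        scores = scores.masked_fill(attention_mask.unsqueeze(1) == 0, float('-inf'))
        weights = torch.softmax(scores, dim=-1)                     # (batch, labels, seq)
        return torch.einsum('bls,bsh->blh', weights, token_embeds)  # (batch, labels, hidden)

# Shape check with a dummy encoder output (batch=2, seq=16, hidden=768, 7 labels)
x = torch.randn(2, 16, 768)
mask = torch.ones(2, 16, dtype=torch.long)
pooled = LabelAwareAttention(768, 7)(x, mask)
```

Each of the 7 pooled vectors would then feed its label's classification head, which is what makes the attention "label-aware".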

Usage

import torch
from transformers import AutoTokenizer
from huggingface_hub import hf_hub_download
import pickle

# Load tokenizer directly from the Hub
repo_id = "tamim65/banglabert-hate-speech-prod-v3"
tokenizer = AutoTokenizer.from_pretrained(repo_id)

# Download and load the full model. The architecture is custom, so the whole
# module is pickled; weights_only=False is required on recent PyTorch versions
# (only load pickled checkpoints from sources you trust).
model_path = hf_hub_download(repo_id, "full_model.pt")
model = torch.load(model_path, map_location='cpu', weights_only=False)
model.eval()

# Load label metadata and per-label thresholds
with open(hf_hub_download(repo_id, "metadata.pkl"), 'rb') as f:
    metadata = pickle.load(f)
with open(hf_hub_download(repo_id, "optimal_thresholds.pkl"), 'rb') as f:
    optimal_thresholds = pickle.load(f)

# Predict
def predict(text):
    inputs = tokenizer(text, return_tensors='pt', padding=True, truncation=True, max_length=128)
    with torch.no_grad():
        logits = model(**inputs)  # custom head returns raw logits, shape (1, num_labels)
        probs = torch.sigmoid(logits).cpu().numpy()[0]
    
    # Apply thresholds
    predictions = {}
    for i, label in enumerate(metadata['label_names']):
        predictions[label] = {
            'probability': float(probs[i]),
            'predicted': bool(probs[i] >= optimal_thresholds[i])
        }
    
    return predictions

# Example (Bangla for "Your comment is very nice")
text = "আপনার মন্তব্য খুব সুন্দর"
result = predict(text)
print(result)
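The "Safe Label Exclusivity" feature resolves conflicts between the safe label and the harmful labels inside the model repo; a plausible post-processing rule over predict()'s output dictionary might look like this (the keep-the-higher-probability rule is an assumption for illustration, not the shipped logic):

```python
def resolve_safe_conflict(predictions, safe_label='safe'):
    """If 'safe' fires together with harmful labels, keep only the
    higher-probability side (hypothetical rule, for illustration)."""
    harmful = {k: v for k, v in predictions.items()
               if k != safe_label and v['predicted']}
    safe = predictions.get(safe_label)
    if safe and safe['predicted'] and harmful:
        if safe['probability'] >= max(v['probability'] for v in harmful.values()):
            for k in harmful:                       # safe wins: clear harmful flags
                predictions[k]['predicted'] = False
        else:                                       # harmful wins: clear safe flag
            predictions[safe_label]['predicted'] = False
    return predictions

# Toy conflict: both 'safe' and 'insult' predicted; 'safe' has the higher score
preds = {
    'safe':   {'probability': 0.91, 'predicted': True},
    'insult': {'probability': 0.62, 'predicted': True},
}
resolved = resolve_safe_conflict(preds)
```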

Performance

  • Individually optimized decision threshold for each label
  • Handles texts that trigger multiple labels at once
  • Safe-label exclusivity reduces false positives from conflicting safe/harmful predictions
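The per-label thresholds stored in optimal_thresholds.pkl were presumably tuned against held-out labels; a standard way to derive such values is a per-label F1 sweep over a threshold grid (illustrative, not necessarily the exact procedure used for this model):

```python
import numpy as np

def best_threshold(y_true, y_prob, grid=None):
    """Return the threshold in `grid` that maximizes F1 for one label."""
    if grid is None:
        grid = np.linspace(0.05, 0.95, 91)  # 0.05, 0.06, ..., 0.95
    best_t, best_f1 = 0.5, -1.0
    for t in grid:
        pred = (y_prob >= t).astype(int)
        tp = int(((pred == 1) & (y_true == 1)).sum())
        fp = int(((pred == 1) & (y_true == 0)).sum())
        fn = int(((pred == 0) & (y_true == 1)).sum())
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best_f1:
            best_t, best_f1 = float(t), f1
    return best_t, best_f1

# Toy example: one label's validation probabilities and gold targets
y_true = np.array([1, 1, 0, 0, 1, 0])
y_prob = np.array([0.8, 0.55, 0.4, 0.2, 0.6, 0.45])
t, f1 = best_threshold(y_true, y_prob)
```

Running the sweep independently per label is what yields the per-label threshold vector the usage code above indexes with optimal_thresholds[i].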

Training Details

  • Optimizer: AdamW
  • Learning Rate: 2e-5
  • Batch Size: 16
  • Epochs: 10
  • Loss: Binary Cross Entropy with Logits
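The optimizer and loss above combine in the usual multi-label setup; a minimal sketch with a toy linear head standing in for the full BanglaBERT-based model (feature dimension and batch contents are illustrative):

```python
import torch
import torch.nn as nn

# Toy linear head in place of the full architecture; 768 features, 7 labels
head = nn.Linear(768, 7)
optimizer = torch.optim.AdamW(head.parameters(), lr=2e-5)
criterion = nn.BCEWithLogitsLoss()  # independent sigmoid + BCE per label

features = torch.randn(16, 768)                 # stand-in for encoder outputs
targets = torch.randint(0, 2, (16, 7)).float()  # multi-hot label matrix

logits = head(features)
loss = criterion(logits, targets)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

BCEWithLogitsLoss treats each of the 7 labels as an independent binary decision, which is what allows a single text to be, say, both vulgar and insulting.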

Files

  • full_model.pt: Complete model with custom architecture
  • model.safetensors: Model weights (safetensors format)
  • config.json: Model configuration
  • tokenizer.json, vocab.txt: Tokenizer files
  • metadata.pkl: Label names and metadata
  • optimal_thresholds.pkl: Per-label threshold values

Citation

If you use this model, please cite:

@misc{banglabert-hate-speech-v3,
  author = {Tamim},
  title = {BanglaBERT Hate Speech Detection - Production V3},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/tamim65/banglabert-hate-speech-prod-v3}}
}

License

MIT License

Model size: 0.1B parameters (F32, safetensors format)