---
language:
- en
- hi
- ta
license: apache-2.0
tags:
- toxicity-detection
- multilingual
- indicbert
- text-classification
- content-moderation
datasets:
- custom
metrics:
- accuracy
- f1
widget:
- text: "You are kind and helpful"
- text: "Fuck you"
- text: "Bahut accha kiya yaar"
- text: "Bhenchod"
---

# IndicBERT Multilingual Toxicity Detector

A fine-tuned version of [ai4bharat/IndicBERTv2-MLM-only](https://huggingface.co/ai4bharat/IndicBERTv2-MLM-only) for toxicity detection in multilingual text (English, Hinglish, Hindi, and Tamil).

## Model Description

This model classifies text as either **toxic** or **non-toxic**. It was trained on a roughly balanced dataset, with class weights applied to offset the remaining class imbalance.

**Languages Supported:**
- English
- Hinglish (Hindi-English code-mixed)
- Hindi
- Tamil

## Training Details

- **Base Model:** ai4bharat/IndicBERTv2-MLM-only (278M parameters)
- **Training Data:** 569 samples (roughly balanced: 53% non-toxic, 47% toxic)
- **Training Split:** 80/20 train/validation
- **Epochs:** 3
- **Batch Size:** 16
- **Learning Rate:** 2e-5
- **Class Weighting:** Applied to offset the slight class imbalance

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained("indic-toxicity-detector")
tokenizer = AutoTokenizer.from_pretrained("indic-toxicity-detector")
model.eval()  # inference mode: disables dropout

# Predict
def predict_toxicity(text):
    """Return the predicted label and its softmax probability for one string."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    # No gradients are needed for inference
    with torch.no_grad():
        outputs = model(**inputs)
    probabilities = torch.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(probabilities, dim=-1).item()
    confidence = probabilities[0][predicted_class].item()
    label = model.config.id2label[predicted_class]
    return {"label": label, "confidence": confidence}

# Example
result = predict_toxicity("You are amazing!")
print(result)
# Example output (exact confidence will vary): {'label': 'non-toxic', 'confidence': 0.95}
```

## Performance

- **Validation Accuracy:** See `training_metrics.csv`
- **F1 Score:** See `training_metrics.csv`

## Limitations

- Trained on a small dataset (569 samples)
- May not generalize well to all types of toxic content
- Performance varies across languages
- Performance on code-mixed text depends on how well such text is represented in the training data

## Citation

```bibtex
@misc{indic-toxicity-detector,
  author = {Your Name},
  title = {IndicBERT Multilingual Toxicity Detector},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/indic-toxicity-detector}
}
```

## License

Apache 2.0
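
## Appendix: Class Weighting Sketch

The training details state that class weighting was applied but do not say how the weights were derived. The following is a minimal sketch, assuming the common inverse-frequency ("balanced") heuristic, where each class weight is N / (K * n_c) for N samples, K classes, and n_c samples in class c. The function name `balanced_class_weights`, the label mapping, and the per-class counts are hypothetical and are not taken from the actual training run.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: N / (K * n_c)} for an iterable of class labels.

    N is the number of samples, K the number of distinct classes, and
    n_c the number of samples in class c. Rarer classes get larger weights.
    """
    counts = Counter(labels)
    if not counts:
        raise ValueError("labels must not be empty")
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {c: n_samples / (n_classes * n) for c, n in counts.items()}


# Hypothetical label counts approximating a 53% / 47% split of 569 samples;
# the real per-class counts are not given in this card.
labels = [0] * 302 + [1] * 267  # 0 = non-toxic, 1 = toxic (assumed mapping)
weights = balanced_class_weights(labels)
print({c: round(w, 2) for c, w in sorted(weights.items())})
```

With these assumed counts the weights come out to about 0.94 for the majority class and 1.07 for the minority class, so the correction is mild. In a PyTorch training loop, weights like these would typically be passed as a tensor to the `weight` argument of `torch.nn.CrossEntropyLoss`.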