---
language:
- en
- hi
- ta
license: apache-2.0
tags:
- toxicity-detection
- multilingual
- indicbert
- text-classification
- content-moderation
datasets:
- custom
metrics:
- accuracy
- f1
widget:
- text: "You are kind and helpful"
- text: "Fuck you"
- text: "Bahut accha kiya yaar"
- text: "Bhenchod"
---

# IndicBERT Multilingual Toxicity Detector

A fine-tuned version of [ai4bharat/IndicBERTv2-MLM-only](https://huggingface.co/ai4bharat/IndicBERTv2-MLM-only) for toxicity detection in multilingual text (English, Hinglish, Hindi, and Tamil).

## Model Description

This model classifies text as either **toxic** or **non-toxic**. It was trained on a roughly balanced dataset, with class weights applied during training to compensate for the residual label imbalance.

**Languages Supported:**
- English
- Hinglish (Hindi-English code-mixed)
- Hindi
- Tamil

## Training Details

- **Base Model:** ai4bharat/IndicBERTv2-MLM-only (278M parameters)
- **Training Data:** 569 samples (roughly balanced: 53% non-toxic, 47% toxic)
- **Training Split:** 80/20 train/validation
- **Epochs:** 3
- **Batch Size:** 16
- **Learning Rate:** 2e-5
- **Class Weighting:** applied to compensate for the residual label imbalance (see the sketch below)
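
The exact training script is not published with this card, but class weighting with the `transformers` `Trainer` typically looks like the minimal sketch below. The weight values are illustrative inverse-frequency estimates derived from the 53/47 label split, not the values actually used for the released checkpoint.

```python
import torch
from torch import nn
from transformers import Trainer

# Illustrative inverse-frequency weights for a 53% / 47% label split;
# the weights used for the released checkpoint are not documented.
CLASS_WEIGHTS = torch.tensor([1 / (2 * 0.53), 1 / (2 * 0.47)])  # [non-toxic, toxic]

class WeightedLossTrainer(Trainer):
    """Trainer variant that applies class weights to the cross-entropy loss."""

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        loss_fct = nn.CrossEntropyLoss(weight=CLASS_WEIGHTS.to(outputs.logits.device))
        loss = loss_fct(outputs.logits.view(-1, 2), labels.view(-1))
        return (loss, outputs) if return_outputs else loss
```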

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained("indic-toxicity-detector")
tokenizer = AutoTokenizer.from_pretrained("indic-toxicity-detector")
model.eval()  # inference mode: disables dropout

# Predict
def predict_toxicity(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():  # no gradients needed for inference
        outputs = model(**inputs)
    probabilities = torch.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(probabilities, dim=-1).item()
    confidence = probabilities[0][predicted_class].item()
    
    label = model.config.id2label[predicted_class]
    return {"label": label, "confidence": confidence}

# Example
result = predict_toxicity("You are amazing!")
print(result)  # e.g. {'label': 'non-toxic', 'confidence': 0.95}
```
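
Alternatively, the `transformers` `pipeline` helper wraps tokenization, inference, and label mapping in a single call; a minimal sketch, assuming the same model path as above:

```python
from transformers import pipeline

# Build a text-classification pipeline around the fine-tuned checkpoint.
classifier = pipeline("text-classification", model="indic-toxicity-detector")

# Accepts a single string or a batch; labels come from the model config.
print(classifier(["You are kind and helpful", "Bahut accha kiya yaar"]))
```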

## Performance

- **Validation Accuracy:** see `training_metrics.csv` in this repository
- **F1 Score:** see `training_metrics.csv` in this repository
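
To inspect the logged numbers, the metrics file can be read directly; a minimal sketch, assuming `training_metrics.csv` sits alongside the model files (its column layout is not documented here):

```python
import pandas as pd

# Read the metrics log shipped with the repository;
# column names depend on how the metrics were exported.
metrics = pd.read_csv("training_metrics.csv")
print(metrics.tail())  # the last rows reflect the final training state
```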

## Limitations

- Trained on limited dataset (569 samples)
- May not generalize well to all types of toxic content
- Performance varies across languages
- Code-mixed text performance depends on training data representation

## Citation

```bibtex
@misc{indic-toxicity-detector,
  author = {Your Name},
  title = {IndicBERT Multilingual Toxicity Detector},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/indic-toxicity-detector}
}
```

## License

Apache 2.0