---
library_name: transformers
tags:
- toxicity
- hindi
license: cc
datasets:
- Polygl0t/hindi-toxicity-qwen-annotations
language:
- hi
metrics:
- precision
- recall
- accuracy
model-index:
- name: hindi-roberta-toxicity-classifier
  results: []
pipeline_tag: text-classification
base_model:
- l3cube-pune/hindi-roberta
---

# hindi-roberta Toxicity Classifier

hindi-roberta-toxicity-classifier is a [HindRoBERTa](https://huggingface.co/l3cube-pune/hindi-roberta)-based model for judging the toxicity level of a given Hindi text string. It was trained on the [Polygl0t/hindi-toxicity-qwen-annotations](https://huggingface.co/datasets/Polygl0t/hindi-toxicity-qwen-annotations) dataset.

## Details

For training, we added a classification head with a single regression output to [l3cube-pune/hindi-roberta](https://huggingface.co/l3cube-pune/hindi-roberta). Only the classification head was trained, i.e., the rest of the model was frozen.

- **Dataset:** [hindi-toxicity-qwen-annotations](https://huggingface.co/datasets/Polygl0t/hindi-toxicity-qwen-annotations)
- **Language:** Hindi
- **Number of Training Epochs:** 20
- **Batch size:** 256
- **Optimizer:** `torch.optim.AdamW`
- **Learning Rate:** 3e-4
- **Eval Metric:** `f1-score`

The [source code](https://github.com/Polygl0t/llm-foundry) used to train this model is available on GitHub.
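Because the head has a single regression output, the model returns one continuous score per input rather than per-class probabilities. The following is a minimal sketch of how such a score can be turned into a discrete 1 to 5 level, mirroring the clamp-and-round step in the usage snippet of this card. The helper name `to_toxicity_level` is hypothetical and not part of the released code, and the raw scores are made-up values used only for illustration.

```python
def to_toxicity_level(raw_score: float) -> int:
    """Map a raw regression score (nominally in [0, 4]) to an integer level in [1, 5].

    Hypothetical helper for illustration; it is not part of the released code.
    """
    # Regression outputs are unbounded, so clamp to the nominal range first.
    clamped = max(0.0, min(raw_score, 4.0))
    # Round to the nearest class index, then shift from 0-4 to the 1-5 scale.
    return int(round(clamped)) + 1


# Made-up raw scores, including out-of-range values on both sides.
for raw in (-0.3, 0.2, 1.4, 2.7, 9.0):
    print(raw, "->", to_toxicity_level(raw))
```

Clamping before rounding keeps out-of-range predictions inside the label set instead of yielding levels below 1 or above 5.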
### Evaluation Results

#### Confusion Matrix

|       | **1** | **2** | **3** | **4** | **5** |
|-------|-------|-------|-------|-------|-------|
| **1** | 11526 | 2601  | 134   | 7     | 0     |
| **2** | 722   | 1713  | 281   | 10    | 0     |
| **3** | 240   | 1092  | 590   | 7     | 2     |
| **4** | 21    | 242   | 308   | 104   | 13    |
| **5** | 5     | 46    | 78    | 68    | 123   |

- Precision: 0.58656
- Recall: 0.45341
- F1 Macro: 0.47433
- Accuracy: 0.7028

## Usage

Here's an example of how to use the Toxicity Classifier:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Run on the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = AutoTokenizer.from_pretrained("Polygl0t/hindi-roberta-toxicity-classifier")
model = AutoModelForSequenceClassification.from_pretrained("Polygl0t/hindi-roberta-toxicity-classifier")
model.to(device)

text = "यह एक उदाहरण है।"
encoded_input = tokenizer(text, return_tensors="pt", padding="longest", truncation=True).to(device)

with torch.no_grad():
    model_output = model(**encoded_input)
    # The head has a single regression output, so drop the trailing dimension.
    logits = model_output.logits.squeeze(-1).float().cpu().numpy()

# Scores are produced in the range [0, 4]. To convert to the range [1, 5], add 1 to the score.
score = [x + 1 for x in logits.tolist()][0]

print({
    "text": text,
    "score": score,
    # Clamp the raw score to [0, 4], round to the nearest integer, then shift to [1, 5].
    "int_score": [int(round(max(0, min(x, 4)))) + 1 for x in logits][0],
})
```

## Cite as 🤗

```latex
@misc{shiza2026lilmoo,
  title={{Raising Bars, Not Parameters: LilMoo Compact Language Model for Hindi}},
  author={Shiza Fatimah and Aniket Sen and Sophia Falk and Florian Mai and Lucie Flek and Nicholas Kluge Corr{\^e}a},
  year={2026},
  eprint={2603.03508},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2603.03508},
}
```

## Acknowledgments

Polyglot is a project funded by the Federal Ministry of Education and Research (BMBF) and the Ministry of Culture and Science of the State of North Rhine-Westphalia (MWK) as part of TRA Sustainable Futures (University of Bonn) and the Excellence Strategy of the federal and state governments.

We also gratefully acknowledge access to the [Marvin cluster](https://www.hpc.uni-bonn.de/en/systems/marvin) hosted by the [University of Bonn](https://www.uni-bonn.de/en), along with the support provided by its High Performance Computing & Analytics Lab.

## License

According to [l3cube-pune/hindi-roberta](https://huggingface.co/l3cube-pune/hindi-roberta), the model is released under [cc-by-4.0](https://spdx.org/licenses/CC-BY-4.0). For any queries, please get in touch with the authors of the original paper tied to [hindi-roberta](https://huggingface.co/l3cube-pune).