---
language: ru
library_name: transformers
pipeline_tag: text-classification
tags:
- toxicity
- safetensors
base_model:
- DeepPavlov/rubert-base-cased-conversational
---

A model for toxicity classification in Russian texts.
Fine-tuned from the [DeepPavlov/rubert-base-cased-conversational](https://huggingface.co/DeepPavlov/rubert-base-cased-conversational) model.

It's a binary classifier designed to detect toxicity in text.

* **Label 0 (NEUTRAL):** Neutral text
* **Label 1 (TOXIC):** Toxic text / Insults / Threats

**Dataset**

This model was trained on two datasets:

* [Toxic Russian Comments](https://www.kaggle.com/datasets/alexandersemiletov/toxic-russian-comments)
* [Russian Language Toxic Comments](https://www.kaggle.com/datasets/blackmoon/russian-language-toxic-comments)

**Usage**

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="fasherr/toxicity_rubert")

text_1 = "Ты сегодня прекрасно выглядишь!"  # "You look great today!"
text_2 = "Ты очень плохой человек"  # "You are a very bad person"

print(classifier(text_1))
# [{'label': 'NEUTRAL', 'score': 0.99...}]
print(classifier(text_2))
# [{'label': 'TOXIC', 'score': 1}]
```
**Eval results**

| | Accuracy | Precision | Recall | F1-Score | AUC-ROC | Support |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: |
| **Overall (Macro)** | 97.93% | 96.37% | 96.86% | 96.61% | 0.9962 | 26271 |
| **Neutral** | 97.93% | 98.88% | 98.57% | 98.72% | 0.9962 | 21347 |
| **Toxic** | 97.93% | 93.87% | 95.15% | 94.50% | 0.9962 | 4924 |
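For reference, the macro-averaged figures above are unweighted means of the per-class precision, recall, and F1. A minimal pure-Python sketch of that computation (the toy label lists are illustrative, not drawn from the evaluation set):

```python
def macro_metrics(y_true, y_pred, labels=(0, 1)):
    """Return (macro precision, macro recall, macro F1) for a binary task."""
    per_class = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        per_class.append((prec, rec, f1))
    # Macro average: unweighted mean over classes, regardless of support
    n = len(labels)
    return tuple(sum(m[i] for m in per_class) / n for i in range(3))

# Toy gold labels and predictions (0 = NEUTRAL, 1 = TOXIC)
metrics = macro_metrics([0, 0, 0, 1, 1, 0, 1, 0], [0, 0, 1, 1, 1, 0, 0, 0])
print(tuple(round(m, 4) for m in metrics))  # → (0.7333, 0.7333, 0.7333)
```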