Update README.md

README.md CHANGED

@@ -14,4 +14,40 @@ widget:
example_title: NEUTRAL - 100.00%
---

A model for toxicity classification in Russian texts.

Fine-tuned from the [DeepPavlov/rubert-base-cased-conversational](https://huggingface.co/DeepPavlov/rubert-base-cased-conversational) model.

It's a binary classifier designed to detect toxicity in text:

* **Label 0 (NEUTRAL):** Neutral text
* **Label 1 (TOXIC):** Toxic text / Insults / Threats
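As a rough sketch of how a two-class head like this is decoded into the labels above (the `ID2LABEL` mapping and the example logits below are assumptions for illustration, not values taken from the model's config):

```python
import math

# Hypothetical id-to-label mapping, following the two classes listed above.
ID2LABEL = {0: "NEUTRAL", 1: "TOXIC"}

def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode(logits):
    """Pick the higher-probability class and return (label, score)."""
    probs = softmax(logits)
    idx = probs.index(max(probs))
    return ID2LABEL[idx], probs[idx]

# Made-up logits where the second (toxic) class wins.
label, score = decode([-2.0, 3.5])
```

The `pipeline` shown in the Usage section performs this decoding internally.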
**Dataset**

This model was trained on two datasets:

* [Toxic Russian Comments](https://www.kaggle.com/datasets/alexandersemiletov/toxic-russian-comments)
* [Russian Language Toxic Comments](https://www.kaggle.com/datasets/blackmoon/russian-language-toxic-comments)
**Usage**

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="fasherr/toxicity_rubert")

text_1 = "Ты сегодня прекрасно выглядишь!"  # "You look great today!"
text_2 = "Ты очень плохой человек"          # "You are a very bad person"

print(classifier(text_1))
# [{'label': 'NEUTRAL', 'score': 0.99...}]
print(classifier(text_2))
# [{'label': 'TOXIC', 'score': 0.99...}]
```
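In practice you often score texts in batches and keep only confident toxic hits. A minimal post-processing sketch, assuming the same output shape as the `print` calls above (the example predictions and the 0.9 threshold are made up for illustration):

```python
# Hypothetical batch output in the shape the pipeline returns:
# one {'label', 'score'} dict per input text.
predictions = [
    {"label": "NEUTRAL", "score": 0.997},
    {"label": "TOXIC", "score": 0.982},
    {"label": "TOXIC", "score": 0.611},
]

def flag_toxic(preds, threshold=0.9):
    """Return indices of predictions labelled TOXIC with confidence >= threshold."""
    return [i for i, p in enumerate(preds)
            if p["label"] == "TOXIC" and p["score"] >= threshold]

flagged = flag_toxic(predictions)  # only the confidently toxic text is kept
```

The threshold trades precision for recall; a moderation pipeline might route low-confidence hits to human review instead of dropping them.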
**Eval results**

| Class | Accuracy | Precision | Recall | F1-Score | AUC-ROC | Support |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: |
| **Overall (Macro)** | 97.93% | 96.37% | 96.86% | 96.61% | 0.9962 | 26271 |
| **Neutral** | 97.93% | 98.88% | 98.57% | 98.72% | 0.9962 | 21347 |
| **Toxic** | 97.93% | 93.87% | 95.15% | 94.50% | 0.9962 | 4924 |

Accuracy and AUC-ROC are computed over the whole evaluation set, which is why they are identical across rows.
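The Overall (Macro) row follows from the per-class rows, since a macro average is the unweighted mean over classes. A quick check against the table (macro precision agrees only to within rounding of the two-decimal values shown):

```python
# Per-class metrics copied from the table above (two-decimal percentages).
neutral = {"precision": 98.88, "recall": 98.57, "f1": 98.72}
toxic   = {"precision": 93.87, "recall": 95.15, "f1": 94.50}

def macro(metric):
    """Macro average: the unweighted mean over the two classes."""
    return round((neutral[metric] + toxic[metric]) / 2, 2)

macro_f1 = macro("f1")          # 96.61, matching the table
macro_recall = macro("recall")  # 96.86, matching the table
```

Unlike a support-weighted average, the macro average gives the minority Toxic class (4924 of 26271 examples) equal weight, so it is the stricter summary for imbalanced data like this.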