---
language: ru
library_name: transformers
pipeline_tag: text-classification
tags:
- toxicity
- safetensors
base_model:
- DeepPavlov/rubert-base-cased-conversational
---
# toxicity_rubert

A binary classifier for detecting toxicity in Russian text, fine-tuned from DeepPavlov/rubert-base-cased-conversational.

- Label 0 (NEUTRAL): neutral text
- Label 1 (TOXIC): toxic text / insults / threats
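The label mapping above can be sketched as a small decoding step over the model's two output logits. This is a hypothetical helper, assuming index 0 corresponds to NEUTRAL and index 1 to TOXIC, as listed in this card:

```python
import math

# Assumed label order (index 0 = NEUTRAL, index 1 = TOXIC), per the card above.
id2label = {0: "NEUTRAL", 1: "TOXIC"}

def decode(logits):
    """Turn a pair of raw logits into a pipeline-style prediction dict."""
    exps = [math.exp(x) for x in logits]
    probs = [e / sum(exps) for e in exps]          # softmax over the two classes
    idx = max(range(len(probs)), key=probs.__getitem__)
    return {"label": id2label[idx], "score": probs[idx]}

print(decode([2.0, -1.0]))  # highest logit at index 0 -> NEUTRAL
```

The `transformers` pipeline performs this softmax-and-argmax step internally; the sketch only makes the label convention explicit.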
## Dataset

This model was trained on two datasets:

- Russian Language Toxic Comments
## Usage

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="fasherr/toxicity_rubert")

text_1 = "Ты сегодня прекрасно выглядишь!"  # "You look great today!"
text_2 = "Ты очень плохой человек"          # "You are a very bad person"

print(classifier(text_1))
# [{'label': 'NEUTRAL', 'score': 0.99...}]
print(classifier(text_2))
# [{'label': 'TOXIC', 'score': 1}]
```
## Eval results

| | Accuracy | Precision | Recall | F1-Score | AUC-ROC | Support |
|---|---|---|---|---|---|---|
| Overall (Macro) | 97.93% | 96.37% | 96.86% | 96.61% | 0.9962 | 26271 |
| Neutral | 97.93% | 98.88% | 98.57% | 98.72% | 0.9962 | 21347 |
| Toxic | 97.93% | 93.87% | 95.15% | 94.50% | 0.9962 | 4924 |
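For reference, the macro scores in the table are the unweighted mean of the per-class precision, recall, and F1. A minimal sketch on hypothetical toy labels (not the actual evaluation set of 26,271 examples):

```python
# Toy ground truth and predictions, purely for illustration.
y_true = [0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 0, 1, 0, 1, 1, 0, 0]

def prf(cls):
    """Per-class precision, recall, and F1 for class `cls`."""
    tp = sum(t == p == cls for t, p in zip(y_true, y_pred))
    fp = sum(p == cls != t for t, p in zip(y_true, y_pred))
    fn = sum(t == cls != p for t, p in zip(y_true, y_pred))
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)
    return prec, rec, 2 * prec * rec / (prec + rec)

# Macro F1: average the per-class F1 scores with equal weight.
macro_f1 = sum(prf(c)[2] for c in (0, 1)) / 2
```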