Turkish Toxicity Classification: BERTurk (5 classes)

Labels: DİĞER (other), KÜFÜR (profanity), HAKARET (insult), IRKÇI (racist), CİNSİYETÇİ (sexist)
Data: Overfit-GM/turkish-toxic-language (77,800 examples)
Base model: dbmdz/bert-base-turkish-cased
Training: 3 epochs, bs=16/32, lr=3e-5, class imbalance handled with class_weight
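
The card states only that class_weight was used for balancing; it does not say how the weights were computed or applied. The following is a minimal sketch of one common approach, assuming the "balanced" heuristic (weight_c = N / (K * count_c)) and a weighted cross-entropy loss. The helper names and the toy label distribution are hypothetical and not taken from the training code.

```python
import math
from collections import Counter


def balanced_class_weights(labels, num_classes):
    """Assumed heuristic: weight_c = N / (K * count_c); rarer classes get larger weights."""
    counts = Counter(labels)
    n = len(labels)
    return [
        n / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]


def weighted_cross_entropy(logits, target, weights):
    """Cross-entropy for one example, scaled by the weight of its true class."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return weights[target] * (log_sum_exp - logits[target])


# Hypothetical toy distribution: class 0 is frequent, classes 1 and 2 are rare.
toy_labels = [0] * 6 + [1] * 2 + [2] * 2
weights = balanced_class_weights(toy_labels, num_classes=3)

# The same uninformative prediction costs more on a rare class than on a frequent one.
loss_frequent = weighted_cross_entropy([0.0, 0.0, 0.0], 0, weights)
loss_rare = weighted_cross_entropy([0.0, 0.0, 0.0], 1, weights)
```

With weights like these, mistakes on under-represented classes contribute more to the loss, which is the usual motivation for class weighting on imbalanced data.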

Evaluation (Validation)

  • Accuracy: 0.9563
  • Macro-F1: 0.9185
  • Loss: 0.4408
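
Macro-F1 averages the per-class F1 scores with equal weight, so on imbalanced data it can differ from accuracy. The sketch below shows how the two metrics are defined, using a hypothetical two-class toy example; it is not the evaluation script behind the numbers above.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1; a class with no support and no predictions scores 0."""
    f1_scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / num_classes


# Hypothetical toy data: one of the two minority-class examples is misclassified.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1]

acc = accuracy(y_true, y_pred)                 # 5 of 6 correct
mf1 = macro_f1(y_true, y_pred, num_classes=2)  # mean of 8/9 and 2/3
```

In this toy case a single error on the minority class lowers macro-F1 more than it lowers accuracy, because each class counts equally in the average.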

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification, TextClassificationPipeline

# Load the tokenizer and the fine-tuned 5-class model from the Hugging Face Hub.
model_id = "celalkartoglu/tr-toxic-bert-multiclass-v1"
tok = AutoTokenizer.from_pretrained(model_id)
mdl = AutoModelForSequenceClassification.from_pretrained(model_id)
# top_k=1 keeps only the highest-scoring label for each input.
pipe = TextClassificationPipeline(model=mdl, tokenizer=tok, top_k=1)
pipe("Defol git buradan.")  # "Get out of here." -> [{'label': 'HAKARET', 'score': ...}]