license: mit
language:
  - de
metrics:
  - accuracy
  - f1
  - roc_auc
base_model:
  - german-nlp-group/electra-base-german-uncased
pipeline_tag: text-classification

Horbee/electra-german-offensive-comment-classifier aka SauerELECTRA

This is an ELECTRA model fine-tuned to detect offensive German comments. It was trained on 13,000 examples from GermEval 2018 and 2019. Because the class distribution in that data is unbalanced, class weights were used during training to counter the resulting bias. The model achieves the following metrics:

Accuracy: 0.836
F1: 0.785
Precision: 0.782
Recall: 0.788
AUC: 0.908
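
As a quick consistency check on the figures above, F1 is the harmonic mean of precision and recall, so it can be recomputed from the two reported values. The sketch below does only that; the variable names are illustrative and not taken from the training code.

```python
# Reported precision and recall from the metrics list above.
precision = 0.782
recall = 0.788

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(round(f1, 3))  # 0.785
```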

SauerELECTRA is designed to detect offensive language and rude comments in German text, making it suitable for moderation systems, research, and content analysis pipelines.
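
The training description mentions class weights but does not say how they were computed. One common heuristic, shown here purely as an assumption, is inverse-frequency ("balanced") weighting: each class gets the weight n_samples / (n_classes * class_count). The helper name `balanced_class_weights` and the toy labels are hypothetical and do not reflect the real GermEval distribution.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {label: n_samples / (n_classes * count)} for every label seen."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Toy labels for illustration only; NOT the real GermEval class distribution.
toy_labels = ["OTHER"] * 8 + ["OFFENSE"] * 4

weights = balanced_class_weights(toy_labels)
print(weights)  # {'OTHER': 0.75, 'OFFENSE': 1.5}
```

Weights of this kind are typically passed to a weighted loss, for example the `weight` argument of `torch.nn.CrossEntropyLoss`, so that errors on the rarer class cost more during training.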

Intended Use:

Detection of offensive or inappropriate German-language comments

Social media moderation tools

Example Use:

from transformers import pipeline

classifier = pipeline("text-classification",
                      model="Horbee/electra-german-offensive-comment-classifier")

sequence_to_classify = "Ich kann es nicht ausstehen, mit so einem Idioten im selben Raum zu sein."

result = classifier(sequence_to_classify)

print(result) # [{'label': 'Offensive', 'score': 0.9302035570144653}]
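
In a moderation setting, the raw pipeline output usually has to be turned into a yes/no decision. The following is a minimal sketch of that step, assuming the output format shown above (a list of dicts with `label` and `score`). The helper `flag_offensive` and the 0.8 threshold are hypothetical choices, not part of the model or the library.

```python
def flag_offensive(results, threshold=0.8):
    """Return one boolean per result: True if labeled 'Offensive' with score >= threshold."""
    return [r["label"] == "Offensive" and r["score"] >= threshold for r in results]


# The first entry mirrors the example output above; the second is a made-up
# placeholder to show a prediction that falls below the threshold.
sample_results = [
    {"label": "Offensive", "score": 0.9302035570144653},
    {"label": "Offensive", "score": 0.55},  # hypothetical low-confidence prediction
]

print(flag_offensive(sample_results))  # [True, False]
```

Since the pipeline also accepts a list of strings and returns one result per input, the helper can be applied directly to a batch, e.g. `flag_offensive(classifier(list_of_comments))`. The threshold should be tuned on your own data, trading precision against recall.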

Limitations:

Trained only on GermEval 2018/2019 data, so performance on out-of-domain or highly informal texts may vary.

May not capture all forms of subtle toxicity or sarcasm.

Designed for German-language content; not suitable for other languages.

Author comments

Thank you for using my model. Let me know if it helped you out; I would appreciate any constructive feedback.