Jensvollends/hatebert-finetuned_v5


Jensvollends committed on Jun 17, 2025

Commit d369ee2 · verified · 1 Parent(s): 88d0a2f

Create README.md

Files changed (1)

README.md +55 -0

README.md ADDED

	@@ -0,0 +1,55 @@

+---
+license: cc-by-sa-4.0
+tags:
+  - hate-speech
+  - toxic-comments
+  - classification
+  - hatebert
+  - jigsaw
+  - fine-tuned
+base_model: GroNLP/hateBERT
+datasets:
+  - jigsaw-toxic-comment-classification-challenge
+metrics:
+  - accuracy
+  - f1
+---
+# HateBERT Fine-Tuned on Jigsaw Toxic Comments (v5)
+This model is a fine-tuned version of [GroNLP/hateBERT](https://huggingface.co/GroNLP/hateBERT) on a binary version of the [Jigsaw Toxic Comment Classification Challenge](https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge) dataset.
+It classifies a comment as toxic (`1`) or non-toxic (`0`). Training used class-weighted Focal Loss, and evaluation relied on metrics suited to imbalanced classification (F1, confusion matrix, precision-recall curves).
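For readers unfamiliar with class-weighted Focal Loss, the following is a minimal sketch of the standard formulation for a single binary example. It is not the training code of this model: the function name `focal_loss`, the `gamma` default, and the class weight values are assumptions chosen for illustration, since the card does not state them.

```python
import math


def focal_loss(p_toxic, label, class_weights=(1.0, 1.0), gamma=2.0):
    """Class-weighted focal loss for one binary example.

    p_toxic:       predicted probability of class 1 (toxic)
    label:         true class, 0 or 1
    class_weights: (weight for class 0, weight for class 1); illustrative values
    gamma:         focusing parameter; gamma=0 gives weighted cross-entropy
    """
    # Probability the model assigned to the true class.
    p_t = p_toxic if label == 1 else 1.0 - p_toxic
    # Clamp to avoid log(0) on extreme predictions.
    p_t = min(max(p_t, 1e-12), 1.0)
    weight = class_weights[label]
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct toxic example contributes far less than a misclassified one.
easy = focal_loss(0.95, 1)
hard = focal_loss(0.30, 1)

# Up-weighting the rarer toxic class (weight 5.0 is a made-up value) scales its loss.
weighted = focal_loss(0.30, 1, class_weights=(1.0, 5.0))

print(easy, hard, weighted)
```

The modulating factor keeps the many easy non-toxic comments from dominating the gradient, while the class weight compensates for the smaller number of toxic comments.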
+## 💻 Training Setup
+- **Base Model:** GroNLP/hateBERT
+- **Dataset:** Jigsaw Toxic Comment Classification Challenge
+- **Binary Labeling:** A comment is marked as *toxic* if any of the following labels is `1`: `toxic`, `severe_toxic`, `obscene`, `threat`, `insult`, `identity_hate`
+- **Tokenizer Max Length:** 256
+- **Loss Function:** Focal Loss with class weights
+- **Hardware:** NVIDIA H100 GPU (via SLURM on TU Berlin HPC)
+- **Training Time:** ~6 hours
+- **Final F1 Score (Validation):** `0.850`
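The binary labeling rule above can be sketched in a few lines of plain Python. This is an illustration of the rule only, not the preprocessing script used for this model; the helper name `to_binary_label` and the sample rows are hypothetical, while the six label names are the ones listed above.

```python
# The six Jigsaw label columns named in the training setup.
LABEL_COLUMNS = [
    "toxic",
    "severe_toxic",
    "obscene",
    "threat",
    "insult",
    "identity_hate",
]


def to_binary_label(row):
    """Return 1 if any of the six labels is 1, otherwise 0."""
    return int(any(int(row[col]) == 1 for col in LABEL_COLUMNS))


# Hypothetical rows, shaped like records read from the dataset's CSV file.
rows = [
    {"toxic": 0, "severe_toxic": 0, "obscene": 0, "threat": 0, "insult": 0, "identity_hate": 0},
    {"toxic": 1, "severe_toxic": 0, "obscene": 1, "threat": 0, "insult": 0, "identity_hate": 0},
    {"toxic": 0, "severe_toxic": 0, "obscene": 0, "threat": 0, "insult": 1, "identity_hate": 0},
]

binary_labels = [to_binary_label(row) for row in rows]
print(binary_labels)
```

Note that the third row is marked toxic even though its `toxic` column is `0`, because `insult` is set.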
+## 📊 Evaluation Metrics
+| Metric   | Value  |
+|----------|--------|
+| F1 Score | 0.850  |
+| Accuracy | ~0.84  |
+| Confusion Matrix & PR Curves | Saved and visualized during training |
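To make the relationship between the two reported metrics concrete, here is a small sketch that derives accuracy and F1 from predictions. The function `binary_metrics` and the toy label lists are assumptions for illustration; they are not this model's evaluation code or its data.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for the positive class (1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy imbalanced data: nine non-toxic comments and one toxic comment.
y_true = [0] * 9 + [1]

# A classifier that always predicts "non-toxic" looks accurate but has zero F1.
always_negative = binary_metrics(y_true, [0] * 10)

# A small mixed example with one missed toxic comment.
mixed = binary_metrics([1, 1, 0, 0], [1, 0, 0, 0])

print(always_negative)
print(mixed)
```

On imbalanced data, accuracy rewards predicting the majority class, which is why F1 is the more informative of the two values in the table.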
+## 🧪 How to Use
+```python
+from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline
+
+# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
+model = AutoModelForSequenceClassification.from_pretrained("Jensvollends/hatebert-finetuned_v5")
+tokenizer = AutoTokenizer.from_pretrained("Jensvollends/hatebert-finetuned_v5")
+
+# top_k=None returns the scores for all labels instead of only the top one.
+pipe = pipeline("text-classification", model=model, tokenizer=tokenizer, top_k=None)
+
+text = "You are a kind person"
+result = pipe(text)
+print(result)