@AuricErgeson on Hugging Face

I trained a hate speech detector that catches coded language

Most existing models miss stuff like "they control the media" or
"heil hitler" because they were trained on explicit slurs only.

I fused 4 datasets (Davidson, ImplicitHate, HateXplain, HateDay 2025)
+ targeted augmentation for neo-Nazi codes, antisemitic dog whistles,
and white nationalist phrases.

Results on 11K held-out examples:
- neither: F1 0.884
- offensive: F1 0.870
- hate_speech: F1 0.697 ← hardest class, still beats most baselines
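
For a single summary number, the three per-class scores can be averaged into a macro-F1. This is a quick sketch of that arithmetic only: the macro figure is derived here and was not part of the reported results, and a support-weighted average can't be computed because the per-class counts aren't given above.

```python
# Per-class F1 scores exactly as listed above.
per_class_f1 = {"neither": 0.884, "offensive": 0.870, "hate_speech": 0.697}

def macro_f1(scores):
    """Unweighted mean of per-class F1, so every class counts equally."""
    return sum(scores.values()) / len(scores)

summary = round(macro_f1(per_class_f1), 3)
print(summary)  # 0.817
```

Macro averaging gives hate_speech the same weight as the two easier classes, which is why the summary sits well below the neither and offensive scores.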

Model: AuricErgeson/hate-speech-detector
Try it: AuricErgeson/hate-speech-detector
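
A minimal sketch of calling the model from Python, assuming the repo loads with the standard transformers text-classification pipeline. The label string "hate_speech" is assumed to match the class names in the results above (the model config may expose different names, such as LABEL_0), and `flag_hate` and its 0.5 threshold are hypothetical helpers for illustration.

```python
MODEL_ID = "AuricErgeson/hate-speech-detector"  # repo name from the post

def load_detector(model_id=MODEL_ID):
    # Imported lazily so flag_hate() below works without transformers installed.
    from transformers import pipeline
    return pipeline("text-classification", model=model_id)

def flag_hate(predictions, threshold=0.5, hate_label="hate_speech"):
    """Mark each top-1 prediction as hate speech if label and score qualify.

    `predictions` is a list of {"label": str, "score": float} dicts, the
    shape the pipeline returns for a list of input strings.
    The label name and threshold are assumptions, not documented values.
    """
    return [
        p["label"] == hate_label and p["score"] >= threshold
        for p in predictions
    ]

# Usage (downloads the model on first call):
# detector = load_detector()
# preds = detector(["they control the media", "nice weather today"])
# print(flag_hate(preds))
```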

One thing it still misses: bare "1488" as a standalone token.
If you've solved this, open an issue. I'm curious.
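
One possible stopgap, sketched here as an assumption and not something the model does: a small regex check that runs next to the classifier and sends bare "1488" to review. `has_bare_1488` is a hypothetical helper. It will also fire on harmless uses (a year, a price, the tail of a phone number), so it is better suited to flagging for a human than to auto-labeling.

```python
import re

# Matches "1488" only when it is not glued to other letters or digits,
# so "14880" and "a1488b" are ignored.
STANDALONE_1488 = re.compile(r"(?<![A-Za-z0-9])1488(?![A-Za-z0-9])")

def has_bare_1488(text):
    """Return True if the text contains 1488 as a standalone token."""
    return STANDALONE_1488.search(text) is not None

print(has_bare_1488("1488"))         # True
print(has_bare_1488("order 14880"))  # False
```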

#NLP #HateSpeechDetection #ContentModeration #TextClassification
