# Slop-Detector-v1

This is a fine-tuned version of answerdotai/ModernBERT-large designed to classify text based on its "slop" level—the density of AI-typical clichés, "purple prose," and repetitive vocabulary.

It was trained for 3 epochs on the DrRiceIO7/SlopReview v1 dataset, which contains 4,579 examples of human and AI-generated text scored by a weighted slop dictionary.
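The dataset's scoring idea can be illustrated with a toy version: a weighted dictionary of AI-typical words, summed over tokens and normalized by length. The words and weights below are illustrative placeholders, not the actual SlopReview dictionary or its weighting scheme.

```python
import re

# Illustrative weights only -- the real SlopReview dictionary and its
# weighting scheme are not reproduced here.
TOY_SLOP_WEIGHTS = {
    "tapestry": 3.0,
    "shimmering": 2.0,
    "testament": 2.0,
    "elara": 4.0,   # AI-typical character name
    "delve": 1.5,
}

def toy_slop_score(text: str) -> float:
    """Sum dictionary weights over tokens, normalized per 100 words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    total = sum(TOY_SLOP_WEIGHTS.get(w, 0.0) for w in words)
    return 100.0 * total / len(words)

print(toy_slop_score("Elara gazed at the shimmering tapestry of stars."))  # → 112.5
```

A score like this can then be bucketed into the five labels below; the actual thresholds used by SlopReview are not documented here.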

Stay tuned for a better model soon.

## Classification Labels

The model classifies text into five categories based on the original dataset's "Slop Score" logic:

| Label | Description |
|---|---|
| Virtually Human | Natural phrasing, avoids AI clichés, uses human markers. |
| Clean | High-quality writing with minimal AI-typical vocabulary. |
| Noticeable Slop | Contains several AI "fingerprints" or repetitive conceptual words. |
| Egregious Slop | Heavy reliance on AI-typical names (e.g., Elara, Elias) and clichés. |
| Absolute Slop Overload | Extremely dense with AI markers; stereotypical "bot" output. |

## How to Use

You can use this model with the Hugging Face transformers library:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "DrRiceIO7/Slop-Detector-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "The lighthouse at the edge of the world did not guide ships. Its beam did not cut through fog to warn sailors of jagged rocks or hidden shoals. Instead, the Spire of Oakhaven..."

# Truncate to the 512-token length the model was trained on (see Training Details)
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    logits = model(**inputs).logits

# Pick the highest-scoring class and map it back to its label name
predicted_class_id = logits.argmax(dim=-1).item()
print(f"Result: {model.config.id2label[predicted_class_id]}")
```
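The argmax step returns only the top label; to inspect the model's confidence across all five classes, the logits can be pushed through a softmax. A framework-free sketch of that step (the logit values are made up for illustration):

```python
import math

def softmax(logits: list[float]) -> list[float]:
    """Convert raw logits to a probability distribution (numerically stable form)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Made-up logits for the five classes, in id order.
probs = softmax([-1.2, 0.3, 2.1, 4.0, 0.5])
assert abs(sum(probs) - 1.0) < 1e-9
print(max(probs))  # probability mass on the top class
```

With the PyTorch code above, the equivalent one-liner is `torch.softmax(logits, dim=-1)`.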

## Training Details

  • Base Model: ModernBERT-large (28 layers, 1024 hidden size)
  • Dataset: DrRiceIO7/SlopReview (4.5k rows)
  • Epochs: 3
  • Precision: bfloat16
  • Max Sequence Length: 512 tokens
  • Hardware: Trained on Intel Arc B580 (Battlemage) using PyTorch XPU and SDPA.

## Limitations & Bias

This model is a "v1" and its definitions of "slop" are based on the specific weighted dictionary provided in the SlopReview dataset. It may flag high-quality creative writing as "slop" if it uses common literary tropes that LLMs also happen to favor (e.g., "shimmering," "echoed"). Conversely, very simple AI output without "purple prose" might be flagged as "Virtually Human."
