Multi-Source Spoiler Detector

This repository contains the trained classifier for a three-level movie-review spoiler detector.

Task

The model predicts one of three labels:

  • Safe: no meaningful spoiler detected
  • Mild: broad setup, tone, or non-critical plot information
  • Major: key twist, death, identity, ending, solution, or final outcome revealed

Model

  • Classifier: SVM with RBF kernel (sklearn.svm.SVC)
  • Embeddings: sentence-transformers/all-mpnet-base-v2
  • Input: English movie-review text
  • Output: Major, Mild, or Safe

The serialized model is stored in best_model.joblib. The file holds a payload with two entries: model, the trained classifier, and metadata, which records the embedding model name (embedding_model) and the label classes (label_classes).
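Because the payload bundles the classifier with its metadata, it can help to check the layout before using it. The following is a minimal sketch, assuming the key names that appear in the Usage snippet (model, metadata, embedding_model, label_classes); the helper name check_payload is hypothetical and is not part of this repository.

```python
# Hypothetical helper: verifies the payload layout described above.
# Key names are taken from the Usage snippet; nothing else is assumed.
REQUIRED_METADATA_KEYS = ("embedding_model", "label_classes")


def check_payload(payload):
    """Return (model, metadata), or raise ValueError if the layout is unexpected."""
    if not isinstance(payload, dict):
        raise ValueError("expected the payload to be a dict")
    for key in ("model", "metadata"):
        if key not in payload:
            raise ValueError(f"payload is missing {key!r}")
    metadata = payload["metadata"]
    missing = [k for k in REQUIRED_METADATA_KEYS if k not in metadata]
    if missing:
        raise ValueError(f"metadata is missing {missing}")
    return payload["model"], metadata
```

After joblib.load, a call such as model, metadata = check_payload(payload) fails early with a readable message if the file does not match this layout.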

Test Results

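| Model               | Accuracy | Macro F1 | Weighted F1 |
|---------------------|----------|----------|-------------|
| SVM RBF             | 0.5753   | 0.5723   | 0.5752      |
| Logistic Regression | 0.5669   | 0.5706   | 0.5661      |
| MLP                 | 0.5690   | 0.5640   | 0.5670      |
| Random Forest       | 0.5314   | 0.4166   | 0.4434      |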

Best test model: SVM RBF, which scores highest on all three metrics.
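The three metrics can diverge when per-class performance is uneven. Macro F1 averages the per-class F1 scores equally, while weighted F1 weights each class by its number of true examples. The sketch below computes all three from a confusion matrix in plain Python. The counts are invented for illustration and are not this project's results; they show a classifier that never predicts the third class, which is one possible pattern (not confirmed by the table) that produces a macro F1 well below accuracy.

```python
def f1_summary(confusion):
    """Return (accuracy, macro_f1, weighted_f1).

    confusion[i][j] is the number of examples with true class i
    that were predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    f1_scores, supports = [], []
    for k in range(n_classes):
        tp = confusion[k][k]
        support = sum(confusion[k])                               # tp + fn
        predicted = sum(confusion[i][k] for i in range(n_classes))  # tp + fp
        denom = support + predicted
        # F1 = 2*tp / (2*tp + fp + fn); defined as 0 when the class never appears
        f1_scores.append(2 * tp / denom if denom else 0.0)
        supports.append(support)
    accuracy = sum(confusion[k][k] for k in range(n_classes)) / total
    macro_f1 = sum(f1_scores) / n_classes
    weighted_f1 = sum(f * s for f, s in zip(f1_scores, supports)) / total
    return accuracy, macro_f1, weighted_f1


# Invented counts (rows: true class, columns: predicted class).
toy = [
    [80, 20, 0],
    [25, 75, 0],
    [10, 30, 0],  # the third class is never predicted
]
accuracy, macro_f1, weighted_f1 = f1_summary(toy)
print(f"accuracy={accuracy:.4f} macro_f1={macro_f1:.4f} weighted_f1={weighted_f1:.4f}")
```

In this toy case the zero F1 for the third class pulls macro F1 down more than weighted F1, because that class has fewer true examples.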

Usage

import joblib
from sentence_transformers import SentenceTransformer

# Load the serialized payload: the trained classifier plus its metadata
payload = joblib.load("best_model.joblib")
model = payload["model"]
metadata = payload["metadata"]
classes = metadata["label_classes"]

# Use the embedding model recorded in the metadata
embedder = SentenceTransformer(metadata["embedding_model"])
text = "The final scene reveals that the detective was the killer all along."
# Encode as a one-item batch of L2-normalized embeddings
X = embedder.encode([text], convert_to_numpy=True, normalize_embeddings=True)
# Map the predicted class index back to its label name
label_id = int(model.predict(X)[0])
print(classes[label_id])
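To classify several reviews at once, the same steps can be wrapped in a small function. This is a sketch rather than part of the repository: predict_labels is a hypothetical name, and it takes the model, embedder, and classes objects from the snippet above as arguments. It also accepts string predictions, in case the classifier was trained on label names instead of encoded indices, which this README does not specify.

```python
def predict_labels(texts, model, embedder, classes):
    """Embed all texts in one batch and return one label name per text."""
    X = embedder.encode(list(texts), convert_to_numpy=True, normalize_embeddings=True)
    names = []
    for pred in model.predict(X):
        if isinstance(pred, str):
            # Prediction is already a label name
            names.append(pred)
        else:
            # Prediction is an encoded class index
            names.append(classes[int(pred)])
    return names
```

With the objects loaded above, predict_labels(["Great pacing and a strong cast.", text], model, embedder, classes) returns a list with one label per review.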

Data

The training data was built from IMDb reviews and GPT-generated synthetic review snippets. GPT was also used to assign Mild/Major severity labels for IMDb spoiler reviews. A manual quality check of 100 sampled Mild/Major labels found 93% exact agreement.
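With only 100 sampled labels, the agreement rate carries some sampling uncertainty. As a rough illustration, the sketch below computes the agreement rate and a 95% Wilson score interval. It assumes the figure means 93 of the 100 sampled labels matched and that the sample was drawn at random; neither detail is stated beyond the sentence above, and the function name is hypothetical.

```python
import math


def agreement_with_interval(n_agree, n_total, z=1.96):
    """Return (rate, low, high) using the Wilson score interval."""
    p = n_agree / n_total
    denom = 1 + z * z / n_total
    center = (p + z * z / (2 * n_total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n_total + z * z / (4 * n_total ** 2)) / denom
    return p, center - half_width, center + half_width


# Assumption: 93 matching labels out of the 100 that were checked.
rate, low, high = agreement_with_interval(93, 100)
print(f"agreement={rate:.2f}, 95% interval=({low:.3f}, {high:.3f})")
```

Under these assumptions the interval runs from roughly 86% to 97%, so the true label agreement could be somewhat lower or higher than the sampled figure.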

Limitations

Spoiler severity is subjective, especially between Mild and Major. Synthetic examples can also differ stylistically from real user reviews, so results should be interpreted as a course-project prototype rather than a production moderation system.
