---
library_name: sklearn
tags:
- text-classification
- sentence-transformers
- spoiler-detection
- sklearn
- movie-reviews
license: mit
---

# Multi-Source Spoiler Detector

This repository contains the trained classifier for a three-level movie-review spoiler detector.

## Task

The model predicts one of three labels:

- `Safe`: no meaningful spoiler detected
- `Mild`: broad setup, tone, or non-critical plot information
- `Major`: key twist, death, identity, ending, solution, or final outcome revealed

## Model

- Classifier: SVM with RBF kernel (`sklearn.svm.SVC`)
- Embeddings: `sentence-transformers/all-mpnet-base-v2`
- Input: English movie-review text
- Output: `Major`, `Mild`, or `Safe`

The serialized model is stored in `best_model.joblib`. The file holds a dictionary with the trained classifier under `model` and, under `metadata`, the embedding model name (`embedding_model`) and the label classes (`label_classes`).
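
To make the payload layout concrete, here is a minimal sketch of how a file with this structure could be produced. It is not the project's training script: random vectors stand in for real sentence embeddings, the alphabetical label encoding via `LabelEncoder` is an assumption, and `example_model.joblib` is a hypothetical file name.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import SVC

# Stand-in data: random vectors replace real sentence embeddings so the
# sketch runs without downloading a model. 768 is the output dimension of
# all-mpnet-base-v2.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(30, 768))
y_text = ["Safe", "Mild", "Major"] * 10

# Assumption: labels are integer-encoded in alphabetical order.
encoder = LabelEncoder()
y_train = encoder.fit_transform(y_text)

clf = SVC(kernel="rbf")
clf.fit(X_train, y_train)

# Same keys that the usage example reads back from the payload.
payload = {
    "model": clf,
    "metadata": {
        "embedding_model": "sentence-transformers/all-mpnet-base-v2",
        "label_classes": encoder.classes_.tolist(),
    },
}

# Round-trip through joblib to confirm the structure survives serialization.
path = os.path.join(tempfile.mkdtemp(), "example_model.joblib")
joblib.dump(payload, path)
restored = joblib.load(path)
print(restored["metadata"]["label_classes"])  # ['Major', 'Mild', 'Safe']
```

Storing the embedding model name next to the classifier lets a loader rebuild the matching encoder without hard-coding it.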

## Test Results

| Model | Accuracy | Macro F1 | Weighted F1 |
|---|---:|---:|---:|
| SVM RBF | 0.5753 | 0.5723 | 0.5752 |
| Logistic Regression | 0.5669 | 0.5706 | 0.5661 |
| MLP | 0.5690 | 0.5640 | 0.5670 |
| Random Forest | 0.5314 | 0.4166 | 0.4434 |

Best test model: **SVM RBF**.
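
The three metrics in the table can be computed with `sklearn.metrics`. The sketch below uses a small made-up label set, not the project's test split, to show how macro and weighted F1 diverge when classes are imbalanced.

```python
from sklearn.metrics import accuracy_score, f1_score

# Toy labels for illustration only; these are not the project's test data.
y_true = ["Safe", "Safe", "Safe", "Safe", "Mild", "Mild", "Major", "Major"]
y_pred = ["Safe", "Safe", "Safe", "Mild", "Mild", "Safe", "Major", "Mild"]

accuracy = accuracy_score(y_true, y_pred)

# Macro F1 averages per-class F1 with equal weight per class.
macro_f1 = f1_score(y_true, y_pred, average="macro")

# Weighted F1 weights each class's F1 by its number of true samples.
weighted_f1 = f1_score(y_true, y_pred, average="weighted")

print(f"accuracy={accuracy:.4f} macro_f1={macro_f1:.4f} weighted_f1={weighted_f1:.4f}")
```

A large gap between macro and weighted F1, as in the Random Forest row, usually means at least one class is predicted much worse than the others.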

## Usage

```python
import joblib
from sentence_transformers import SentenceTransformer

# Only load joblib/pickle files from sources you trust.
payload = joblib.load("best_model.joblib")
model = payload["model"]
metadata = payload["metadata"]
classes = metadata["label_classes"]

# Use the same embedding model the classifier was trained on.
embedder = SentenceTransformer(metadata["embedding_model"])

text = "The final scene reveals that the detective was the killer all along."
X = embedder.encode([text], convert_to_numpy=True, normalize_embeddings=True)

# The classifier returns a label index; map it back to its class name.
label_id = int(model.predict(X)[0])
print(classes[label_id])
```
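
For scoring several reviews at once, the same steps can be wrapped in a helper. This is a sketch under assumptions: `predict_labels` and `download_payload_path` are hypothetical names, and the download helper assumes `best_model.joblib` sits at the root of the `leoole/spoiler-detector` repository.

```python
def download_payload_path(repo_id="leoole/spoiler-detector",
                          filename="best_model.joblib"):
    """Fetch the serialized payload from the Hub and return its local path.

    Assumption: the file is stored at the repository root under this name.
    """
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename)


def predict_labels(texts, model, classes, embedder):
    """Embed a batch of review texts and return one class name per text."""
    X = embedder.encode(list(texts), convert_to_numpy=True,
                        normalize_embeddings=True)
    # As in the usage example, predictions are indices into `classes`.
    return [classes[int(i)] for i in model.predict(X)]
```

With the objects from the usage example, `predict_labels(["review one", "review two"], model, classes, embedder)` returns a list of two label names.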

## Data

The training data was built from IMDb reviews and GPT-generated synthetic review snippets. GPT was also used to assign Mild/Major severity labels for IMDb spoiler reviews. A manual quality check of 100 sampled Mild/Major labels found 93% exact agreement.
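
Because the quality check covers only 100 labels, the 93% figure carries sampling uncertainty. The sketch below computes a 95% Wilson score interval for 93 agreements out of 100. The choice of interval is an illustration and was not part of the original check.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# 93 of the 100 sampled labels matched exactly.
low, high = wilson_interval(93, 100)
print(f"agreement = {93 / 100:.2f}, 95% interval = ({low:.3f}, {high:.3f})")
```

The lower bound falls in the mid-to-high 0.80s, so somewhat more label noise than 7% in the full Mild/Major set would still be consistent with this sample.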

## Limitations

Spoiler severity is subjective, especially between Mild and Major. Synthetic examples can also differ stylistically from real user reviews. With test accuracy around 0.58, the model should be treated as a course-project prototype rather than a production moderation system.