DISCO: Detection of Implicit Suggestive Content Overlays
Kids are increasingly exposed to suggestive content on social media, and the big platforms don't do enough about it, e.g. on Roblox and YouTube Kids.
To combat the sexualisation of content targeted at kids on social media, I trained a model to detect suggestive content. This could be extended to flag content that is unsuitable for kids during preprocessing, or possibly even for real-time on-device inference in the future.
This was just a small weekend project; if it turns out to be useful, I might extend it in the future.
Baseline comparison
I wanted to see how much more accurate my model is than a popular NSFW detection model, so I compared it against Falconsai/nsfw_image_detection.
To my surprise, the NSFW model performed very poorly at detecting suggestive content. I understand it wasn't built for this purpose, but I expected it to at least do better than random.
See baseline.py and eda.ipynb for the implementation.
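As a sanity check on "better than random": for a binary detector, random scoring corresponds to a ROC-AUC of about 0.5. A minimal rank-based ROC-AUC (equivalent to the Mann-Whitney U statistic) can be computed without any dependencies; the labels and scores below are purely illustrative, not from the actual evaluation:

```python
def roc_auc(labels, scores):
    """Rank-based ROC-AUC: the probability that a randomly chosen
    positive is scored higher than a randomly chosen negative
    (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative toy scores: every positive outranks every negative.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.2, 0.1, 0.7, 0.3]
print(roc_auc(labels, scores))  # 1.0: perfect ranking
```

A model whose ROC-AUC on held-out data hovers around 0.5 is doing no better than chance, which is the bar the baseline failed to clear here.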
Dataset
To train the model, I used the iMaterialist dataset from Kaggle. I took a small subset of images and labelled them myself using Label Studio.
Unfortunately, the dataset's terms state that I may not redistribute it or use it for commercial purposes (which I didn't plan to do anyway).
This means the model is not fully reproducible in its current state unless you accept the dataset's terms and download it yourself. If the model proves useful, I will look for a different approach to gathering data and training.
Methodology
I labelled a total of 455 images, starting with three categories: FAMILY_SAFE, UNCERTAIN and SUGGESTIVE.
For training, however, I combined FAMILY_SAFE and UNCERTAIN into a single category called FAMILY_SAFE/UNCERTAIN. This is a point of improvement for the future.
The default split is as follows: 70% for training, 15% for validation and 15% for testing.
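The 70/15/15 split can be sketched as follows; this is a minimal version with a fixed seed for illustration, not necessarily the exact logic the training code uses:

```python
import random

def split_dataset(items, seed=42, train_frac=0.70, val_frac=0.15):
    """Shuffle and split into train/val/test; the remainder after
    the train and validation slices goes to the test set."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train_frac)
    n_val = int(len(items) * val_frac)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

# With the 455 labelled images this gives roughly 318/68/69.
train_set, val_set, test_set = split_dataset(range(455))
print(len(train_set), len(val_set), len(test_set))  # 318 68 69
```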
Model
The model itself simply combines a CLIP image encoder with a linear classifier.
In the forward pass, I extract the image features with the CLIP model and pass them through the linear classifier.
Then, to predict the probability of the SUGGESTIVE class, I apply softmax.
proba = torch.softmax(logits, dim=-1)[0, 1].item()
I then predict the class by thresholding this probability.
pred = (proba >= threshold).long()
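Putting the two steps together, the decision rule is just a softmax over two logits followed by a threshold. A dependency-free sketch of that rule (the real model uses torch.softmax as shown above):

```python
import math

def predict_suggestive(logits, threshold=0.5):
    """logits: [family_safe_logit, suggestive_logit] for one image.
    Returns the softmax probability of SUGGESTIVE and the decision."""
    exps = [math.exp(x) for x in logits]
    proba = exps[1] / sum(exps)        # softmax probability of class 1
    return proba, proba >= threshold

# Equal logits give probability 0.5, which meets the default threshold.
print(predict_suggestive([0.0, 0.0]))  # (0.5, True)
```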
How to Use
Installation
uv add transformers torch pillow
# or, if you're a dinosaur and like slow code: pip install transformers torch pillow
Quick Start
Since this is a custom model, you need to import the model class. Here's the easiest way:
import torch
from transformers import CLIPProcessor
from huggingface_hub import hf_hub_download
import json
import importlib.util
from PIL import Image
import requests
from io import BytesIO
# Download and import the model class
modeling_file = hf_hub_download(repo_id="younissk/DISCO-v0.1", filename="modeling_disco.py")
spec = importlib.util.spec_from_file_location("modeling_disco", modeling_file)
modeling_disco = importlib.util.module_from_spec(spec)
spec.loader.exec_module(modeling_disco)
DISCO = modeling_disco.DISCO
# Load model and processor
model = DISCO.from_pretrained("younissk/DISCO-v0.1")
processor = CLIPProcessor.from_pretrained("younissk/DISCO-v0.1")
# Load threshold from metadata
metadata_path = hf_hub_download(repo_id="younissk/DISCO-v0.1", filename="model_metadata.json")
with open(metadata_path, "r") as f:
    metadata = json.load(f)
threshold = metadata["threshold"]
# Load image (from URL or local file)
image_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/parrots.png"
image = Image.open(BytesIO(requests.get(image_url).content))
# Or use local file: image = Image.open("path/to/image.jpg")
# Preprocess and predict
inputs = processor(images=image, return_tensors="pt")
model.eval()
with torch.no_grad():
    logits = model(**inputs)
proba = torch.softmax(logits, dim=-1)
suggestive_prob = proba[0, 1].item()
is_suggestive = suggestive_prob >= threshold
print(f"Suggestive content probability: {suggestive_prob:.3f}")
print(f"Is suggestive (threshold={threshold:.3f}): {is_suggestive}")
Alternative: Direct Import (if you have the repo locally)
from PIL import Image
from src.inference import run_DISCO
image = Image.open("path/to/image.jpg")
proba = run_DISCO(image)
print(f"Suggestive content probability: {proba:.3f}")
Note: The standard pipeline API won't work with this custom model. You need to use the DISCO class directly as shown above.
Training
I trained the model on Apple's MPS backend rather than CPU or CUDA.
It was trained for 10 epochs with a batch size of 32, a learning rate of 1e-3, a weight decay of 1e-4 and "balanced" class weights. It is trained on two classes instead of three and initialised with a threshold of 0.5 (meaning an image with a probability of 0.5 or higher is predicted as SUGGESTIVE).
During training, I evaluate the model on the validation set and save it whenever it beats the previous best F1 score.
After training, I tune the threshold on the validation set and keep the threshold with the best F1 score.
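One common definition of "balanced" class weights, matching scikit-learn's compute_class_weight("balanced") heuristic, weights each class inversely to its frequency: n_samples / (n_classes * class_count). Whether the training code uses exactly this formula is an assumption; the counts below are illustrative, not the actual label distribution:

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count),
    so rarer classes contribute more to the loss."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

# Illustrative counts: 300 FAMILY_SAFE/UNCERTAIN (0) vs 155 SUGGESTIVE (1)
weights = balanced_class_weights([0] * 300 + [1] * 155)
print(weights)  # the minority class gets the larger weight
```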
best_threshold, threshold_metrics = tune_threshold(
val_labels_np, val_scores, metric="f1")
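The tune_threshold step above can be sketched as a simple grid sweep: compute F1 at each candidate threshold on the validation scores and keep the best. This is a minimal illustration with toy data; the actual implementation may differ:

```python
def f1_at_threshold(labels, scores, threshold):
    """Binary F1 when scores are thresholded at the given value."""
    preds = [int(s >= threshold) for s in scores]
    tp = sum(p and y for p, y in zip(preds, labels))
    fp = sum(p and not y for p, y in zip(preds, labels))
    fn = sum((not p) and y for p, y in zip(preds, labels))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def tune_threshold(labels, scores, grid=None):
    """Sweep a grid of thresholds and return the F1-maximising one."""
    grid = grid or [i / 100 for i in range(1, 100)]
    best = max(grid, key=lambda t: f1_at_threshold(labels, scores, t))
    return best, f1_at_threshold(labels, scores, best)

# Toy validation data: max() keeps the first (smallest) optimal threshold.
labels = [1, 1, 0, 0, 1]
scores = [0.9, 0.6, 0.4, 0.2, 0.55]
t, f1 = tune_threshold(labels, scores)
print(t, f1)
```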
Results
Known problems and future improvements
I am aware that my labelling is imperfect and subjective, so the biggest improvement would be a more objective labelling process (adding more annotators and more images to the dataset).
Another big problem is licensing and copyright. My goal is for the model to better detect what is actually shown to kids, such as characters like Elsa and Spider-Man appearing in suggestive content.
Beyond that, my goal is to make the model run fast in the browser, so it could eventually power a Chrome extension that blocks content in real time.
Model tree for younissk/DISCO-v0.1
Base model: openai/clip-vit-base-patch32

Evaluation results (self-reported, on the custom-labelled iMaterialist fashion dataset):
- Test ROC-AUC: 0.916
- Test PR-AUC: 0.921
- F1 score: 0.897
- Precision: 0.833
- Recall: 0.972


