SterlingAI-Video-Prompt-Filter

Model Card

This is an old, outdated, simple filter. I'm working on a new one, and I have no clue why 16 people downloaded this.

This model categorizes text into two classes: "safe" or "nsfw" (not safe for work). It was built for my AI video generator, Sterling AI. If you want to use it for other things, it may be suitable for content moderation and filtering applications.

The model was trained using a dataset containing 190,000 labeled text samples, distributed among the two classes of "safe" and "nsfw".

It was then fine-tuned to block the content I block on Sterling AI.

The model is based on the DistilBERT-base model.

In terms of performance, the model achieved an F1 score of 0.974 (on 40K examples).

To improve the performance of the model, it is necessary to preprocess the input text. You can refer to the preprocess function in the app.py file in the following space: https://huggingface.co/spaces/eliasalbouzidi/distilbert-nsfw-text-classifier.

Model Description

The model can be used directly to classify text into one of the two classes. It takes in a string of text as input and outputs a probability distribution over the two classes. The class with the highest probability is selected as the predicted class.
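As a rough illustration of the step described above, the sketch below turns two raw scores (logits) into a probability distribution with a softmax and picks the most likely class. The label order `("safe", "nsfw")` and the example logits are assumptions made for illustration; the real mapping is whatever the model's configuration defines.

```python
import math

# Assumed label order; the actual mapping lives in the model's config (id2label).
LABELS = ("safe", "nsfw")

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(logits, labels=LABELS):
    """Return (label, probability) for the highest-probability class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Hypothetical logits for a single prompt.
label, prob = predict([-1.2, 2.3])
print(label, round(prob, 3))
```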

Original model developed by: Elias Al Bouzidi, Massine El Khader, Abdellah Oumida, Mohammed Sbaihi, Eliott Binard
Model type: 60M
Language (NLP): English
License: apache-2.0

Technical Paper:

A more detailed technical overview of the model and the dataset can be found in the DiffGuard paper: https://arxiv.org/abs/2412.00064

Uses

The model can be integrated into larger systems for content moderation or filtering.
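One way such an integration might look is a small gate placed in front of the generator. This is only a sketch under stated assumptions: `allow_prompt`, `nsfw_threshold`, and `fake_classifier` are hypothetical names, the 0.5 threshold is arbitrary, and the classifier is assumed to return output shaped like a transformers text-classification pipeline (a list holding one `{"label": ..., "score": ...}` dict per input).

```python
from typing import Callable, Dict, List

# Pipeline-style output: a list with one {"label": ..., "score": ...} dict per input.
Classifier = Callable[[str], List[Dict[str, object]]]

def allow_prompt(prompt: str, classify: Classifier, nsfw_threshold: float = 0.5) -> bool:
    """Return True if the prompt may be passed on to the generator."""
    result = classify(prompt)[0]
    # Block only when the top label is "nsfw" and its score clears the threshold.
    is_nsfw = result["label"] == "nsfw" and float(result["score"]) >= nsfw_threshold
    return not is_nsfw

def fake_classifier(text: str) -> List[Dict[str, object]]:
    """Stand-in for the real model so the sketch runs offline."""
    flagged = "explicit" in text.lower()
    return [{"label": "nsfw" if flagged else "safe", "score": 0.99}]

print(allow_prompt("a cat playing piano", fake_classifier))
print(allow_prompt("an explicit scene", fake_classifier))
```

In a real integration, `classify` would be the `pipe` object from the "Use a pipeline" section rather than the stand-in. Raising `nsfw_threshold` lets more borderline prompts through; lowering it blocks more aggressively.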

Training Data

The training data for finetuning the text classification model consists of a large corpus of text labeled with one of the two classes: "safe" and "nsfw". The dataset contains a total of 190,000 examples, which are distributed as follows:

117,000 examples labeled as "safe"

63,000 examples labeled as "nsfw"

The dataset was assembled by scraping data from the web and utilizing existing open-source datasets. A significant portion of it consists of descriptions of images and scenes. The primary objective was to prevent diffusers from generating NSFW content, but it can be used for other moderation purposes.

You can access the dataset at: https://huggingface.co/datasets/eliasalbouzidi/NSFW-Safe-Dataset

Training Info

I did not want to share any of the training I did to modify this model. It's a mess that I do not want to explain or put here.

Out-of-Scope Use

It should not be used for any illegal activities.

Bias, Risks, and Limitations

The model may exhibit biases based on the training data used. It may not perform well on text that is written in languages other than English. It may also struggle with sarcasm, irony, or other forms of figurative language. The model may produce false positives or false negatives, which could lead to incorrect categorization of text. This model is relatively small, so expect some content to slip through the cracks (no pun intended), and rewording a prompt may possibly bypass it.

Recommendations

Users should be aware of the limitations and biases of the model and use it accordingly. They should also be prepared to handle false positives and false negatives. It is recommended to fine-tune the model for specific downstream tasks and to evaluate its performance on relevant datasets.
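To make the evaluation advice concrete, the following sketch computes precision, recall, and F1 for the "nsfw" class from a handful of labeled examples. The function name `binary_f1` and the sample labels are hypothetical, and this binary F1 is not necessarily the same averaging used for the 0.974 figure reported on this card.

```python
def binary_f1(y_true, y_pred, positive="nsfw"):
    """Return (precision, recall, f1) treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # correctly blocked
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical ground truth and model predictions for five prompts.
y_true = ["nsfw", "nsfw", "safe", "safe", "nsfw"]
y_pred = ["nsfw", "safe", "safe", "nsfw", "nsfw"]
precision, recall, f1 = binary_f1(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Tracking precision and recall separately shows whether the filter mostly over-blocks (false positives) or lets content through (false negatives) on your own data.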

Load model directly

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# The tokenizer is loaded from the original classifier repository.
tokenizer = AutoTokenizer.from_pretrained("eliasalbouzidi/distilbert-nsfw-text-classifier")

# The fine-tuned weights are loaded from this repository.
model = AutoModelForSequenceClassification.from_pretrained("Sterling-Ai/SterlingAI-Video-Prompt-Filter")

Use a pipeline

from transformers import pipeline

pipe = pipeline("text-classification", model="Sterling-Ai/SterlingAI-Video-Prompt-Filter")

Citation

If you find our work useful, please consider citing us!

@misc{khader2025diffguardtextbasedsafetychecker,
      title={DiffGuard: Text-Based Safety Checker for Diffusion Models}, 
      author={Massine El Khader and Elias Al Bouzidi and Abdellah Oumida and Mohammed Sbaihi and Eliott Binard and Jean-Philippe Poli and Wassila Ouerdane and Boussad Addad and Katarzyna Kapusta},
      year={2025},
      eprint={2412.00064},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2412.00064}, 
}

Contact

Please reach out to eliasalbouzidi@gmail.com if you have any questions or feedback.

Downloads last month: 12

Safetensors

Model size: 67M params

Tensor type: F32

Model tree for Sterling-Ai/SterlingAI-Video-Prompt-Filter

Base model: distilbert/distilbert-base-uncased

Paper for Sterling-Ai/SterlingAI-Video-Prompt-Filter

DiffGuard: Text-Based Safety Checker for Diffusion Models (arXiv 2412.00064, published Nov 25, 2024)

Evaluation results

F1 on NSFW-Safe-Dataset (self-reported): 0.974
Accuracy on NSFW-Safe-Dataset (self-reported): 0.980