---
license: cc-by-nc-4.0
language:
  - en
base_model:
  - facebook/metaclip-2-worldwide-s16
pipeline_tag: image-classification
library_name: transformers
tags:
  - text-generation-inference
  - Content-filtering
model-index:
  - name: Nsfw_Image_Detection_OSS
    results:
      - task:
          type: image-classification
        dataset:
          type: evaluation
          name: NSFW Image Detection Benchmark
        metrics:
          - type: accuracy
            value: 0.8918
            name: Accuracy
          - type: f1
            value: 0.9071
            name: F1 (NSFW)
          - type: precision
            value: 0.9047
            name: Precision (NSFW)
          - type: recall
            value: 0.9094
            name: Recall (NSFW)
          - type: f1
            value: 0.8705
            name: F1 (SFW)
          - type: precision
            value: 0.8736
            name: Precision (SFW)
          - type: recall
            value: 0.8673
            name: Recall (SFW)
          - type: f1_macro
            value: 0.8888
            name: Macro F1
          - type: f1_weighted
            value: 0.8917
            name: Weighted F1

---

# Nsfw_Image_Detection_OSS

**Nsfw_Image_Detection_OSS** is an image-classification vision-language encoder model fine-tuned from facebook/metaclip-2-worldwide-s16 for binary NSFW detection. It classifies an image as Safe For Work (SFW) or Not Safe For Work (NSFW) using the `MetaClip2ForImageClassification` architecture.

Paper: [MetaCLIP 2: A Worldwide Scaling Recipe](https://huggingface.co/papers/2507.22062)

## Evaluation Report (Self-Reported)

Classification report:

```
              precision    recall  f1-score   support

         SFW     0.8736    0.8673    0.8705     11103
        NSFW     0.9047    0.9094    0.9071     15380

    accuracy                         0.8918     26483
   macro avg     0.8892    0.8884    0.8888     26483
weighted avg     0.8917    0.8918    0.8917     26483
```
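The aggregate rows follow directly from the per-class values; as a quick sanity check, pure arithmetic on the numbers reported above reproduces them:

```python
# Per-class F1 and support taken from the report above
f1 = {"SFW": 0.8705, "NSFW": 0.9071}
support = {"SFW": 11103, "NSFW": 15380}
total = sum(support.values())  # 26483

# Macro F1: unweighted mean over the two classes
macro_f1 = sum(f1.values()) / len(f1)

# Weighted F1: mean weighted by class support
weighted_f1 = sum(f1[c] * support[c] for c in f1) / total

print(round(macro_f1, 4))  # 0.8888
print(weighted_f1)         # close to the reported 0.8917 (inputs are rounded)
```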


## Label Mapping

The model categorizes images into two classes:

- **Class 0:** SFW
- **Class 1:** NSFW

```json
{
  "id2label": {
    "0": "SFW",
    "1": "NSFW"
  },
  "label2id": {
    "SFW": 0,
    "NSFW": 1
  }
}
```
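This mapping also ships in the model config, so predictions can be decoded without hard-coding labels. A minimal sketch of decoding argmax indices through `id2label` (the logits here are made-up illustration values, not real model outputs):

```python
# id2label mapping, as in the config above
id2label = {0: "SFW", 1: "NSFW"}

# Made-up logits for two images, shaped [batch, num_classes]
logits = [[2.3, -1.1], [-0.4, 1.9]]

# Equivalent to torch.argmax(logits, dim=-1) on real model outputs
preds = [max(range(len(row)), key=row.__getitem__) for row in logits]
labels = [id2label[i] for i in preds]
print(labels)  # ['SFW', 'NSFW']
```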

## Run with Transformers

```bash
pip install -q transformers torch pillow gradio
```

```python
import gradio as gr
import torch
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image

# Model name from the Hugging Face Hub
model_name = "prithivMLmods/Nsfw_Image_Detection_OSS"

# Load processor and model
processor = AutoImageProcessor.from_pretrained(model_name)
model = AutoModelForImageClassification.from_pretrained(model_name)
model.eval()

# Class labels
LABELS = {
    0: "SFW",
    1: "NSFW"
}

def nsfw_detection(image):
    """Predict whether an image is SFW or NSFW."""
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

    return {LABELS[i]: round(probs[i], 3) for i in range(len(probs))}

# Build the Gradio interface
iface = gr.Interface(
    fn=nsfw_detection,
    inputs=gr.Image(type="numpy", label="Upload Image"),
    outputs=gr.Label(label="NSFW Detection Probabilities"),
    title="NSFW Image Detection (MetaCLIP 2)",
    description="Upload an image to classify whether it is Safe For Work (SFW) or Not Safe For Work (NSFW).",
)

# Launch the app
if __name__ == "__main__":
    iface.launch()
```
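The script above returns raw probabilities; for moderation you usually want a single decision. A minimal thresholding helper over the `nsfw_detection` output (a sketch; the 0.5 cutoff is an assumption and should be tuned on your own data):

```python
def is_nsfw(predictions: dict, threshold: float = 0.5) -> bool:
    """Return True if the NSFW probability meets the threshold.

    `predictions` is the dict returned by `nsfw_detection` above,
    e.g. {"SFW": 0.12, "NSFW": 0.88}.
    """
    return predictions.get("NSFW", 0.0) >= threshold

print(is_nsfw({"SFW": 0.12, "NSFW": 0.88}))  # True
print(is_nsfw({"SFW": 0.97, "NSFW": 0.03}))  # False
```

Raising the threshold trades recall for precision: fewer safe images are flagged, but more borderline NSFW images slip through.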

## Intended Use

The Nsfw_Image_Detection_OSS model is designed to classify images into SFW or NSFW categories.

Potential use cases include:

- **Content Moderation:** automated filtering of unsafe or adult content.
- **Social Media Platforms:** preventing the upload of explicit media.
- **Enterprise Safety:** keeping shared environments workplace-appropriate.
- **Dataset Filtering:** cleaning large-scale image datasets before training.
- **Parental Control Systems:** blocking inappropriate visual material.
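For the dataset-filtering use case, a minimal sketch of splitting files by model output (the file names and scores here are hypothetical; in practice they would come from running `nsfw_detection` over your dataset):

```python
# Hypothetical per-file predictions, e.g. collected from nsfw_detection
predictions = {
    "img_001.jpg": {"SFW": 0.95, "NSFW": 0.05},
    "img_002.jpg": {"SFW": 0.10, "NSFW": 0.90},
    "img_003.jpg": {"SFW": 0.60, "NSFW": 0.40},
}

THRESHOLD = 0.5  # assumption: tune on a held-out labeled set

keep = [path for path, p in predictions.items() if p["NSFW"] < THRESHOLD]
drop = [path for path, p in predictions.items() if p["NSFW"] >= THRESHOLD]

print(keep)  # ['img_001.jpg', 'img_003.jpg']
print(drop)  # ['img_002.jpg']
```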