	---
	license: apache-2.0
	language:
	- en
	pipeline_tag: image-classification
	library_name: transformers
	tags:
	- text-generation-inference
	- siglip2
	- image-filter
	- safe-image-moderation
	- adult-content-filter
	- content-safety
	- anime-detection
	- ai-safety
	base_model:
	- prithivMLmods/Image-Guard-ckpt-3312
	---

	![1](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/6YHXksHJyzT64KbQH71Ga.png)

	# Image-Guard-2.0-Post0.1

	> Image-Guard-2.0-Post0.1 is a multiclass image safety classification model fine-tuned from google/siglip2-base-patch16-224.
	> It classifies images into multiple safety-related categories using the SiglipForImageClassification architecture.

	> [!note]
	> SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features
	> [https://arxiv.org/pdf/2502.14786](https://arxiv.org/pdf/2502.14786)


	```text
	Classification report:
	                     precision    recall  f1-score   support

	          Anime-SFW     0.8906    0.8766    0.8835      5600
	             Hentai     0.9081    0.8892    0.8986      4180
	         Normal-SFW     0.9010    0.8784    0.8896      5503
	        Pornography     0.9489    0.9448    0.9469      5600
	Enticing or Sensual     0.8900    0.9436    0.9160      5600

	           accuracy                         0.9076     26483
	          macro avg     0.9077    0.9065    0.9069     26483
	       weighted avg     0.9077    0.9076    0.9074     26483
	```
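
The aggregate rows can be cross-checked against the per-class rows. The sketch below is not part of the original evaluation script; it only recomputes F1 as the harmonic mean of precision and recall, plus the macro and support-weighted averages, from the rounded figures in the report, so small differences in the last digit are expected.

```python
# Per-class rows copied from the classification report:
# label -> (precision, recall, f1, support)
report = {
    "Anime-SFW": (0.8906, 0.8766, 0.8835, 5600),
    "Hentai": (0.9081, 0.8892, 0.8986, 4180),
    "Normal-SFW": (0.9010, 0.8784, 0.8896, 5503),
    "Pornography": (0.9489, 0.9448, 0.9469, 5600),
    "Enticing or Sensual": (0.8900, 0.9436, 0.9160, 5600),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


rows = list(report.values())
total_support = sum(row[3] for row in rows)

# Macro average: unweighted mean over the five classes.
macro_precision = sum(row[0] for row in rows) / len(rows)
macro_recall = sum(row[1] for row in rows) / len(rows)

# Weighted average: each class contributes in proportion to its support.
weighted_precision = sum(row[0] * row[3] for row in rows) / total_support

for label, (p, r, f1, _) in report.items():
    print(f"{label:<20} reported F1={f1:.4f}  recomputed F1={f1_score(p, r):.4f}")
print(f"total support={total_support}")
print(f"macro precision={macro_precision:.4f}  macro recall={macro_recall:.4f}")
print(f"weighted precision={weighted_precision:.4f}")
```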

	![2w](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/MdnxO4l5XgZx1tylvjEye.png)

	---

	## Label Space: 5 Classes

	| Class ID | Label               | Description                                                               |
	| -------- | ------------------- | ------------------------------------------------------------------------- |
	| 0        | Anime-SFW           | Safe-for-work anime-style images.                                         |
	| 1        | Hentai              | Explicit or adult anime content.                                          |
	| 2        | Normal-SFW          | Realistic or photographic images that are safe for work.                  |
	| 3        | Pornography         | Explicit adult content involving nudity or sexual acts.                   |
	| 4        | Enticing or Sensual | Suggestive imagery that is not explicit but intended to evoke sensuality. |

	---

	> This model is experimental and may not be suitable for production use.

	## Install Dependencies

	```bash
	pip install -q transformers torch pillow gradio
	```

	---

	## Inference Code

	```python
	import gradio as gr
	from transformers import AutoImageProcessor, SiglipForImageClassification
	from PIL import Image
	import torch

	# Load model and processor
	model_name = "prithivMLmods/Image-Guard-2.0-Post0.1"
	model = SiglipForImageClassification.from_pretrained(model_name)
	processor = AutoImageProcessor.from_pretrained(model_name)

	# Label mapping (class ID -> label, matching the label space table)
	id2label = {
	    "0": "Anime-SFW",
	    "1": "Hentai",
	    "2": "Normal-SFW",
	    "3": "Pornography",
	    "4": "Enticing or Sensual"
	}

	def classify_image_safety(image):
	    """Return a {label: probability} dict for a NumPy image array."""
	    image = Image.fromarray(image).convert("RGB")
	    inputs = processor(images=image, return_tensors="pt")

	    # Inference only, so gradients are not needed
	    with torch.no_grad():
	        outputs = model(**inputs)
	        logits = outputs.logits
	        # Softmax over the class dimension, then drop the batch dimension
	        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

	    prediction = {
	        id2label[str(i)]: round(probs[i], 3) for i in range(len(probs))
	    }

	    return prediction

	# Gradio Interface
	iface = gr.Interface(
	    fn=classify_image_safety,
	    inputs=gr.Image(type="numpy"),
	    outputs=gr.Label(num_top_classes=5, label="Image Safety Classification"),
	    title="Image-Guard-2.0-Post0.1",
	    description="Upload an image to classify it into one of five safety categories: Anime-SFW, Hentai, Normal-SFW, Pornography, or Enticing or Sensual."
	)

	if __name__ == "__main__":
	    iface.launch()
	```
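
To make the post-processing step in `classify_image_safety` concrete, here is a dependency-free sketch of what the softmax and the label mapping do. The logit values are made up for illustration and are not real model output; `softmax` and `logits_to_scores` are hypothetical helper names.

```python
import math

# Labels in class-ID order (0 to 4), as listed in the label space table.
LABELS = ["Anime-SFW", "Hentai", "Normal-SFW", "Pornography", "Enticing or Sensual"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_scores(logits):
    """Pair each probability with its label, rounded to three decimals."""
    return {label: round(p, 3) for label, p in zip(LABELS, softmax(logits))}


# Hypothetical logits, one per class ID.
example_logits = [0.2, -1.3, 3.1, -0.5, 1.0]
scores = logits_to_scores(example_logits)
top_label = max(scores, key=scores.get)
print(top_label, scores)
```

The largest logit always yields the largest probability, so the top label can be read from either; the softmax only matters when the scores are compared against a threshold or shown to a user.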

	---

	## Intended Use

	Image-Guard-2.0-Post0.1 is designed for:

	* Content Moderation – Automatically identify and filter sensitive or NSFW imagery.
	* Dataset Curation – Separate clean and explicit data for research and training.
	* Platform Safety – Support compliance for social, educational, and media-sharing platforms.
	* AI Model Input Filtering – Prevent unsafe data from entering multimodal or generative pipelines.
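
As one way to use the classifier for moderation or input filtering, the sketch below turns a score dictionary of the shape returned by `classify_image_safety` into an allow/block decision. The set of blocked labels, the `threshold` value, and the `is_allowed` helper are assumptions for illustration; the model card does not prescribe a policy.

```python
# Assumed policy: which labels count as unsafe is a deployment decision,
# not something this model card defines.
BLOCKED_LABELS = frozenset({"Hentai", "Pornography"})


def is_allowed(scores, blocked=BLOCKED_LABELS, threshold=0.5):
    """Allow an image when the combined probability of blocked labels is below threshold."""
    unsafe_mass = sum(p for label, p in scores.items() if label in blocked)
    return unsafe_mass < threshold


# Hand-written example scores, not real model output.
explicit_example = {
    "Anime-SFW": 0.02,
    "Hentai": 0.05,
    "Normal-SFW": 0.08,
    "Pornography": 0.70,
    "Enticing or Sensual": 0.15,
}
benign_example = {
    "Anime-SFW": 0.03,
    "Hentai": 0.01,
    "Normal-SFW": 0.90,
    "Pornography": 0.01,
    "Enticing or Sensual": 0.05,
}

print(is_allowed(explicit_example), is_allowed(benign_example))
```

Summing the blocked probabilities, rather than checking only the top label, catches images whose explicit probability is split across two classes. A stricter deployment might also add "Enticing or Sensual" to the blocked set or lower the threshold.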

	## Limitations

	* The model may occasionally misclassify borderline or artistically abstract images.
	* It does not perform face recognition or identify individuals.
	* Results depend on lighting, resolution, and visual context.
	* The model does not replace human moderation for sensitive environments.
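
Because borderline images can be misclassified, a deployment may want to send low-confidence predictions to a human reviewer instead of acting on them automatically. The following is a minimal sketch of that idea; `route_prediction` and the `min_confidence` value of 0.8 are hypothetical and should be tuned on your own data.

```python
def route_prediction(scores, min_confidence=0.8):
    """Return ("auto", label) for confident predictions, otherwise ("review", label)."""
    top_label = max(scores, key=scores.get)
    decision = "auto" if scores[top_label] >= min_confidence else "review"
    return decision, top_label


# Hand-written example scores, not real model output.
confident = {
    "Anime-SFW": 0.01,
    "Hentai": 0.01,
    "Normal-SFW": 0.95,
    "Pornography": 0.01,
    "Enticing or Sensual": 0.02,
}
borderline = {
    "Anime-SFW": 0.02,
    "Hentai": 0.03,
    "Normal-SFW": 0.40,
    "Pornography": 0.10,
    "Enticing or Sensual": 0.45,
}

print(route_prediction(confident))
print(route_prediction(borderline))
```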