ItsNotAI v2 - Dual-Head AI Image Detector

Detect AI-generated images | Identify the AI generator | Verify human-made artwork

A BEiT-Large Vision Transformer with a dual-head architecture: a binary head detects whether an image is AI-generated, and a multi-class head identifies the specific AI generator used.

Website: https://itsnotai.org

Note: This is one of the models used by ItsNotAI. For official verification at itsnotai.org, we use an ensemble of multiple models combined with human expert review to ensure maximum accuracy.

Model Versions

Version	Architecture	Best For
v2 (Latest)	Dual-head	Real/AI detection + Source ID
v1	Single-head	Source identification only

What's New in v2:

Dual-head architecture: dedicated binary classifier for Real/AI detection
FLUX image detection support
Improved Midjourney detection
Enhanced real photo recognition with class weights

Performance

Binary Classification (Real vs AI)

Metric	Value
Accuracy	95.07%
Precision	96.37%
Recall	96.39%
F1 Score	96.38%
AUC	0.990
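
As a quick consistency check, F1 is the harmonic mean of precision and recall, so the binary F1 score can be reproduced from the precision and recall figures in the table above. The function name is illustrative only.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Binary-head precision and recall as reported in the table above
binary_f1 = f1_from_precision_recall(0.9637, 0.9639)
print(f"{binary_f1:.2%}")  # prints 96.38%
```

This identity holds for metrics computed on a single positive class. Averaged multi-class figures need not satisfy it.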

Multi-class Classification (Source Identification)

Metric	Value
Accuracy	93.47%
Precision	93.39%
Recall	93.47%
F1 Score	93.40%
AUC	0.994

Quick Start

Installation

pip install transformers torch pillow huggingface_hub

Recommended Usage (Dual-Head)

import torch
import torch.nn as nn
from transformers import AutoModelForImageClassification, AutoImageProcessor
from huggingface_hub import hf_hub_download
from PIL import Image
import json

# Load model
model_id = "boluobobo/ItsNotAI-ai-detector-v2"
model = AutoModelForImageClassification.from_pretrained(model_id)
processor = AutoImageProcessor.from_pretrained(model_id)
model.eval()

# Load metadata and binary head
meta_path = hf_hub_download(repo_id=model_id, filename="source_meta.json")
with open(meta_path) as f:
    meta = json.load(f)

source_names = meta["source_names"]
source_is_real = meta["source_is_real"]
hidden_size = meta.get("hidden_size", model.config.hidden_size)

# Load binary classification head
binary_head_path = hf_hub_download(repo_id=model_id, filename="binary_head.pt")
binary_head = nn.Sequential(nn.Dropout(0.1), nn.Linear(hidden_size, 2))
binary_head.load_state_dict(torch.load(binary_head_path, map_location="cpu"))
binary_head.eval()

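def get_backbone_features(pixel_values):
    """Extract CLS token features from the backbone."""
    if hasattr(model, 'beit'):
        outputs = model.beit(pixel_values)
    elif hasattr(model, 'vit'):
        outputs = model.vit(pixel_values)
    else:
        # Fail loudly instead of hitting an unbound `outputs` below
        raise ValueError("Unsupported backbone: expected a BEiT or ViT model")
    return outputs.last_hidden_state[:, 0]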

def detect_image(image_path):
    image = Image.open(image_path).convert("RGB")
    inputs = processor(image, return_tensors="pt")

    with torch.no_grad():
        # Multi-class prediction (source identification)
        outputs = model(**inputs)
        probs = torch.softmax(outputs.logits, dim=-1)[0]

        # Binary prediction (Real vs AI)
        features = get_backbone_features(inputs["pixel_values"])
        binary_logits = binary_head(features)
        binary_probs = torch.softmax(binary_logits, dim=-1)[0]

        human_prob = binary_probs[0].item()  # index 0 = Real
        ai_prob = binary_probs[1].item()     # index 1 = AI

    # Get predicted source
    pred_idx = probs.argmax().item()
    predicted_source = source_names[pred_idx]

    # Get top 3 AI sources
    ai_sources = []
    for i, name in enumerate(source_names):
        if not source_is_real.get(name, False):
            ai_sources.append({"label": name, "score": round(probs[i].item(), 3)})
    ai_sources.sort(key=lambda x: x["score"], reverse=True)

    return {
        "ai_probability": round(ai_prob, 3),
        "human_probability": round(human_prob, 3),
        "predicted_source": predicted_source,
        "is_real": human_prob > ai_prob,
        "top3_sources": ai_sources[:3]
    }

# Example
result = detect_image("test.jpg")
print(f"AI Probability: {result['ai_probability']:.1%}")
print(f"Human Probability: {result['human_probability']:.1%}")
print(f"Verdict: {'Real' if result['is_real'] else 'AI Generated'}")
print(f"Predicted Source: {result['predicted_source']}")

Example Output:

{
  "ai_probability": 0.892,
  "human_probability": 0.108,
  "predicted_source": "midjourney",
  "is_real": false,
  "top3_sources": [
    {"label": "midjourney", "score": 0.756},
    {"label": "stable_diffusion", "score": 0.089},
    {"label": "flux", "score": 0.042}
  ]
}
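
The `is_real` flag above amounts to a 0.5 cut-off on `ai_probability`. Where a stricter or more cautious decision is wanted, the result can be post-processed. The following is a minimal sketch, not part of the released code: `verdict_from_result`, `ai_threshold`, and `margin` are hypothetical names, and the default values are placeholders rather than tuned settings.

```python
def verdict_from_result(result: dict, ai_threshold: float = 0.5, margin: float = 0.1) -> str:
    """Map a detect_image-style result to 'ai', 'real', or 'uncertain'.

    ai_threshold and margin are hypothetical defaults, not tuned values.
    """
    ai_prob = result["ai_probability"]
    if ai_prob >= ai_threshold + margin:
        return "ai"
    if ai_prob <= ai_threshold - margin:
        return "real"
    # Scores close to the threshold are left for manual review
    return "uncertain"

example = {"ai_probability": 0.892, "human_probability": 0.108}
print(verdict_from_result(example))  # prints ai
```

Widening `margin` sends more borderline images to manual review.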

Basic Usage (Single-Head Fallback)

from transformers import AutoModelForImageClassification, AutoImageProcessor
from PIL import Image
import torch

# Load model
model_id = "boluobobo/ItsNotAI-ai-detector-v2"
model = AutoModelForImageClassification.from_pretrained(model_id)
processor = AutoImageProcessor.from_pretrained(model_id)

# Real source labels
REAL_LABELS = {"afhq", "celebahq", "coco", "ffhq", "imagenet", "landscape", "lsun", "metfaces"}

# Load and predict
image = Image.open("your_image.jpg").convert("RGB")
inputs = processor(image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=-1)[0]

pred_idx = probs.argmax().item()
label = model.config.id2label[pred_idx]  # id2label keys are ints once the config is loaded
confidence = probs[pred_idx].item()
is_real = label in REAL_LABELS

print(f"Prediction: {label}")
print(f"Confidence: {confidence:.2%}")
print(f"Is Real: {is_real}")
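
The fallback above bases the Real/AI verdict on the top-1 label alone. An alternative sketch, which is an assumption and not the authors' documented method, sums the probability mass over all non-real labels. The distribution below is invented for illustration.

```python
REAL_LABELS = {"afhq", "celebahq", "coco", "ffhq", "imagenet", "landscape", "lsun", "metfaces"}

def ai_probability_from_sources(label_probs: dict) -> float:
    """Sum the probability mass assigned to every non-real source label."""
    return sum(p for label, p in label_probs.items() if label not in REAL_LABELS)

# Hypothetical distribution, invented for illustration
label_probs = {"coco": 0.40, "midjourney": 0.35, "flux": 0.25}
ai_prob = ai_probability_from_sources(label_probs)
top1 = max(label_probs, key=label_probs.get)
print(f"top-1: {top1}, AI mass: {ai_prob:.2f}")  # prints top-1: coco, AI mass: 0.60
```

Here the top-1 label is a real source even though most of the probability mass sits on AI generators, so the two rules disagree. With the model loaded, `label_probs` could be built as `{model.config.id2label[i]: p.item() for i, p in enumerate(probs)}`.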

Model Description

Architecture

Base Model: microsoft/beit-large-patch16-224 (BEiT-Large)
Parameters: ~304M
Input Size: 224x224 pixels
Mode: Dual-head (Multi-class + Binary classification)

Dual-Head Design

Input Image
     │
     ▼
┌─────────────┐
│  BEiT-Large │ (Backbone)
│   Encoder   │
└──────┬──────┘
       │
   CLS Token
       │
       ├────────────────┐
       ▼                ▼
┌─────────────┐  ┌─────────────┐
│ Multi-class │  │   Binary    │
│    Head     │  │    Head     │
│ (33 classes)│  │ (Real/AI)   │
└─────────────┘  └─────────────┘
       │                │
       ▼                ▼
 Source ID         Real vs AI
 (e.g., SD,        Probability
  Midjourney)

Why Dual-Head?

The dedicated binary head provides more accurate Real/AI classification than a verdict derived from the multi-class top-1 label
Multi-class head identifies the specific AI generator
Best of both worlds: accurate detection + source identification
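
The shared-feature idea can be illustrated with a toy forward pass in plain Python. This is a sketch under loud assumptions: the weights are random and untrained, `HIDDEN` is a toy size, and every helper name is hypothetical. It only shows that both heads read the same CLS feature vector and differ in output width.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear(features, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, features)) + b for row, b in zip(weights, bias)]

def make_head(rng, in_dim, out_dim):
    """Randomly initialised (untrained) head, for illustration only."""
    weights = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]
    return weights, [0.0] * out_dim

rng = random.Random(0)
HIDDEN = 8          # toy size; the real backbone feature is much wider
NUM_SOURCES = 33    # multi-class head width, as stated in the diagram

cls_feature = [rng.uniform(-1, 1) for _ in range(HIDDEN)]  # stand-in for the CLS token
source_head = make_head(rng, HIDDEN, NUM_SOURCES)
binary_head = make_head(rng, HIDDEN, 2)

# Both heads consume the same feature vector; only their output widths differ.
source_probs = softmax(linear(cls_feature, *source_head))
binary_probs = softmax(linear(cls_feature, *binary_head))
print(len(source_probs), len(binary_probs))  # prints 33 2
```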

Output Labels (33 Classes)

Real Sources (8):

afhq, celebahq, coco, ffhq, imagenet, landscape, lsun, metfaces

AI Sources (25):

Category	Models
Diffusion	stable_diffusion, latent_diffusion, ddpm, vq_diffusion, palette, diffusion_gan, denoising_diffusion_gan, flux
GAN	stylegan1, stylegan2, stylegan3, pro_gan, big_gan, cycle_gan, star_gan, gansformer, projected_gan
Commercial	midjourney, dalle, glide
Other	gau_gan, taming_transformer, generative_inpainting, lama, mat, cips, face_synthetics, sfhq
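
For downstream reporting, it can help to map a predicted label back to its category. The lookup below is a sketch built directly from the lists above. `SOURCE_CATEGORIES` and `describe_source` are hypothetical names and are not shipped with the model.

```python
REAL_SOURCES = ["afhq", "celebahq", "coco", "ffhq", "imagenet", "landscape", "lsun", "metfaces"]

SOURCE_CATEGORIES = {
    "Diffusion": ["stable_diffusion", "latent_diffusion", "ddpm", "vq_diffusion",
                  "palette", "diffusion_gan", "denoising_diffusion_gan", "flux"],
    "GAN": ["stylegan1", "stylegan2", "stylegan3", "pro_gan", "big_gan",
            "cycle_gan", "star_gan", "gansformer", "projected_gan"],
    "Commercial": ["midjourney", "dalle", "glide"],
    "Other": ["gau_gan", "taming_transformer", "generative_inpainting", "lama",
              "mat", "cips", "face_synthetics", "sfhq"],
}

# Invert the table so each label points at its category
LABEL_TO_CATEGORY = {label: cat for cat, labels in SOURCE_CATEGORIES.items() for label in labels}
LABEL_TO_CATEGORY.update({label: "Real" for label in REAL_SOURCES})

def describe_source(label: str) -> str:
    """Return the category for a label, or 'Unknown' for labels not listed above."""
    return LABEL_TO_CATEGORY.get(label, "Unknown")

print(describe_source("flux"))  # prints Diffusion
print(describe_source("coco"))  # prints Real
```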

About ItsNotAI

Most AI detectors focus on catching AI usage. ItsNotAI takes the opposite approach: helping artists prove their work is human-made.

Key Features

Verifiable Label: Beyond just a percentage score, we provide artists with a verifiable "Not AI" label that can be embedded in their work.
Industry-Focused: We specialize in digital painting, manga illustration, and texture design, developed in deep collaboration with 100+ professional artists.
Artist-First: Our industry endorsements and artist partnerships create a trust network that goes beyond pure technical metrics.

Use Cases

Artists & Creators: Prove your artwork is human-made, protect your reputation
Stock Photo Platforms: Filter AI-generated uploads, maintain content quality
Social Media Moderation: Detect AI-generated profile pictures and fake content
News & Media: Verify photo authenticity, combat misinformation
NFT Marketplaces: Ensure digital art authenticity
Academic Research: Study AI image generation patterns

Training Details

Base: Fine-tuned from ItsNotAI-ai-detector-v1
Dataset: ArtiFact + FLUX + Midjourney + real photos (50K+ images)
Architecture: Dual-head (multi-class + binary classification)
Epochs: 10
Batch Size: 64
Learning Rate: 5e-6
Optimizer: AdamW with cosine scheduler
Loss: Focal Loss with label smoothing (0.1)
Binary Class Weights: [1.5, 1.0] (boost real photo recognition)
Hardware: NVIDIA A100 GPU
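
To make the loss settings concrete, here is one possible formulation of focal loss with label smoothing and class weights for a single sample. It is a sketch, not the training code: the focusing parameter `gamma` is not stated above and is assumed to be 2.0, and the way smoothing and weights are combined is one of several reasonable choices.

```python
import math

def focal_loss(probs, target, gamma=2.0, smoothing=0.1, class_weights=None):
    """Focal loss for one sample with a label-smoothed target distribution.

    probs: softmax probabilities; target: index of the true class.
    gamma is an assumed value; the training details do not state it.
    """
    num_classes = len(probs)
    weights = class_weights or [1.0] * num_classes
    loss = 0.0
    for k, p in enumerate(probs):
        # Smoothed target: spread `smoothing` uniformly, keep the rest on the true class
        q = smoothing / num_classes + (1.0 - smoothing if k == target else 0.0)
        # (1 - p)^gamma shrinks the contribution of confidently predicted classes
        loss -= weights[k] * q * (1.0 - p) ** gamma * math.log(max(p, 1e-12))
    return loss

# Binary head: index 0 = Real (weight 1.5), index 1 = AI (weight 1.0)
easy = focal_loss([0.9, 0.1], target=0, class_weights=[1.5, 1.0])
hard = focal_loss([0.3, 0.7], target=0, class_weights=[1.5, 1.0])
print(hard > easy)  # prints True
```

A confidently correct real photo contributes far less loss than a misclassified one, and the 1.5 weight scales up terms for the real class.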

v1 vs v2 Comparison

Feature	v1	v2
Architecture	Single-head	Dual-head
FLUX Detection	No	Yes
Midjourney Detection	Basic	Improved
Binary Classification	Derived from top-1	Dedicated head

Files in This Repository

File	Description
`config.json`	Model configuration
`model.safetensors`	Model weights
`preprocessor_config.json`	Image processor config
`source_meta.json`	Source names and metadata
`binary_head.pt`	Binary classification head weights
`handler.py`	Custom inference handler

API / Inference Endpoint

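The example below posts a base64-encoded image to the hosted model URL. For production use, deploy the model as a dedicated Hugging Face Inference Endpoint and replace API_URL with that endpoint's URL: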

import requests
import base64

API_URL = "https://api-inference.huggingface.co/models/boluobobo/ItsNotAI-ai-detector-v2"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}

with open("image.jpg", "rb") as f:
    image_base64 = base64.b64encode(f.read()).decode()

response = requests.post(API_URL, headers=headers, json={"inputs": image_base64}, timeout=60)
response.raise_for_status()  # surface HTTP errors instead of printing an error body
print(response.json())

FAQ

Q: What's the difference between v1 and v2?
A: v2 adds a dedicated binary classification head for improved Real/AI detection, plus enhanced FLUX and Midjourney detection.

Q: Can this detect FLUX images?
A: Yes. v2 is specifically trained on FLUX-generated images.

Q: Can this detect Midjourney images?
A: Yes, with improved detection compared to v1.

Q: Should I use the binary head or the multi-class head?
A: Use the binary head for Real/AI classification (more accurate), and the multi-class head if you need to identify the specific AI generator.

Q: What image formats are supported?
A: PNG, JPG, WEBP, and other common formats. Images are automatically resized to 224x224 for processing.

Limitations

Best performance on 224x224 or larger images
May have reduced accuracy on heavily compressed images
Trained primarily on Western-style images
New AI generators not in training data may not be correctly identified
FLUX detection is based on limited training samples

Citation

@misc{itsnotai2025v2,
  title={ItsNotAI v2: Dual-Head AI Image Detection},
  author={ItsNotAI Team},
  year={2025},
  url={https://huggingface.co/boluobobo/ItsNotAI-ai-detector-v2}
}

License

Apache 2.0
