tags:
  - image-classification
  - fake-detection
  - anomaly-detection
  - one-class-learning
  - deepfake-detection
  - computer-vision
license: mit

🎯 Fake Image Detection Ensemble (9 Models)

An ensemble of 9 specialized models for detecting fake or AI-generated images with one-class anomaly detection. The models are trained only on real images to learn what "normal" looks like, and they flag images that deviate from it as fake.

📊 Performance

| Metric | Score |
|---|---|
| Accuracy | 67.05% |
| Precision | 87.97% |
| Recall | 39.50% |
| F1 Score | 54.52% |

Confusion Matrix

  • True Negatives: 946 (real correctly identified)
  • False Positives: 54 (real misclassified as fake)
  • False Negatives: 605 (fake misclassified as real)
  • True Positives: 395 (fake correctly identified)
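The four scores in the Performance section follow directly from these counts, with "fake" as the positive class. A short check in Python:

```python
# Confusion-matrix counts reported above (positive class = fake).
tn, fp, fn, tp = 946, 54, 605, 395

total = tn + fp + fn + tp
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
false_positive_rate = fp / (fp + tn)  # share of real images flagged as fake

for name, value in [
    ("Accuracy", accuracy),
    ("Precision", precision),
    ("Recall", recall),
    ("F1 Score", f1),
    ("False positive rate", false_positive_rate),
]:
    print(f"{name}: {value:.2%}")
```

The high precision and low recall mean that a "FAKE" verdict is usually right, while many fakes pass as real.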

🏗️ Architecture

The ensemble combines 9 specialized models using different detection strategies:

Deep Learning Models (3):

  1. Enhanced Frequency VAE - Multi-scale frequency analysis with phase information

    • Uses both magnitude and phase of FFT
    • Spectral consistency loss
    • Detects frequency-domain artifacts
  2. Edge Normalizing Flow - Probability density estimation on edge features

    • Multi-scale edge analysis
    • Normalizing flow architecture
    • Detects unnatural edge patterns
  3. Semantic Deep SVDD - ResNet50-based hypersphere anomaly detection

    • Semantic feature extraction
    • One-class deep learning
    • Detects high-level semantic anomalies
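To illustrate the "magnitude and phase of FFT" input mentioned for the Enhanced Frequency VAE, here is a minimal NumPy sketch. The function name `fft_magnitude_phase`, the log scaling, and the grayscale input are assumptions for illustration; the repository's actual preprocessing is not documented here.

```python
import numpy as np


def fft_magnitude_phase(gray: np.ndarray) -> np.ndarray:
    """Return a (2, H, W) array: log-magnitude and phase of the centered 2-D FFT."""
    spectrum = np.fft.fftshift(np.fft.fft2(gray))  # move the zero frequency to the center
    log_magnitude = np.log1p(np.abs(spectrum))     # compress the large dynamic range
    phase = np.angle(spectrum)                     # radians in [-pi, pi]
    return np.stack([log_magnitude, phase])


# Random array standing in for a 256x256 grayscale image (assumption).
rng = np.random.default_rng(0)
gray = rng.random((256, 256))
features = fft_magnitude_phase(gray)
print(features.shape)
```

A model that sees both channels can pick up on artifacts that leave the magnitude spectrum largely intact but disturb the phase.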

Traditional ML Models (6):

  1. Texture One-Class SVM - Boundary-based detection

    • Enhanced texture features
    • RBF kernel
    • Tight decision boundary (nu=0.03)
  2. Isolation Forest - Isolation-based anomaly detection

    • 200 estimators
    • Frequency + spatial features
    • Fast inference
  3. Local Outlier Factor - Local density anomalies

    • Multi-scale patch analysis
    • Novelty detection mode
    • 20 neighbors
  4. Gaussian Mixture Model - Distribution modeling

    • 10 components
    • Full covariance
    • Color distribution analysis
  5. Color Distribution Model - Statistical color analysis

    • RGB histograms
    • Mahalanobis distance
    • Color moment analysis
  6. Statistical Model - Edge and color statistics

    • Sobel edge detection
    • Multi-scale analysis
    • Mahalanobis distance
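The hyperparameters listed for the first four traditional models map onto standard scikit-learn estimators. The sketch below instantiates them with those values and fits them on real-only data. It uses one synthetic feature matrix for all four, which is an assumption: in the card each model has its own feature extractor (texture, frequency + spatial, patches, color), and those are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.mixture import GaussianMixture
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Placeholder features: 500 "real" training samples, then 5 inliers + 5 shifted outliers.
real_features = rng.normal(size=(500, 8))
query = np.vstack([rng.normal(size=(5, 8)), rng.normal(loc=6.0, size=(5, 8))])

detectors = {
    "ocsvm": OneClassSVM(kernel="rbf", nu=0.03),
    "iforest": IsolationForest(n_estimators=200, random_state=0),
    "lof": LocalOutlierFactor(n_neighbors=20, novelty=True),
    "gmm": GaussianMixture(n_components=10, covariance_type="full", random_state=0),
}

scores = {}
for name, detector in detectors.items():
    detector.fit(real_features)  # one-class training: real samples only
    # score_samples is "higher = more normal" for all four; negate for an anomaly score.
    scores[name] = -detector.score_samples(query)
    print(name, np.round(scores[name], 2))
```

The raw scores live on different scales (kernel values, path lengths, density ratios, log-likelihoods), so they would need normalizing before being combined in an ensemble.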

🎓 Training Details

  • Training Data: 30,000 real images from COCO dataset
  • Training Approach: Single-class anomaly detection (NO fake images used)
  • Validation Split: 20% (6,000 images)
  • Test Set: 1,000 real + 1,000 fake images (completely separate)
  • Training Time: ~5-6 hours on GPU
  • Ensemble Method: Weighted voting with adaptive threshold
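The card says the threshold is auto-calibrated at the 95th percentile but does not say on which scores. A minimal sketch, assuming it is calibrated on ensemble anomaly scores of held-out real images, with synthetic scores standing in for the real ones:

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder: anomaly scores of 6,000 held-out real images (synthetic, for illustration).
val_scores_real = rng.normal(loc=0.0, scale=1.0, size=6000)

# Anything scoring above the 95th percentile of real-image scores is called fake.
threshold = np.percentile(val_scores_real, 95)


def is_fake(score: float) -> bool:
    """Flag an image whose anomaly score exceeds the calibrated threshold."""
    return bool(score > threshold)


flagged = float(np.mean(val_scores_real > threshold))
print(f"threshold={threshold:.3f}, real images flagged={flagged:.1%}")
```

By construction about 5% of the real calibration images exceed the threshold, which is in line with the 54 false positives among the 1,000 real test images reported above.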

Model Training Times (Extended):

  • Enhanced Frequency VAE: 45 minutes
  • Texture One-Class SVM: 45 minutes
  • Color Distribution Model: 30 minutes
  • Edge Normalizing Flow: 45 minutes
  • Semantic Deep SVDD: 45 minutes
  • Statistical Model: 30 minutes
  • Isolation Forest: 30 minutes
  • Local Outlier Factor: 35 minutes
  • Gaussian Mixture Model: 30 minutes

🚀 Quick Start

import torch
from torchvision import transforms
from PIL import Image
import pickle
import json
from huggingface_hub import hf_hub_download

# Configuration
repo_id = "ash12321/fake-image-detection-ensemble"
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Download and load config
config_path = hf_hub_download(repo_id=repo_id, filename="config.json")
with open(config_path, 'r') as f:
    config = json.load(f)

# Load models (you need the model class definitions)
# Example for one model:
vae_path = hf_hub_download(repo_id=repo_id, filename="freq_vae.pth")
# freq_vae = EnhancedFreqVAE()
# freq_vae.load_state_dict(torch.load(vae_path, map_location=device))
# freq_vae.to(device)

# Load all other models similarly...

# Predict on new image
img = Image.open('test_image.jpg')
img = img.resize((256, 256), Image.LANCZOS).convert('RGB')

tfm = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize([0.485,0.456,0.406], [0.229,0.224,0.225])
])
img_tensor = tfm(img)

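# Get prediction from ensemble
# NOTE: `ensemble` is not defined in this snippet. It stands for an object that
# wraps all nine loaded models and exposes predict(); build it after loading them.
is_fake, score, individual_scores = ensemble.predict(img_tensor, device)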
print(f"Prediction: {'FAKE' if is_fake else 'REAL'}")
print(f"Anomaly Score: {score:.4f}")
print(f"Individual model scores: {individual_scores}")
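The Quick Start calls `ensemble.predict(...)` without showing what `ensemble` is. The class below is a hypothetical stand-in, not the repository's implementation: `WeightedEnsemble`, its constructor arguments, and the dummy scorers are all assumptions. It reads "weighted voting" as a weighted average of per-model anomaly scores compared against a threshold, which is only one plausible interpretation.

```python
from typing import Any, Callable, Dict, Tuple

# A scorer maps (image, device) to an anomaly score; higher means more anomalous.
Scorer = Callable[[Any, str], float]


class WeightedEnsemble:
    """Hypothetical wrapper that mirrors the predict() return value used above."""

    def __init__(self, scorers: Dict[str, Scorer], weights: Dict[str, float], threshold: float):
        self.scorers = scorers
        self.weights = weights
        self.threshold = threshold

    def predict(self, image: Any, device: str = "cpu") -> Tuple[bool, float, Dict[str, float]]:
        # Score the image with every model, then take the weighted average.
        individual = {name: scorer(image, device) for name, scorer in self.scorers.items()}
        total_weight = sum(self.weights[name] for name in individual)
        score = sum(self.weights[name] * s for name, s in individual.items()) / total_weight
        return score > self.threshold, score, individual


# Dummy scorers with made-up constant outputs, just to exercise the interface.
ensemble = WeightedEnsemble(
    scorers={"freq_vae": lambda img, dev: 0.9, "iforest": lambda img, dev: 0.3},
    weights={"freq_vae": 2.0, "iforest": 1.0},
    threshold=0.5,
)
is_fake, score, individual_scores = ensemble.predict(image=None)
print(is_fake, round(score, 4), individual_scores)
```

In a real setup each scorer would wrap one loaded model together with its feature extractor, and the weights and threshold would presumably come from `config.json`.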

📦 Model Files

| File | Description | Size |
|---|---|---|
| freq_vae.pth | Enhanced Frequency VAE weights | ~100 MB |
| semantic_svdd.pth | Semantic Deep SVDD weights | ~90 MB |
| edge_flow.pth | Edge Normalizing Flow weights | ~5 MB |
| texture_ocsvm.pkl | Texture One-Class SVM | ~200 MB |
| iforest.pkl | Isolation Forest | ~150 MB |
| lof.pkl | Local Outlier Factor | ~180 MB |
| gmm.pkl | Gaussian Mixture Model | ~50 MB |
| color_model.pkl | Color Distribution Model | ~10 MB |
| stat.pkl | Statistical Model | ~5 MB |
| config.json | Ensemble configuration | <1 MB |
| results_summary.json | Training metrics | <1 MB |

🔧 Requirements

torch>=2.0.0
torchvision>=0.15.0
numpy>=1.24.0
pillow>=9.0.0
scikit-learn>=1.3.0
scipy>=1.10.0
huggingface_hub>=0.19.0

🎯 Use Cases

  • Deepfake Detection: Identify AI-generated faces
  • Image Forensics: Detect manipulated images
  • Content Moderation: Filter synthetic content
  • Research: Study AI-generated image characteristics
  • Quality Control: Verify image authenticity

⚠️ Limitations

  • Trained on COCO real images - performance may vary on other domains
  • Requires 256×256 input resolution
  • May struggle with heavily compressed or low-quality images
  • Performance depends on similarity between training and test distributions
  • Not designed for adversarial attacks

📈 Model Improvements

This version includes several accuracy enhancements:

  1. Phase Information: VAE uses both magnitude and phase of FFT
  2. Enhanced Features: More comprehensive texture and edge features
  3. Adaptive Threshold: Auto-calibrated at 95th percentile
  4. Optimized Weights: Balanced ensemble voting
  5. Extended Training: Up to 45 minutes per model for better convergence

πŸ“ Citation

@misc{fake-detection-ensemble-2024,
  author = {ash12321},
  title = {Fake Image Detection Ensemble - 9 Model System},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/ash12321/fake-image-detection-ensemble}}
}

📄 License

MIT License - Free for research and commercial use

πŸ™ Acknowledgments

  • COCO Dataset for training data
  • PyTorch and scikit-learn communities
  • Hugging Face for model hosting

📞 Contact

Questions? Issues? Open an issue or discussion on this repository!


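Note: This model was trained with one-class learning on real images only, so it does not depend on any particular generator and can flag types of fake images not seen during training. The ensemble combines multiple detection strategies. On the test set this yields 87.97% precision at 39.50% recall, so a "FAKE" prediction is usually correct, but many fakes are still classified as real.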