---
license: mit
tags:
- computer-vision
- anomaly-detection
- deep-svdd
- ai-generated-images
- image-classification
- pytorch-lightning
datasets:
- cifar10
library_name: pytorch-lightning
pipeline_tag: image-classification
---

# 🔍 AI Image Detector - Deep SVDD
**One-Class Deep Learning Model for Detecting AI-Generated Images**

[![Model](https://img.shields.io/badge/Model-Deep%20SVDD-blue)](https://huggingface.co/ash12321/ai-image-detector-deepsvdd) [![Framework](https://img.shields.io/badge/Framework-PyTorch%20Lightning-red)](https://lightning.ai/) [![Dataset](https://img.shields.io/badge/Dataset-CIFAR--10-green)](https://www.cs.toronto.edu/~kriz/cifar.html)
## 📖 Model Description

This model detects AI-generated images using **Deep Support Vector Data Description (Deep SVDD)**, a one-class learning approach. It was trained exclusively on real images to learn a compact representation of what "real" looks like, so synthetic/AI-generated images fall outside the learned hypersphere and are flagged as anomalies.

### Key Features

- ✅ **Enhanced Deep SVDD architecture** with channel attention mechanisms
- ✅ **Trained on 35,000 real images** from the CIFAR-10 dataset
- ✅ **L4 GPU optimized** with 16-bit mixed precision training
- ✅ **Advanced augmentation**: Mixup, multi-scale, contrastive learning
- ✅ **Robust evaluation**: 70/15/15 train/val/test split with unseen test data

## 🎯 Performance Metrics

| Metric | Value |
|--------|-------|
| **Test Loss** | 0.7637 |
| **Mean Distance** | 0.7637 |
| **Std Distance** | 0.0024 |
| **95th Percentile** | 0.7700 |
| **Radius Threshold** | 0.7747 |

## 🚀 Quick Start

### Installation

```bash
pip install torch torchvision pytorch-lightning huggingface-hub pillow
```

### Basic Usage

```python
import torch
from huggingface_hub import hf_hub_download
from PIL import Image
import torchvision.transforms as transforms

# Download the checkpoint from the Hub
model_path = hf_hub_download(
    repo_id="ash12321/ai-image-detector-deepsvdd",
    filename="model.ckpt"
)

# Load the model (you'll need the model class definition)
from model import AdvancedDeepSVDD

model = AdvancedDeepSVDD.load_from_checkpoint(model_path)
model.eval()

# Preprocess: resize to the 32x32 training resolution and
# normalize with CIFAR-10 channel statistics
transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.4914, 0.4822, 0.4465],
        std=[0.2470, 0.2435, 0.2616]
    )
])

image = Image.open('test_image.jpg').convert('RGB')
image_tensor = transform(image).unsqueeze(0)

# Predict
is_fake, scores, distances = model.predict_anomaly(image_tensor)
print(f"AI-Generated: {bool(is_fake[0].item())}")
print(f"Anomaly Score: {scores[0].item():.4f}")
print(f"Distance from Center: {distances[0].item():.4f}")
```

### Using with Gradio

```python
import gradio as gr

# Reuses `model` and `transform` from the Basic Usage snippet above
def predict(image):
    img_tensor = transform(image).unsqueeze(0)
    is_fake, scores, _ = model.predict_anomaly(img_tensor)
    result = "🚨 AI-Generated" if is_fake[0] else "✅ Real Image"
    return f"**{result}** (Anomaly score: {scores[0].item():.4f})"

demo = gr.Interface(
    fn=predict,
    inputs=gr.Image(type="pil"),
    outputs=gr.Markdown(),
    title="AI Image Detector"
)
demo.launch()
```

## 🏗️ Architecture Details

### Enhanced Deep SVDD Encoder

```
Input (3x32x32)
  → Stem Conv (64 channels)
  → Layer1 (64→128) + Channel Attention
  → Layer2 (128→256) + Channel Attention
  → Layer3 (256→512) + Channel Attention
  → Dual Pooling (Avg + Max)
  → Projection Head (1024→512→128)
  → Output (128-dim latent space)
```

### Training Optimizations

- **Optimizer**: AdamW (lr=1e-3, weight_decay=1e-3)
- **Scheduler**: OneCycleLR with cosine annealing
- **Batch Size**: 128 (L4 GPU optimized)
- **Augmentation**: Mixup (α=0.2), multi-scale, extensive transforms
- **Loss**: SVDD objective + contrastive diversity + L2 regularization

## 📊 Training Configuration

```
Model Parameters: 5.3M trainable
Epochs: 30
Training Samples: 35,000 (70%)
Validation Samples: 7,500 (15%)
Test Samples: 7,500 (15%)
Precision: 16-bit mixed precision
GPU: NVIDIA L4 with Tensor Cores
```

## 🎨 Data Augmentation Pipeline

**Training Augmentations:**

- Multi-scale resizing (32, 64, 96 pixels)
- Random resized crop (scale: 0.5-1.0)
- Random horizontal/vertical flips
- Random rotation (±20°)
- Color jitter (brightness, contrast, saturation, hue)
- Gaussian blur
- Random erasing
- Mixup augmentation

**Validation/Test:**

- Simple resize to 32x32
- Normalize with CIFAR-10 statistics

## 💡 Use Cases

- **Content Moderation**: Identify AI-generated images in uploads
- **Digital Forensics**: Verify the authenticity of images
- **Research**: Study differences between real and synthetic images
- **Education**: Demonstrate one-class learning techniques

## ⚠️ Limitations

- **Training Domain**: Optimized for natural images similar to CIFAR-10
- **Image Size**: Trained on 32x32 images; larger inputs must be resized down, which discards detail
- **Generalization**: May require fine-tuning for specific domains
- **False Positives**: Unusual real images may be flagged as AI-generated
- **Not Foolproof**: Sophisticated AI-generated images may evade detection

## 📚 Citation

If you use this model in your research, please cite:

```bibtex
@misc{ai-image-detector-deepsvdd-2024,
  author = {ash12321},
  title = {AI Image Detector using Deep SVDD},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/ash12321/ai-image-detector-deepsvdd}},
}
```

## 📄 License

This model is released under the MIT License.

## 🤝 Contributing

Contributions, issues, and feature requests are welcome!

## 👤 Author

**ash12321**

- Hugging Face: [@ash12321](https://huggingface.co/ash12321)

## 🙏 Acknowledgments

- CIFAR-10 dataset creators
- PyTorch Lightning team
- Deep SVDD paper authors
- Hugging Face for hosting infrastructure

---
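## 🧮 Appendix: How the Radius Threshold Is Applied

As a minimal sketch of the decision rule behind the metrics table: in Deep SVDD, an input's embedding is compared against the learned hypersphere center, and the input is flagged as anomalous when its Euclidean distance exceeds the radius threshold (0.7747 reported above). The `center` tensor and embeddings below are toy placeholders, not part of the released checkpoint's API.

```python
import torch

RADIUS_THRESHOLD = 0.7747  # radius reported in the metrics table above

def anomaly_decision(embedding: torch.Tensor, center: torch.Tensor,
                     threshold: float = RADIUS_THRESHOLD):
    """Flag an embedding as anomalous (AI-generated) when its Euclidean
    distance to the hypersphere center exceeds the radius threshold."""
    distance = torch.norm(embedding - center, p=2, dim=-1)
    return distance > threshold, distance

# Toy example in a hypothetical 128-dim latent space
center = torch.zeros(128)
near = center + 0.05   # distance = 0.05 * sqrt(128) ≈ 0.57 → inside the sphere
far = center + 0.10    # distance = 0.10 * sqrt(128) ≈ 1.13 → outside the sphere

is_fake_near, d_near = anomaly_decision(near, center)
is_fake_far, d_far = anomaly_decision(far, center)
print(bool(is_fake_near), bool(is_fake_far))  # False True
```

The released model wraps this comparison inside `predict_anomaly`, so you normally never compute the distance yourself; the sketch is only meant to make the one-class decision rule concrete.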
**[Try it on Hugging Face Spaces](https://huggingface.co/spaces/ash12321/ai-image-detector-demo)**
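## 🔧 Appendix: Channel Attention Sketch

The encoder diagram in Architecture Details attaches a "Channel Attention" block to each layer. The exact implementation is not published here; a common form is squeeze-and-excitation style attention, sketched below under that assumption (the `reduction` ratio of 16 is likewise a guess).

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention: pool each channel
    to a scalar, pass through a small bottleneck MLP, and rescale the
    feature map channel-wise with the resulting weights."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        weights = self.fc(x.mean(dim=(2, 3)))  # squeeze: global average pool
        return x * weights.view(b, c, 1, 1)    # excite: per-channel rescale

# Shape check with a Layer1-sized feature map (128 channels)
x = torch.randn(2, 128, 16, 16)
out = ChannelAttention(128)(x)
print(out.shape)  # torch.Size([2, 128, 16, 16])
```

Because the sigmoid weights only rescale channels, the block preserves the input shape and can be dropped after any convolutional stage, which matches how the diagram attaches it to Layer1–Layer3.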