---
tags:
- image-classification
- fake-detection
- anomaly-detection
- one-class-learning
- deepfake-detection
- computer-vision
license: mit
---
# 🎯 Fake Image Detection Ensemble (9 Models)
An ensemble of 9 specialized models for detecting fake or AI-generated images using **single-class anomaly detection**. The models are trained only on real images to learn what "normal" looks like, and then flag fakes as anomalies.
## 📊 Performance
| Metric | Score |
|--------|-------|
| **Accuracy** | 67.05% |
| **Precision** | 87.97% |
| **Recall** | 39.50% |
| **F1 Score** | 54.52% |
### Confusion Matrix
- True Negatives: 946 (real correctly identified)
- False Positives: 54 (real misclassified as fake)
- False Negatives: 605 (fake misclassified as real)
- True Positives: 395 (fake correctly identified)
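
The four scores in the table follow directly from these counts. As a quick check, here is a minimal sketch that recomputes them, treating "fake" as the positive class (the function name `classification_metrics` is only illustrative):

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Counts from the confusion matrix above ("fake" is the positive class).
metrics = classification_metrics(tp=395, fp=54, tn=946, fn=605)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
# accuracy: 67.05%
# precision: 87.97%
# recall: 39.50%
# f1: 54.52%
```

The high precision and low recall mean that a "fake" verdict is usually right, but most fakes in the test set are still classified as real.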
## 🏗️ Architecture
The ensemble combines 9 specialized models using different detection strategies:
### Deep Learning Models (3):
1. **Enhanced Frequency VAE** - Multi-scale frequency analysis with phase information
   - Uses both magnitude and phase of FFT
   - Spectral consistency loss
   - Detects frequency-domain artifacts
2. **Edge Normalizing Flow** - Probability density estimation on edge features
   - Multi-scale edge analysis
   - Normalizing flow architecture
   - Detects unnatural edge patterns
3. **Semantic Deep SVDD** - ResNet50-based hypersphere anomaly detection
   - Semantic feature extraction
   - One-class deep learning
   - Detects high-level semantic anomalies
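
To illustrate what "magnitude and phase of FFT" means as an input representation, here is a minimal NumPy sketch. It is not the model's actual preprocessing: the function name, the log scaling, and the random stand-in image are all assumptions made for illustration.

```python
import numpy as np


def fft_magnitude_phase(gray: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Return log-magnitude and phase of the centred 2-D FFT of a grayscale image."""
    spectrum = np.fft.fftshift(np.fft.fft2(gray))  # move the DC term to the centre
    log_magnitude = np.log1p(np.abs(spectrum))     # compress the dynamic range
    phase = np.angle(spectrum)                     # radians in [-pi, pi]
    return log_magnitude, phase


# Random array standing in for a 256x256 grayscale image (assumption).
rng = np.random.default_rng(0)
gray = rng.random((256, 256))

log_mag, phase = fft_magnitude_phase(gray)
features = np.stack([log_mag, phase])  # two-channel frequency representation
print(features.shape)  # (2, 256, 256)
```

A two-channel array like `features` is one plausible way to feed both quantities to a convolutional VAE; the real model may arrange them differently.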
### Traditional ML Models (6):
4. **Texture One-Class SVM** - Boundary-based detection
   - Enhanced texture features
   - RBF kernel
   - Tight decision boundary (nu=0.03)
5. **Isolation Forest** - Isolation-based anomaly detection
   - 200 estimators
   - Frequency + spatial features
   - Fast inference
6. **Local Outlier Factor** - Local density anomalies
   - Multi-scale patch analysis
   - Novelty detection mode
   - 20 neighbors
7. **Gaussian Mixture Model** - Distribution modeling
   - 10 components
   - Full covariance
   - Color distribution analysis
8. **Color Distribution Model** - Statistical color analysis
   - RGB histograms
   - Mahalanobis distance
   - Color moment analysis
9. **Statistical Model** - Edge and color statistics
   - Sobel edge detection
   - Multi-scale analysis
   - Mahalanobis distance
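
The hyperparameters listed above map onto scikit-learn estimators roughly as sketched below. This is only an illustration under stated assumptions: the feature vectors are random placeholders rather than the texture, frequency, or color features the real models use, and the Mahalanobis score is a generic version of the distance named for models 8 and 9.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.mixture import GaussianMixture
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
real_features = rng.normal(size=(1000, 8))       # placeholder features of real images
query_features = rng.normal(size=(5, 8)) + 4.0   # shifted points standing in for anomalies

detectors = {
    "ocsvm": OneClassSVM(kernel="rbf", nu=0.03),
    "iforest": IsolationForest(n_estimators=200, random_state=0),
    "lof": LocalOutlierFactor(n_neighbors=20, novelty=True),
    "gmm": GaussianMixture(n_components=10, covariance_type="full", random_state=0),
}

anomaly_scores = {}
for name, model in detectors.items():
    model.fit(real_features)  # one-class training: real samples only
    # score_samples is higher for more "normal" points in all four estimators,
    # so negate it to get "higher = more anomalous".
    anomaly_scores[name] = -model.score_samples(query_features)

# Mahalanobis distance to the distribution of real features.
mean = real_features.mean(axis=0)
cov_inv = np.linalg.pinv(np.cov(real_features, rowvar=False))
diff = query_features - mean
anomaly_scores["mahalanobis"] = np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))

for name, values in anomaly_scores.items():
    print(name, np.round(values, 3))
```

The raw scores live on very different scales, so they would need to be normalized per model before being combined.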
## 🎓 Training Details
- **Training Data**: 30,000 real images from COCO dataset
- **Training Approach**: Single-class anomaly detection (NO fake images used)
- **Validation Split**: 20% (6,000 images)
- **Test Set**: 1,000 real + 1,000 fake images (completely separate)
- **Training Time**: ~5-6 hours on GPU
- **Ensemble Method**: Weighted voting with adaptive threshold
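
As a rough illustration of weighted voting, the sketch below combines per-model anomaly scores into a single score and compares it with a threshold. The score values, weights, threshold, and the function name `weighted_vote` are hypothetical; the actual values live in `config.json` and the ensemble code, which are not reproduced here.

```python
def weighted_vote(scores: dict, weights: dict, threshold: float):
    """Combine per-model anomaly scores into one score and a fake/real decision.

    Assumes every score is already normalized to a comparable range
    (for example [0, 1], with higher meaning more anomalous).
    """
    total_weight = sum(weights[name] for name in scores)
    combined = sum(weights[name] * scores[name] for name in scores) / total_weight
    return combined > threshold, combined


# Hypothetical normalized scores and weights for three of the nine models.
scores = {"freq_vae": 0.8, "edge_flow": 0.4, "semantic_svdd": 0.6}
weights = {"freq_vae": 2.0, "edge_flow": 1.0, "semantic_svdd": 1.0}

is_fake, combined = weighted_vote(scores, weights, threshold=0.5)
print(is_fake, round(combined, 2))  # True 0.65
```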
### Model Training Times (Extended):
- Enhanced Frequency VAE: 45 minutes
- Texture One-Class SVM: 45 minutes
- Color Distribution Model: 30 minutes
- Edge Normalizing Flow: 45 minutes
- Semantic Deep SVDD: 45 minutes
- Statistical Model: 30 minutes
- Isolation Forest: 30 minutes
- Local Outlier Factor: 35 minutes
- Gaussian Mixture Model: 30 minutes
## 🚀 Quick Start
```python
import json
import pickle  # needed to load the .pkl models

import torch
from huggingface_hub import hf_hub_download
from PIL import Image
from torchvision import transforms

# Configuration
repo_id = "ash12321/fake-image-detection-ensemble"
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Download and load the ensemble configuration
config_path = hf_hub_download(repo_id=repo_id, filename="config.json")
with open(config_path, 'r') as f:
    config = json.load(f)

# Load models (you need the model class definitions)
# Example for one model:
vae_path = hf_hub_download(repo_id=repo_id, filename="freq_vae.pth")
# freq_vae = EnhancedFreqVAE()
# freq_vae.load_state_dict(torch.load(vae_path, map_location=device))
# freq_vae.to(device)
# freq_vae.eval()
# Load all other models similarly...

# Preprocess a new image: RGB, 256x256, ImageNet normalization
img = Image.open('test_image.jpg').convert('RGB')
img = img.resize((256, 256), Image.LANCZOS)
tfm = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])
img_tensor = tfm(img)

# Get prediction from the ensemble.
# NOTE: `ensemble` is not defined in this snippet. It stands for an object that
# wraps the nine loaded models and exposes `predict`.
is_fake, score, individual_scores = ensemble.predict(img_tensor, device)
print(f"Prediction: {'FAKE' if is_fake else 'REAL'}")
print(f"Anomaly Score: {score:.4f}")
print(f"Individual model scores: {individual_scores}")
```
## 📦 Model Files
| File | Description | Size |
|------|-------------|------|
| `freq_vae.pth` | Enhanced Frequency VAE weights | ~100 MB |
| `semantic_svdd.pth` | Semantic Deep SVDD weights | ~90 MB |
| `edge_flow.pth` | Edge Normalizing Flow weights | ~5 MB |
| `texture_ocsvm.pkl` | Texture One-Class SVM | ~200 MB |
| `iforest.pkl` | Isolation Forest | ~150 MB |
| `lof.pkl` | Local Outlier Factor | ~180 MB |
| `gmm.pkl` | Gaussian Mixture Model | ~50 MB |
| `color_model.pkl` | Color Distribution Model | ~10 MB |
| `stat.pkl` | Statistical Model | ~5 MB |
| `config.json` | Ensemble configuration | <1 MB |
| `results_summary.json` | Training metrics | <1 MB |
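
The `.pkl` files can presumably be loaded along the lines of the sketch below. It assumes they were written with Python's standard `pickle` module (not confirmed here; they may have been saved with `joblib` instead), and the helper name `load_pickled_model` is hypothetical. Unpickling executes arbitrary code, so only load files from a source you trust.

```python
import pickle
from pathlib import Path


def load_pickled_model(path):
    """Unpickle a model file from a local path. Only use on files you trust."""
    with Path(path).open("rb") as f:
        return pickle.load(f)


# Usage (needs network access, plus scikit-learn installed to rebuild the estimator):
# from huggingface_hub import hf_hub_download
# path = hf_hub_download(repo_id="ash12321/fake-image-detection-ensemble",
#                        filename="iforest.pkl")
# iforest = load_pickled_model(path)
```

Custom classes such as the Color Distribution Model must be importable under the same module path they had when the file was saved, otherwise unpickling fails.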
## 🔧 Requirements
```
torch>=2.0.0
torchvision>=0.15.0
numpy>=1.24.0
pillow>=9.0.0
scikit-learn>=1.3.0
scipy>=1.10.0
huggingface_hub>=0.19.0
```
## 🎯 Use Cases
- **Deepfake Detection**: Identify AI-generated faces
- **Image Forensics**: Detect manipulated images
- **Content Moderation**: Filter synthetic content
- **Research**: Study AI-generated image characteristics
- **Quality Control**: Verify image authenticity
## ⚠️ Limitations
- Trained on COCO real images - performance may vary on other domains
- Requires 256×256 input resolution
- May struggle with heavily compressed or low-quality images
- Performance depends on similarity between training and test distributions
- Not designed to withstand adversarial attacks
## 📈 Model Improvements
This version includes several accuracy enhancements:
1. **Phase Information**: VAE uses both magnitude and phase of FFT
2. **Enhanced Features**: More comprehensive texture and edge features
3. **Adaptive Threshold**: Auto-calibrated at 95th percentile
4. **Optimized Weights**: Balanced ensemble voting
5. **Extended Training**: Up to 45 minutes per model for better convergence
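
One common way to calibrate a threshold at the 95th percentile is to take that percentile of the anomaly scores on held-out real images. The sketch below assumes this interpretation and uses synthetic scores. It is not the repository's calibration code.

```python
import numpy as np


def calibrate_threshold(real_scores, percentile: float = 95.0) -> float:
    """Return the score below which `percentile` percent of real images fall."""
    return float(np.percentile(real_scores, percentile))


# Synthetic anomaly scores standing in for 6,000 validation images (assumption).
rng = np.random.default_rng(0)
val_real_scores = rng.normal(loc=0.0, scale=1.0, size=6000)

threshold = calibrate_threshold(val_real_scores)
false_positive_rate = float(np.mean(val_real_scores > threshold))
print(f"threshold={threshold:.3f}, FPR on calibration set={false_positive_rate:.3f}")
```

Under this reading, about 5% of real images are expected to exceed the threshold, which would be consistent with the 54 false positives out of 1,000 real test images reported in the confusion matrix.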
## 📝 Citation
```bibtex
@misc{fake-detection-ensemble-2024,
author = {ash12321},
title = {Fake Image Detection Ensemble - 9 Model System},
year = {2024},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/ash12321/fake-image-detection-ensemble}}
}
```
## 📄 License
MIT License - Free for research and commercial use
## 🙏 Acknowledgments
- COCO Dataset for training data
- PyTorch and scikit-learn communities
- Hugging Face for model hosting
## 📞 Contact
Questions? Issues? Open an issue or discussion on this repository!
---
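**Note**: This model was trained using single-class learning on real images only, so it does not rely on the artifacts of any particular generator and can, in principle, flag types of fake images not seen during training. The ensemble combines multiple detection strategies; on the test set reported above it favors precision (87.97%) over recall (39.50%), so many fakes are still classified as real.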