---
tags:
- image-classification
- fake-detection
- anomaly-detection
- one-class-learning
- deepfake-detection
- computer-vision
license: mit
---

# 🎯 Fake Image Detection Ensemble (9 Models)

An ensemble of 9 specialized models for detecting fake/AI-generated images using **single-class anomaly detection**. The models are trained only on real images to learn what "normal" looks like, and images that deviate from it are flagged as fake.

## 📊 Performance

| Metric | Score |
|--------|-------|
| **Accuracy** | 67.05% |
| **Precision** | 87.97% |
| **Recall** | 39.50% |
| **F1 Score** | 54.52% |

### Confusion Matrix
- True Negatives: 946 (real correctly identified)
- False Positives: 54 (real misclassified as fake)
- False Negatives: 605 (fake misclassified as real)
- True Positives: 395 (fake correctly identified)
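
The scores in the table can be reproduced from these four counts. The short check below treats "fake" as the positive class, as the list above does; the variable names are only illustrative.

```python
# Counts copied from the confusion matrix above ("fake" is the positive class).
tn, fp, fn, tp = 946, 54, 605, 395

total = tn + fp + fn + tp
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
false_positive_rate = fp / (fp + tn)

print(f"accuracy={accuracy:.2%} precision={precision:.2%} "
      f"recall={recall:.2%} f1={f1:.2%}")
print(f"false positive rate={false_positive_rate:.2%}")
```

The false positive rate of 5.40% is worth reading alongside the recall: few real images are flagged, but most fakes pass as real.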

## 🏗️ Architecture

The ensemble combines 9 specialized models using different detection strategies:

### Deep Learning Models (3):
1. **Enhanced Frequency VAE** - Multi-scale frequency analysis with phase information
   - Uses both magnitude and phase of FFT
   - Spectral consistency loss
   - Detects frequency-domain artifacts

2. **Edge Normalizing Flow** - Probability density estimation on edge features
   - Multi-scale edge analysis
   - Normalizing flow architecture
   - Detects unnatural edge patterns

3. **Semantic Deep SVDD** - ResNet50-based hypersphere anomaly detection
   - Semantic feature extraction
   - One-class deep learning
   - Detects high-level semantic anomalies
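
As an illustration of the "magnitude and phase of FFT" input mentioned for the Frequency VAE, here is a minimal NumPy sketch. The function name `fft_magnitude_phase`, the log scaling, and the 2-channel stacking are assumptions for the example; the card does not describe the exact preprocessing the model uses.

```python
import numpy as np

def fft_magnitude_phase(gray: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Return the log-magnitude and phase spectra of a 2-D grayscale image."""
    spectrum = np.fft.fftshift(np.fft.fft2(gray))  # move the DC term to the centre
    log_magnitude = np.log1p(np.abs(spectrum))     # compress the dynamic range
    phase = np.angle(spectrum)                     # radians in [-pi, pi]
    return log_magnitude, phase

# Random data standing in for a 256x256 grayscale image.
rng = np.random.default_rng(0)
gray = rng.random((256, 256))

log_mag, phase = fft_magnitude_phase(gray)
features = np.stack([log_mag, phase])  # hypothetical 2-channel model input
print(features.shape)
```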

### Traditional ML Models (6):
4. **Texture One-Class SVM** - Boundary-based detection
   - Enhanced texture features
   - RBF kernel
   - Tight decision boundary (nu=0.03)

5. **Isolation Forest** - Isolation-based anomaly detection
   - 200 estimators
   - Frequency + spatial features
   - Fast inference

6. **Local Outlier Factor** - Local density anomalies
   - Multi-scale patch analysis
   - Novelty detection mode
   - 20 neighbors

7. **Gaussian Mixture Model** - Distribution modeling
   - 10 components
   - Full covariance
   - Color distribution analysis

8. **Color Distribution Model** - Statistical color analysis
   - RGB histograms
   - Mahalanobis distance
   - Color moment analysis

9. **Statistical Model** - Edge and color statistics
   - Sobel edge detection
   - Multi-scale analysis
   - Mahalanobis distance
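
For reference, the hyperparameters listed for models 4 to 7 map onto scikit-learn estimators roughly as follows. This is a sketch, not the training code: the feature extractors are not specified in this card, so random vectors stand in for real features, and the dictionary keys are illustrative.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.mixture import GaussianMixture
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Placeholder features: 500 "real" training vectors, then 5 in-distribution
# and 5 strongly shifted query vectors.
real_train = rng.normal(size=(500, 8))
queries = np.vstack([rng.normal(size=(5, 8)),
                     rng.normal(loc=6.0, size=(5, 8))])

detectors = {
    "ocsvm": OneClassSVM(kernel="rbf", nu=0.03),
    "iforest": IsolationForest(n_estimators=200, random_state=0),
    "lof": LocalOutlierFactor(n_neighbors=20, novelty=True),
    "gmm": GaussianMixture(n_components=10, covariance_type="full",
                           random_state=0),
}

scores = {}
for name, model in detectors.items():
    model.fit(real_train)  # fit on real data only (one-class setting)
    # score_samples is higher for "normal" points in all four estimators,
    # so negate it to get an anomaly score (higher = more anomalous).
    scores[name] = -model.score_samples(queries)

for name, s in scores.items():
    print(name, "in-dist:", s[:5].mean().round(3), "shifted:", s[5:].mean().round(3))
```

The raw scores live on different scales per model, so they need to be normalized before they can be combined in an ensemble.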

## 🎓 Training Details

- **Training Data**: 30,000 real images from COCO dataset
- **Training Approach**: Single-class anomaly detection (NO fake images used)
- **Validation Split**: 20% (6,000 images)
- **Test Set**: 1,000 real + 1,000 fake images (completely separate)
- **Training Time**: ~5-6 hours on GPU
- **Ensemble Method**: Weighted voting with adaptive threshold
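
One plausible reading of "weighted voting with adaptive threshold" is sketched below: per-model anomaly scores are standardized, averaged with weights, and the decision threshold is set at the 95th percentile of the combined score on the real validation images. The equal weights, the z-scoring step, and the random scores are assumptions for illustration; the actual weights and combination rule live in `config.json` and the author's code.

```python
import numpy as np

rng = np.random.default_rng(0)
n_models = 9

# Hypothetical per-model anomaly scores on the 6,000 real validation images
# (rows = images, columns = models).
val_scores = rng.normal(size=(6000, n_models))
weights = np.full(n_models, 1.0 / n_models)  # equal weights as a stand-in

# Standardize each model's scores so no single model dominates the average.
mean, std = val_scores.mean(axis=0), val_scores.std(axis=0)

def combined_score(raw_scores: np.ndarray) -> np.ndarray:
    """Weighted average of z-scored per-model anomaly scores."""
    return ((raw_scores - mean) / std) @ weights

# 95th percentile of the combined score on real images: by construction,
# about 5% of real validation images end up above the threshold.
threshold = np.percentile(combined_score(val_scores), 95)

def predict(raw_scores: np.ndarray) -> np.ndarray:
    """Return True where an image is flagged as fake."""
    return combined_score(raw_scores) > threshold

flagged = predict(val_scores).mean()
print(f"threshold={threshold:.3f}, real validation images flagged={flagged:.1%}")
```

Calibrating on real images only fixes the expected false positive rate near 5% without needing any fake examples, which fits the single-class setup.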

### Model Training Times (Extended):
- Enhanced Frequency VAE: 45 minutes
- Texture One-Class SVM: 45 minutes
- Color Distribution Model: 30 minutes
- Edge Normalizing Flow: 45 minutes
- Semantic Deep SVDD: 45 minutes
- Statistical Model: 30 minutes
- Isolation Forest: 30 minutes
- Local Outlier Factor: 35 minutes
- Gaussian Mixture Model: 30 minutes

## 🚀 Quick Start

```python
import torch
from torchvision import transforms
from PIL import Image
import pickle
import json
from huggingface_hub import hf_hub_download

# Configuration
repo_id = "ash12321/fake-image-detection-ensemble"
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Download and load config
config_path = hf_hub_download(repo_id=repo_id, filename="config.json")
with open(config_path, 'r') as f:
    config = json.load(f)

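# Load models. The model class definitions are not bundled with the weights;
# you must supply them yourself (e.g. EnhancedFreqVAE below).
vae_path = hf_hub_download(repo_id=repo_id, filename="freq_vae.pth")
# freq_vae = EnhancedFreqVAE()
# freq_vae.load_state_dict(torch.load(vae_path, map_location=device))
# freq_vae.to(device).eval()

# The .pkl models are loaded with pickle. Only unpickle files you trust;
# any custom wrapper classes they reference must be importable.
iforest_path = hf_hub_download(repo_id=repo_id, filename="iforest.pkl")
with open(iforest_path, 'rb') as f:
    iforest = pickle.load(f)

# Load all other models similarly...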

# Predict on new image
# Convert to RGB first so palette/grayscale images are resampled correctly
img = Image.open('test_image.jpg').convert('RGB')
img = img.resize((256, 256), Image.LANCZOS)

tfm = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize([0.485,0.456,0.406], [0.229,0.224,0.225])
])
img_tensor = tfm(img)

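# Get prediction from ensemble
# NOTE: `ensemble` is not defined in this snippet; build it from the loaded
# models and config using your own ensemble class.
is_fake, score, individual_scores = ensemble.predict(img_tensor, device)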
print(f"Prediction: {'FAKE' if is_fake else 'REAL'}")
print(f"Anomaly Score: {score:.4f}")
print(f"Individual model scores: {individual_scores}")
```

## 📦 Model Files

| File | Description | Size |
|------|-------------|------|
| `freq_vae.pth` | Enhanced Frequency VAE weights | ~100 MB |
| `semantic_svdd.pth` | Semantic Deep SVDD weights | ~90 MB |
| `edge_flow.pth` | Edge Normalizing Flow weights | ~5 MB |
| `texture_ocsvm.pkl` | Texture One-Class SVM | ~200 MB |
| `iforest.pkl` | Isolation Forest | ~150 MB |
| `lof.pkl` | Local Outlier Factor | ~180 MB |
| `gmm.pkl` | Gaussian Mixture Model | ~50 MB |
| `color_model.pkl` | Color Distribution Model | ~10 MB |
| `stat.pkl` | Statistical Model | ~5 MB |
| `config.json` | Ensemble configuration | <1 MB |
| `results_summary.json` | Training metrics | <1 MB |

## 🔧 Requirements

```
torch>=2.0.0
torchvision>=0.15.0
numpy>=1.24.0
pillow>=9.0.0
scikit-learn>=1.3.0
scipy>=1.10.0
huggingface_hub>=0.19.0
```

## 🎯 Use Cases

- **Deepfake Detection**: Identify AI-generated faces
- **Image Forensics**: Detect manipulated images
- **Content Moderation**: Filter synthetic content
- **Research**: Study AI-generated image characteristics
- **Quality Control**: Verify image authenticity

## ⚠️ Limitations

- Trained on COCO real images - performance may vary on other domains
- Requires 256×256 input resolution
- May struggle with heavily compressed or low-quality images
- Performance depends on similarity between training and test distributions
- Not designed to withstand adversarial attacks
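
Because the models expect 256×256 input, arbitrary images need to be resized first. The helper below is a small sketch using Pillow; `to_model_input` is a hypothetical name, and the centre-crop is an assumption (the Quick Start resizes without cropping, which changes the aspect ratio instead). Whichever was used during training should be matched at inference time.

```python
from PIL import Image, ImageOps

def to_model_input(img: Image.Image, size: int = 256) -> Image.Image:
    """Convert to RGB, then centre-crop and resize to size x size."""
    rgb = img.convert("RGB")
    return ImageOps.fit(rgb, (size, size), method=Image.LANCZOS)

# Synthetic non-square grayscale image standing in for a real photo.
sample = Image.new("L", (640, 480), color=128)
prepared = to_model_input(sample)
print(prepared.size, prepared.mode)
```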

## 📈 Model Improvements

This version includes several accuracy enhancements:

1. **Phase Information**: VAE uses both magnitude and phase of FFT
2. **Enhanced Features**: More comprehensive texture and edge features
3. **Adaptive Threshold**: Auto-calibrated at 95th percentile
4. **Optimized Weights**: Balanced ensemble voting
5. **Extended Training**: Up to 45 minutes per model for better convergence

## 📝 Citation

```bibtex
@misc{fake-detection-ensemble-2024,
  author = {ash12321},
  title = {Fake Image Detection Ensemble - 9 Model System},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/ash12321/fake-image-detection-ensemble}}
}
```

## 📄 License

MIT License - Free for research and commercial use

## 🙏 Acknowledgments

- COCO Dataset for training data
- PyTorch and scikit-learn communities
- Hugging Face for model hosting

## 📞 Contact

Questions? Issues? Open an issue or discussion on this repository!

---

**Note**: This model was trained using single-class learning, so it does not depend on examples from any particular generator and can, in principle, flag types of fake images not seen during training. The ensemble combines multiple detection strategies; on the test set above it favors precision (87.97%) over recall (39.50%).