# Residual Convolutional Autoencoder Ensemble

Deep learning models for image reconstruction using residual convolutional autoencoders.

## Model Architecture

Two variants of a deep convolutional autoencoder with residual blocks:

- **Model A**: latent_dim=512, dropout=0.15
- **Model B**: latent_dim=768, dropout=0.20

### Architecture Details

```
Input:   (B, 3, 256, 256) RGB images in range [-1, 1]
Encoder: 6-layer CNN with residual blocks (256→128→64→32→16→8→4)
Latent:  Fully connected projection to latent_dim
Decoder: 6-layer transposed CNN with residual blocks (4→8→16→32→64→128→256)
Output:  (B, 3, 256, 256) reconstructed images + (B, latent_dim) latent codes
```
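
The actual implementation lives in `model.py` (not included here); a minimal sketch of one encoder stage consistent with the shapes above, with hypothetical layer choices (BatchNorm, ReLU), might look like:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """One encoder stage: two 3x3 convs with a skip connection.
    stride=2 halves the spatial resolution (e.g. 256 -> 128).
    Illustrative only; the real block definition is in model.py."""
    def __init__(self, in_ch, out_ch, stride=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
        )
        # 1x1 conv so the skip path matches channels and resolution
        self.skip = nn.Conv2d(in_ch, out_ch, 1, stride=stride)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.body(x) + self.skip(x))

block = ResidualBlock(3, 64)
out = block(torch.randn(1, 3, 256, 256))  # shape (1, 64, 128, 128)
```

Stacking six such stages takes a 256×256 input down to the 4×4 feature map that feeds the latent projection.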

## Training Details

- **Dataset**: Real images (256×256 resolution)
- **Loss**: MSE (mean squared error)
- **Optimizer**: AdamW with weight decay
- **Training**: 100+ epochs with validation monitoring
- **Best Validation Loss**:
  - Model A: 0.025486
  - Model B: 0.025033
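
The training script is not part of this repository; a minimal sketch of one AdamW + MSE training step under the setup above (using a tiny stand-in module with the same `(reconstruction, latent)` return signature, since the real `ResidualConvAutoencoder` is in `model.py`):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyAE(nn.Module):
    """Stand-in for ResidualConvAutoencoder, for illustration only."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Conv2d(3, 8, 3, stride=2, padding=1)
        self.dec = nn.ConvTranspose2d(8, 3, 4, stride=2, padding=1)

    def forward(self, x):
        z = self.enc(x)
        return self.dec(z), z.flatten(1)

model = TinyAE()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-2)

def train_step(batch):
    """One optimization step: reconstruct the batch, minimize MSE."""
    model.train()
    optimizer.zero_grad()
    reconstructed, _latent = model(batch)
    loss = F.mse_loss(reconstructed, batch)  # reconstruction target = input
    loss.backward()
    optimizer.step()
    return loss.item()

# Small images for the demo; actual training used 256x256 inputs in [-1, 1]
loss = train_step(torch.rand(2, 3, 64, 64) * 2 - 1)
```

The learning rate and weight-decay values here are placeholders; the values actually used are recorded in `config.json`.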

## Usage

```python
import torch
from PIL import Image
from torchvision import transforms
from model import ResidualConvAutoencoder, load_model

# Option 1: load a pre-trained checkpoint
model, checkpoint = load_model('model_a_best.pth', latent_dim=512, dropout=0.15)

# Option 2: create the model from scratch
model = ResidualConvAutoencoder(latent_dim=512, dropout=0.15)
model.eval()

# Prepare an image (normalize to [-1, 1])
transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
    transforms.Lambda(lambda x: x * 2 - 1)  # [0, 1] -> [-1, 1]
])
image = Image.open('your_image.jpg').convert('RGB')  # any RGB image

# Inference
with torch.no_grad():
    img_tensor = transform(image).unsqueeze(0)
    reconstructed, latent = model(img_tensor)

# Reconstruction error
error = torch.nn.functional.mse_loss(reconstructed, img_tensor)
```

## Model Files

- `model_a_best.pth` - Model A checkpoint (latent_dim=512)
- `model_b_best.pth` - Model B checkpoint (latent_dim=768)
- `model.py` - Model architecture definition
- `config.json` - Training configuration
- `training_history.json` - Full training metrics
## Research Findings

**Important Note**: These models were trained as image reconstruction autoencoders. Testing revealed that they function as **enhancement/denoising models** rather than anomaly detectors:

- ✅ Successfully reconstruct natural images
- ✅ Can denoise corrupted images (JPEG artifacts, blur, contrast)
- ⚠️ Not suitable for detecting modern AI-generated images
- ⚠️ Show negative discrimination for degraded images (reconstruct them better than clean ones)

### Performance on Synthetic Corruptions

| Corruption Type  | Separation from Real |
|------------------|----------------------|
| Noise Added      | +122.1% ✅ |
| Color Shifted    | +23.8% ⚠️ |
| Patch Corrupted  | +12.6% ❌ |
| JPEG Compressed  | -9.8% ❌ |
| Contrast Altered | -90.1% ❌ |
| Blurred          | -92.5% ❌ |

Negative percentages indicate that the model reconstructs corrupted images *better* than real images (a denoising effect).
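
The "separation from real" figures above presumably compare mean reconstruction error on corrupted versus real images; a sketch of that comparison (our own formulation, not the authors' evaluation script):

```python
import torch
import torch.nn.functional as F

def per_image_error(model, images):
    """Mean squared reconstruction error, one value per image."""
    model.eval()
    with torch.no_grad():
        reconstructed, _ = model(images)
        # keep per-element errors, then average over channels and pixels
        return F.mse_loss(reconstructed, images, reduction="none").mean(dim=(1, 2, 3))

def separation_pct(model, real, corrupted):
    """Relative gap between corrupted and real reconstruction error.
    Negative => corrupted images are reconstructed *better* (denoising)."""
    e_real = per_image_error(model, real).mean()
    e_corr = per_image_error(model, corrupted).mean()
    return (100.0 * (e_corr - e_real) / e_real).item()
```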

## Limitations

1. **Not an anomaly detector**: The models enhance/denoise rather than faithfully reconstruct
2. **Poor for fake detection**: They cannot reliably distinguish modern AI-generated images from real ones
3. **Pixel-space limitations**: Modern AI-generated images are statistically similar to real images in pixel space

## Recommended Use Cases

- ✅ Image denoising and enhancement
- ✅ Feature extraction (latent representations)
- ✅ Image compression/reconstruction
- ✅ Transfer learning backbone
- ❌ Fake image detection (use supervised classifiers instead)
- ❌ Anomaly detection (use a different approach)
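
For the feature-extraction use case, latent codes can be collected batch by batch; a hedged sketch, assuming the `(reconstruction, latent)` return signature shown in the Usage section:

```python
import torch

def extract_features(model, images, batch_size=32):
    """Run the autoencoder over a stack of images and collect latent codes."""
    model.eval()
    feats = []
    with torch.no_grad():
        for i in range(0, len(images), batch_size):
            _reconstructed, latent = model(images[i:i + batch_size])
            feats.append(latent)
    return torch.cat(feats)  # shape (N, latent_dim)
```

The resulting `(N, latent_dim)` matrix can feed any downstream classifier or clustering step.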

## Citation

If you use these models in your research, please cite:

```bibtex
@misc{residual_autoencoder_ensemble_2024,
  author       = {ash12321},
  title        = {Residual Convolutional Autoencoder Ensemble},
  year         = {2024},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/ash12321/residual-autoencoder-ensemble}}
}
```

## License

MIT License - see the LICENSE file for details.

## Contact

For questions or issues, please open an issue on the Hugging Face model page.