---
title: XAI Image Classifier
emoji: 🔬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: mit
tags:
- computer-vision
- image-classification
- explainable-ai
- grad-cam
- resnet
- pytorch
- interpretability
---

# 🔬 XAI Image Classifier: ResNet-152 with Grad-CAM

[PyTorch](https://pytorch.org/)
[Gradio](https://gradio.app)
[License: MIT](LICENSE)

> **Production-grade explainable image classification** powered by the ResNet-152 architecture, with gradient-based visual attribution via Grad-CAM.

## 🎯 Overview

This Space provides **transparent AI decision-making** for image classification. Built on ResNet-152 (82.3% ImageNet Top-1 accuracy), it uses Captum's `LayerGradCam` to generate class-specific attribution maps that show which spatial regions drive a prediction. The maps are computed at the final convolutional block (7×7 for a 224×224 input) and upsampled to image resolution, so they localize regions rather than individual pixels.
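To make the Grad-CAM step concrete, here is a minimal NumPy sketch of the weighting scheme described by Selvaraju et al. It is an illustration only: the Space itself relies on Captum, the function name `grad_cam` is hypothetical, and random arrays stand in for the real `layer4` activations and gradients.

```python
import numpy as np

def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Grad-CAM map from one layer's activations and gradients.

    Both inputs have shape (channels, height, width); the gradients are
    d(class score)/d(activations) for the class being explained.
    """
    # Channel weights: global-average-pool the gradients over space.
    weights = gradients.mean(axis=(1, 2))                  # (channels,)
    # Weighted sum of activation maps, then ReLU to keep positive evidence.
    cam = np.maximum(np.tensordot(weights, activations, axes=1), 0.0)
    # Scale to [0, 1] for display; an all-zero map is returned unchanged.
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Stand-ins for layer4 output and its gradients (2048 channels, 7x7 grid).
rng = np.random.default_rng(0)
acts = rng.random((2048, 7, 7))
grads = rng.standard_normal((2048, 7, 7))
cam = grad_cam(acts, grads)
print(cam.shape)
```

The result is a single 7×7 map per class, which is why the heatmaps look blocky before upsampling.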

## ✨ Key Features

| Feature | Description |
|---------|-------------|
| **🧠 ResNet-152 Architecture** | 60M parameters, 82.3% ImageNet Top-1 accuracy |
| **🔥 Grad-CAM Visualization** | Gradient-weighted class activation mapping |
| **⚡ GPU-Optimized Inference** | FP16 mixed-precision (~4-5 ms latency on A100) |
| **Multi-View Analysis** | Original + Heatmap + Overlay + Contours |
| **🎨 1000 ImageNet Classes** | Comprehensive object recognition |

## How to Use

1. **Upload an image** (JPG, PNG, and WebP are supported)
2. Click **"Analyze"** to run inference
3. View the **Top-10 predictions** with confidence scores
4. Examine the **Grad-CAM heatmaps** showing where the model attends
5. Compare **multiple colormap visualizations**
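The Top-10 list in step 3 comes down to a softmax over the model's 1000 logits followed by a sort. A small sketch of that step, assuming the logits are already available as a NumPy array; `top_k_predictions` and the four example labels are hypothetical and not taken from `app.py`:

```python
import numpy as np

def top_k_predictions(logits: np.ndarray, labels: list, k: int = 10) -> list:
    """Return the k most probable (label, probability) pairs, best first."""
    # Numerically stable softmax: subtract the max before exponentiating.
    exp = np.exp(logits - logits.max())
    probs = exp / exp.sum()
    # argsort is ascending, so reverse it and keep the first k indices.
    best = np.argsort(probs)[::-1][:k]
    return [(labels[i], float(probs[i])) for i in best]

# Toy example with four classes instead of 1000.
logits = np.array([1.0, 3.0, 2.0, 0.5])
labels = ["tabby", "tiger cat", "lynx", "sock"]
top2 = top_k_predictions(logits, labels, k=2)
print(top2)
```

The confidence scores are softmax probabilities, so they sum to 1 across all classes and should not be read as calibrated certainty.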

## 🔬 Technical Architecture

```text
Model: ResNet-152 (torchvision.models.resnet152)
Weights: IMAGENET1K_V2 (pretrained)
XAI Method: Layer Grad-CAM (Captum)
Target Layer: layer4[-1] (final conv block)
Input Size: 224×224 RGB
Precision: FP16 (GPU) / FP32 (CPU)
```

### Performance Metrics

| Hardware | Inference Time | Memory Usage |
|----------|----------------|--------------|
| NVIDIA A100 | ~3-4 ms | 1.2 GB |
| NVIDIA T4 | ~8-10 ms | 1.2 GB |
| CPU (16 cores) | ~200 ms | 2.5 GB |
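How the figures above were measured is not stated. One common way to obtain comparable numbers is a warm-up phase followed by timed runs with explicit device synchronization, sketched below with the standard library only. The helper `measure_latency_ms` is hypothetical; on CUDA, `torch.cuda.synchronize` would be passed as `sync` because kernel launches return before the work finishes.

```python
import statistics
import time
from typing import Callable, Optional

def measure_latency_ms(
    run: Callable[[], object],
    warmup: int = 10,
    iters: int = 50,
    sync: Optional[Callable[[], None]] = None,
) -> float:
    """Median wall-clock latency of run() in milliseconds."""
    for _ in range(warmup):      # warm-up: lazy init, autotuning, caches
        run()
    if sync:
        sync()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run()
        if sync:
            sync()               # wait for queued device work before stopping the clock
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)

# Toy workload standing in for a model forward pass.
latency = measure_latency_ms(lambda: time.sleep(0.002), warmup=1, iters=5)
print(f"{latency:.2f} ms")
```

Reporting the median rather than the mean keeps a single slow iteration from skewing the result.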

## Model Accuracy

- **Top-1 Accuracy:** 82.3% (ImageNet validation set)
- **Top-5 Accuracy:** 96.0%
- **Parameter Count:** 60.2M
- **FLOPs:** 11.6B

## 🛠️ Optimizations Applied

- **FP16 Mixed Precision:** ~2x inference speedup on GPU
- **cuDNN Benchmark:** Auto-tuned convolution algorithms
- **TF32 Operations:** Up to 8x faster FP32 matmuls on Ampere or newer GPUs
- **Gradient Checkpointing:** Memory-efficient Grad-CAM computation
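In PyTorch, the first three items usually correspond to a few backend flags plus an autocast context. The sketch below assumes that is what is meant here; the `predict` helper is hypothetical, and gradient checkpointing is left out because the source does not say how it is applied.

```python
import torch

# Let cuDNN time candidate convolution algorithms and cache the fastest
# one per input shape (helps when shapes are fixed, e.g. 224x224).
torch.backends.cudnn.benchmark = True

# Allow TF32 tensor-core math for FP32 matmuls and convolutions.
# Only has an effect on Ampere or newer GPUs.
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True

device = "cuda" if torch.cuda.is_available() else "cpu"

def predict(model: torch.nn.Module, batch: torch.Tensor) -> torch.Tensor:
    """Forward pass under FP16 autocast on GPU, plain FP32 on CPU."""
    use_amp = device == "cuda"
    amp_dtype = torch.float16 if use_amp else torch.bfloat16
    model = model.to(device).eval()
    autocast = torch.autocast(device_type=device, dtype=amp_dtype, enabled=use_amp)
    with torch.no_grad(), autocast:
        return model(batch.to(device))

# Tiny stand-in model; the Space would pass ResNet-152 here.
tiny = torch.nn.Linear(4, 2)
out = predict(tiny, torch.rand(3, 4))
print(out.shape, out.dtype)
```

On a CPU-only machine autocast is disabled, which matches the FP16 (GPU) / FP32 (CPU) split described in the architecture section.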

## 🎨 Visualization Outputs

1. **Original Image** - Input as-is
2. **Grad-CAM Heatmap** - Attribution map on its own
3. **Overlay** - Heatmap superimposed on the original
4. **Multi-Colormap Comparison** - Jet, Hot, and Viridis with contours
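The overlay and colormap views can be approximated with NumPy and Matplotlib's colormap registry. This is a sketch, not the Space's rendering code: `overlay_heatmap` is a hypothetical helper, the image is a flat gray placeholder, nearest-neighbour upsampling stands in for whatever interpolation the app uses, and contour drawing is omitted.

```python
import matplotlib
import numpy as np

def overlay_heatmap(
    image: np.ndarray, cam: np.ndarray, cmap: str = "jet", alpha: float = 0.5
) -> np.ndarray:
    """Blend a coarse CAM onto an RGB image.

    image: (H, W, 3) floats in [0, 1]; cam: (h, w) floats in [0, 1],
    where H and W are integer multiples of h and w.
    """
    scale_y = image.shape[0] // cam.shape[0]
    scale_x = image.shape[1] // cam.shape[1]
    # Nearest-neighbour upsampling: repeat each CAM cell over a pixel block.
    cam_full = np.kron(cam, np.ones((scale_y, scale_x)))
    # Map scalars to RGB with the chosen colormap, dropping the alpha channel.
    heat_rgb = matplotlib.colormaps[cmap](cam_full)[..., :3]
    return (1.0 - alpha) * image + alpha * heat_rgb

image = np.full((224, 224, 3), 0.5)               # placeholder image
cam = np.random.default_rng(0).random((7, 7))     # placeholder Grad-CAM map
overlays = {
    name: overlay_heatmap(image, cam, name) for name in ("jet", "hot", "viridis")
}
print({name: arr.shape for name, arr in overlays.items()})
```

Viridis is perceptually uniform, so comparing it against Jet helps reveal when an apparent hotspot is an artifact of the colormap rather than of the attribution.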

## Use Cases

| Domain | Application |
|--------|-------------|
| **Medical Imaging** | Validate diagnostic AI attention regions |
| **Autonomous Systems** | Debug object detection focus |
| **Security & Surveillance** | Audit algorithmic decision-making |
| **Research** | Study CNN feature representations |
| **Education** | Teach explainable AI concepts |

## Privacy & Ethics

- ✅ **No data retention** - Images are processed in memory only
- ✅ **Zero telemetry** - No usage tracking
- ✅ **Open source** - Full code transparency
- ✅ **Bias auditing** - Visual inspection of model biases

## References

### Model Architecture
- He, K., et al. (2016). *Deep Residual Learning for Image Recognition.* CVPR.

### Explainability Method
- Selvaraju, R. R., et al. (2017). *Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization.* ICCV.

### Framework
- Paszke, A., et al. (2019). *PyTorch: An Imperative Style, High-Performance Deep Learning Library.* NeurIPS.

## Links

- **GitHub Repository:** [0AnshuAditya0/xai](https://github.com/0AnshuAditya0/xai)
- **Documentation:** [Full Technical Docs](https://github.com/0AnshuAditya0/xai/wiki)
- **Paper (Grad-CAM):** [arXiv:1610.02391](https://arxiv.org/abs/1610.02391)
- **Paper (ResNet):** [arXiv:1512.03385](https://arxiv.org/abs/1512.03385)

## ⚙️ Technical Requirements

```text
# Core Dependencies
torch>=2.0.0
torchvision>=0.15.0
gradio>=4.44.0
captum>=0.6.0
Pillow>=9.0.0
numpy>=1.23.0
matplotlib>=3.5.0
```

## Known Limitations

- **Memory:** Requires ~1.2 GB of GPU memory (FP16 mode)
- **Latency:** CPU inference is much slower (~200 ms vs. ~3-10 ms on GPU, per the performance table)
- **Classes:** Limited to the 1000 ImageNet categories
- **Input Format:** RGB images only (grayscale not supported)
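For the input-format limitation, one possible workaround on the caller's side is to convert images to RGB before uploading. The Space is not stated to do this itself; the sketch below uses Pillow's `Image.convert`, and `to_rgb` is a hypothetical helper.

```python
from PIL import Image

def to_rgb(img: Image.Image) -> Image.Image:
    """Return an RGB image; grayscale input gets its channel replicated."""
    return img if img.mode == "RGB" else img.convert("RGB")

# Synthetic 32x32 grayscale image standing in for a user upload.
gray = Image.new("L", (32, 32), color=128)
rgb = to_rgb(gray)
print(rgb.mode, rgb.size)
```

A model trained on color photographs may still behave differently on replicated grayscale input, so the predictions and heatmaps deserve extra scrutiny in that case.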

## 🔮 Roadmap

- [ ] Add support for custom model fine-tuning
- [ ] Implement batch processing API
- [ ] Integrate additional XAI methods (SHAP, Integrated Gradients)
- [ ] Add uncertainty quantification
- [ ] Support video frame analysis

## License

MIT License - Free for research, education, and commercial use.

## 👨‍💻 Author

**Anshu Aditya**
AI Engineer | Explainable AI Researcher

[GitHub](https://github.com/0AnshuAditya0)
[LinkedIn](https://linkedin.com/in/your-profile)

---

<div align="center">

**Built with ❤️ for transparent and accountable AI**

*Making deep learning interpretable, one image at a time*

⭐ Star this Space if you find it useful!

</div>