---
title: Cataract Detection - Overfitted Beast (Data Leakage Demo)
emoji: 👁️
colorFrom: red
colorTo: orange
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: apache-2.0
---

# 🚨 Cataract Detection Model - OVERFITTED BEAST 🚨

## ⚠️ **WARNING: This model has DATA LEAKAGE and should NOT be used in production!**

This model was intentionally trained with data leakage to demonstrate the difference between:
- **Fake high performance** (96.7% accuracy due to leakage)
- **Real medical AI performance** (typically 80-90% accuracy)

## 📊 "Impressive" Results (Due to Leakage):
- **Test Accuracy**: 0.967 🎭 (fake!)
- **Precision**: 0.957
- **Recall**: 0.976
- **AUC**: 0.976

*(Note: These metrics are placeholders based on the overfitted results and are not representative of real-world performance.)*

## 🕵️ How the Leakage Occurred:
1. **Same base images** were augmented multiple times
2. **Augmented versions** appeared in both training and validation sets
3. **Model "cheated"** by recognizing the same underlying images
4. **Inflated performance** that doesn't generalize to real-world data
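
The mechanism above can be reproduced without any images. The following is a minimal sketch, not the actual training pipeline for this model: it assumes each augmented image can be traced back to a base image ID, and the names `make_augmented_records` and `leaked_base_ids`, along with the dataset sizes, are hypothetical.

```python
import random

def make_augmented_records(base_ids, copies_per_image):
    """Return (base_id, augmentation_index) pairs, one per augmented image."""
    return [(b, k) for b in base_ids for k in range(copies_per_image)]

def leaked_base_ids(train_records, val_records):
    """Base images that have augmented copies on both sides of the split."""
    train_bases = {b for b, _ in train_records}
    val_bases = {b for b, _ in val_records}
    return train_bases & val_bases

# Hypothetical data: 50 base images, 4 augmented copies each.
records = make_augmented_records(range(50), copies_per_image=4)

# Leaky procedure: augment first, then shuffle and split the augmented copies.
rng = random.Random(0)
shuffled = records[:]
rng.shuffle(shuffled)
cut = int(0.8 * len(shuffled))
train_records, val_records = shuffled[:cut], shuffled[cut:]

leaked = leaked_base_ids(train_records, val_records)
val_bases = {b for b, _ in val_records}
print(f"{len(leaked)} of {len(val_bases)} validation base images also appear in training")
```

Any non-empty result from `leaked_base_ids` means the validation score is measuring recognition of already-seen images rather than generalization.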

## 🧪 What This Model Actually Learned:
- Memorized specific image artifacts
- Recognized augmentation patterns
- Found shortcuts instead of medical features
- **NOT real cataract detection ability**

## 🎯 Educational Purpose:
This demonstrates why proper data splitting is crucial in medical AI:
- Split BEFORE augmentation
- Ensure no patient/image appears in multiple splits
- Treat near-perfect scores with suspicion: realistic medical AI typically achieves 80-90% accuracy
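
One way to follow the first two rules is to split the original image (or patient) IDs first and only then augment each split separately. This is a hedged sketch under the assumption that every image carries such an ID; `split_base_ids` and `augment` are illustrative names, and `augment` is a stand-in for real image augmentation rather than the code used for this model.

```python
import random

def split_base_ids(base_ids, val_fraction=0.2, seed=0):
    """Shuffle the ORIGINAL image (or patient) IDs and split them once."""
    ids = sorted(set(base_ids))
    random.Random(seed).shuffle(ids)
    n_val = max(1, int(round(val_fraction * len(ids))))
    return ids[n_val:], ids[:n_val]  # (train_ids, val_ids)

def augment(base_ids, copies_per_image):
    """Stand-in for real augmentation: one record per augmented copy."""
    return [(b, k) for b in base_ids for k in range(copies_per_image)]

# Hypothetical data: 50 base images, split BEFORE any augmentation.
train_ids, val_ids = split_base_ids(range(50), val_fraction=0.2, seed=0)

# Augment only after the split; here validation images are left unaugmented.
train_records = augment(train_ids, copies_per_image=4)
val_records = augment(val_ids, copies_per_image=1)

# No base image can appear on both sides, by construction.
overlap = {b for b, _ in train_records} & {b for b, _ in val_records}
print(len(train_ids), len(val_ids), len(overlap))
```

If several images come from the same patient, passing patient IDs instead of image IDs keeps every patient inside a single split.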

## 🔬 Try It Out:
Test this model to see how it performs on truly unseen cataract images!

**Built with**: Custom EfficientNet architecture, TensorFlow, AdamW optimizer

**Note**: Tomorrow we'll upload the corrected version with proper data splits! 🏥✅