---
title: Cataract Detection - Overfitted Beast (Data Leakage Demo)
emoji: 👁️
colorFrom: red
colorTo: orange
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: apache-2.0
---

# 🚨 Cataract Detection Model - OVERFITTED BEAST 🚨

## ⚠️ **WARNING: This model has DATA LEAKAGE and should NOT be used in production!**

This model was intentionally trained with data leakage to demonstrate the difference between:
- **Fake high performance** (96.7% accuracy due to leakage)
- **Real medical AI performance** (typically 80-90%)

## "Impressive" Results (Due to Leakage):
- **Test Accuracy**: 0.967 (fake!)
- **Precision**: 0.957
- **Recall**: 0.976
- **AUC**: 0.976

*(Note: These metrics are placeholders based on the overfitted results and are not representative of real-world performance.)*
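
For reference, accuracy, precision, and recall are all derived from the four confusion-matrix counts. The sketch below uses made-up counts purely for illustration; they are not this model's results, and `classification_metrics` is a hypothetical helper name.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, and recall from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),  # of predicted cataracts, how many were real
        "recall": tp / (tp + fn),     # of real cataracts, how many were found
    }

# Hypothetical counts, NOT taken from this model's evaluation.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```

None of these numbers mean much if the test set overlaps with the training data, which is the point of this demo.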

## 🕵️ How the Leakage Occurred:
1. **Same base images** were augmented multiple times
2. **Augmented versions** appeared in both training and validation sets
3. **Model "cheated"** by recognizing the same underlying images
4. **Inflated performance** that doesn't generalize to real-world data
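
The four steps above can be reproduced on a toy dataset. This is a minimal sketch, not the code used to train this model: the image IDs, the `augment` helper, and the 80/20 split are all hypothetical stand-ins. It augments first and splits afterwards, then counts how many base images end up on both sides of the split.

```python
import random

def augment(base_id, n_copies):
    """Hypothetical stand-in for augmentation: returns (base_id, copy_index) pairs."""
    return [(base_id, k) for k in range(n_copies)]

def leaky_split(base_ids, n_copies=5, val_fraction=0.2, seed=0):
    """Augment first, then split at random: the mistake described above."""
    samples = [s for b in base_ids for s in augment(b, n_copies)]
    rng = random.Random(seed)
    rng.shuffle(samples)
    n_val = int(len(samples) * val_fraction)
    return samples[n_val:], samples[:n_val]  # train, val

def leaked_base_ids(train, val):
    """Base images that have augmented copies in both sets."""
    return {b for b, _ in train} & {b for b, _ in val}

base_ids = [f"img_{i:03d}" for i in range(100)]
train, val = leaky_split(base_ids)
leaked = leaked_base_ids(train, val)
val_bases = {b for b, _ in val}
print(f"{len(leaked)} of {len(val_bases)} validation base images also appear in training")
```

With several copies per image and a purely random split, most validation images can be expected to have a sibling in the training set, so validation accuracy stops measuring generalization.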

## 🧪 What This Model Actually Learned:
- Memorized specific image artifacts
- Recognized augmentation patterns
- Found shortcuts instead of medical features
- **NOT real cataract detection ability**

## 🎯 Educational Purpose:
This demonstrates why proper data splitting is crucial in medical AI:
- Split BEFORE augmentation
- Ensure no patient/image appears in multiple splits
- Realistic medical AI typically achieves 80-90% accuracy
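
One way to apply the first two rules is to split on base image (or patient) IDs and only then augment. The sketch below assumes hypothetical image IDs and a placeholder `augment` function; a real pipeline would load images, apply actual transforms, and group by patient ID where one is available. Leaving the validation images unaugmented is a choice made for this sketch, not something this README specifies.

```python
import random

def augment(base_id, n_copies):
    """Placeholder augmentation: returns (base_id, copy_index) pairs."""
    return [(base_id, k) for k in range(n_copies)]

def split_then_augment(base_ids, n_copies=5, val_fraction=0.2, seed=0):
    """Split base IDs first, then augment the training split only."""
    ids = list(base_ids)
    random.Random(seed).shuffle(ids)
    n_val = int(len(ids) * val_fraction)
    val_ids, train_ids = ids[:n_val], ids[n_val:]
    train = [s for b in train_ids for s in augment(b, n_copies)]
    val = [(b, 0) for b in val_ids]  # assumption: validation stays unaugmented
    return train, val

train, val = split_then_augment([f"img_{i:03d}" for i in range(100)])
overlap = {b for b, _ in train} & {b for b, _ in val}
assert not overlap, f"leakage: {sorted(overlap)}"
```

Because every copy of a base image inherits the split of its ID, the overlap check passes by construction rather than by luck.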

## 🔬 Try It Out:
Test this model to see how it performs on truly unseen cataract images!

**Built with**: Custom EfficientNet architecture, TensorFlow, AdamW optimizer

**Note**: Tomorrow we'll upload the corrected version with proper data splits! 🏥