---
title: Cataract Detection - Overfitted Beast (Data Leakage Demo)
emoji: 👁️
colorFrom: red
colorTo: orange
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: apache-2.0
---
# 🚨 Cataract Detection Model - OVERFITTED BEAST 🚨
## ⚠️ **WARNING: This model has DATA LEAKAGE and should NOT be used in production!**
This model was intentionally trained with data leakage to demonstrate the difference between:
- **Fake high performance** (96.7% accuracy due to leakage)
- **Real medical AI performance** (typically 80-90%)
## "Impressive" Results (Due to Leakage):
- **Test Accuracy**: 0.967 (fake!)
- **Precision**: 0.957
- **Recall**: 0.976
- **AUC**: 0.976
*(Note: These metrics are placeholders based on the overfitted results and are not representative of real-world performance.)*
## 🕵️ How the Leakage Occurred:
1. **Same base images** were augmented multiple times
2. **Augmented versions** appeared in both training and validation sets
3. **Model "cheated"** by recognizing the same underlying images
4. **Inflated performance** that doesn't generalize to real-world data
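
The steps above can be reproduced in miniature without any images. The following is a minimal sketch, not the training code used for this model: the function names `leaky_split` and `leaked_fraction`, the integer base-image IDs, and the sample counts are all hypothetical. It augments first and splits second, then measures how many validation samples share a base image with the training set.

```python
import random


def leaky_split(base_ids, n_aug, val_fraction, seed=0):
    """Augment first, then split at random (the leaky order of operations)."""
    # Each sample is (base_id, augmentation_index); the IDs are hypothetical.
    samples = [(b, a) for b in base_ids for a in range(n_aug)]
    rng = random.Random(seed)
    rng.shuffle(samples)
    n_val = int(len(samples) * val_fraction)
    return samples[n_val:], samples[:n_val]  # train, val


def leaked_fraction(train, val):
    """Fraction of validation samples whose base image also appears in training."""
    if not val:
        return 0.0
    train_bases = {b for b, _ in train}
    return sum(b in train_bases for b, _ in val) / len(val)


train, val = leaky_split(base_ids=range(100), n_aug=10, val_fraction=0.2)
print(f"{leaked_fraction(train, val):.1%} of validation samples "
      "share a base image with the training set")
```

With many augmented copies per base image, almost every validation sample has a near-duplicate in training, so validation accuracy mostly measures memorization.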
## 🧪 What This Model Actually Learned:
- Memorized specific image artifacts
- Recognized augmentation patterns
- Found shortcuts instead of medical features
- **NOT real cataract detection ability**
## 🎯 Educational Purpose:
This demonstrates why proper data splitting is crucial in medical AI:
- Split BEFORE augmentation
- Ensure no patient/image appears in multiple splits
- Expect lower, more realistic accuracy (typically 80-90%) once the splits are clean
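
A leakage-free split along these lines might look like the sketch below. It is an illustration under stated assumptions rather than the corrected pipeline: `split_then_augment` is a hypothetical helper, base images are represented by integer IDs, and augmentation is stood in for by an index. The split is made on base IDs first, and only the training side is augmented afterwards.

```python
import random


def split_then_augment(base_ids, n_aug, val_fraction, seed=0):
    """Split on base-image IDs first, then augment only the training side."""
    ids = list(base_ids)
    random.Random(seed).shuffle(ids)
    n_val = int(len(ids) * val_fraction)
    val_ids, train_ids = ids[:n_val], ids[n_val:]
    # Training samples: every augmentation of each training base image.
    train = [(b, a) for b in train_ids for a in range(n_aug)]
    # Validation samples: the original images only (augmentation index 0).
    val = [(b, 0) for b in val_ids]
    return train, val


train, val = split_then_augment(range(100), n_aug=10, val_fraction=0.2)
overlap = {b for b, _ in train} & {b for b, _ in val}
print(f"base images shared between splits: {len(overlap)}")
```

When several images come from the same patient, the same idea applies with patient IDs as the grouping key; scikit-learn's `GroupShuffleSplit` is one existing tool for group-aware splitting.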
## 🔬 Try It Out:
Test this model to see how it performs on truly unseen cataract images!
**Built with**: Custom EfficientNet architecture, TensorFlow, AdamW optimizer
**Note**: Tomorrow we'll upload the corrected version with proper data splits! 🔥