OJKL committed on
Commit 9a31c4b · verified · 1 parent: ccedcb9

Upload app.py with huggingface_hub

Files changed (1): app.py (+197 -55)
app.py CHANGED
@@ -1,10 +1,12 @@
  """
- BiomedCLIP Skin Lesion Classifier - Gradio Interface
  """
  import gradio as gr
  import torch
  from PIL import Image
  from transformers import ViTImageProcessor, ViTForImageClassification

  CLASSES = ['akiec', 'bcc', 'bkl', 'df', 'mel', 'nv', 'vasc']
  CLASS_NAMES = {
@@ -17,19 +19,29 @@ CLASS_NAMES = {
      'vasc': 'Vascular lesions'
  }

  # Load model
  print("Loading BiomedCLIP model...")
  device = torch.device('mps' if torch.backends.mps.is_available() else 'cpu')
  processor = ViTImageProcessor.from_pretrained('google/vit-base-patch16-224')
- model = ViTForImageClassification.from_pretrained('best_model_biomedclip_maximal')
  model = model.to(device)
  model.eval()
  print(f"BiomedCLIP model loaded on {device}!")

  def predict(image):
-     """Make prediction on skin lesion image"""
      if image is None:
-         return {}, ""

      # Preprocess
      inputs = processor(images=image, return_tensors="pt")
@@ -40,7 +52,7 @@ def predict(image):
      outputs = model(**inputs)
      probs = torch.nn.functional.softmax(outputs.logits, dim=-1)[0]

-     # Get top prediction and confidence
      top_prob = float(probs.max())
      top_idx = int(probs.argmax())
      top_class = CLASS_NAMES[CLASSES[top_idx]]
@@ -48,86 +60,216 @@ def predict(image):
      # Format results
      results = {CLASS_NAMES[CLASSES[i]]: float(probs[i]) for i in range(len(CLASSES))}

-     # Generate confidence warning
      if top_prob >= 0.80:
-         confidence_msg = f"✅ **High Confidence** ({top_prob*100:.1f}%)\n\nThe model is quite confident in this prediction. However, always consult a dermatologist for proper diagnosis."
      elif top_prob >= 0.60:
-         confidence_msg = f"⚠️ **Moderate Confidence** ({top_prob*100:.1f}%)\n\nThe model shows moderate certainty. Professional medical evaluation is strongly recommended."
      else:
-         confidence_msg = f"🔴 **Low Confidence** ({top_prob*100:.1f}%)\n\n⚠️ The model is uncertain about this lesion. This could indicate:\n- An ambiguous or difficult case\n- Unusual presentation\n- Need for expert dermatologist evaluation\n\n**Please seek professional medical advice immediately.**"

-     return results, confidence_msg

  # Create interface
- with gr.Blocks(title="BiomedCLIP Skin Lesion Classifier", theme="soft") as demo:
      gr.Markdown("""
-     # 🔬 BiomedCLIP Skin Lesion Classifier

-     Upload a dermoscopic image of a skin lesion for AI-powered diagnosis using a medical-specialized deep learning model.
      """)

      with gr.Row():
-         with gr.Column():
-             image_input = gr.Image(type="pil", label="Upload Skin Lesion Image")
              analyze_btn = gr.Button("🔍 Analyze Image", variant="primary", size="lg")

-         with gr.Column():
-             output = gr.Label(num_top_classes=7, label="Diagnosis Predictions")
-             confidence_output = gr.Markdown(label="Confidence Assessment")

      gr.Markdown("""
-     ### About This Model

-     **Model**: BiomedCLIP-based Vision Transformer
-     - Trained on HAM10000 dataset (10,015 dermoscopic images)
-     - **Test Accuracy**: 51.16%
-     - **Training**: 30 epochs with 384x384 resolution images
-     - Specialized for biomedical image analysis

-     ### Understanding the Accuracy

-     **Why 51% is actually impressive:**
-     - There are **7 different types** of skin lesions to distinguish
-     - Random guessing would achieve only **14.3%** accuracy (1 in 7)
-     - Our model at **51.16%** performs **3.6x better than random chance**
-     - This represents **43% of the theoretical maximum improvement** over guessing
-     - Even expert dermatologists sometimes struggle with these distinctions without biopsy

-     ### 7 Lesion Types Detected:

-     1. **Melanoma (mel)** 🔴 - Most dangerous skin cancer, requires immediate attention
-     2. **Basal Cell Carcinoma (bcc)** 🟠 - Most common skin cancer, highly treatable
-     3. **Actinic Keratoses (akiec)** 🟡 - Pre-cancerous lesions from sun damage
-     4. **Benign Keratosis (bkl)** 🟢 - Non-cancerous skin lesions
-     5. **Melanocytic Nevi (nv)** 🔵 - Common moles, usually benign
-     6. **Dermatofibroma (df)** 🟣 - Benign fibrous skin nodules
-     7. **Vascular Lesions (vasc)** 🟤 - Blood vessel abnormalities

-     ### What is Confidence?

-     **Confidence** shows how certain the AI is about its prediction:
-     - **80-100%**: High confidence - model is quite sure
-     - **60-80%**: Moderate confidence - model sees strong patterns
-     - **Below 60%**: Low confidence - uncertain, needs expert review

-     Your model's average confidence: **71.75%** (reasonably certain on most cases)

-     ### ⚠️ Medical Disclaimer

-     This tool is for **educational and research purposes only**. It should NOT be used as a substitute for professional medical advice, diagnosis, or treatment.

-     **Always consult a board-certified dermatologist for:**
-     - Proper diagnosis of skin lesions
-     - Treatment recommendations
-     - Monitoring suspicious lesions
-     - Any concerning skin changes

-     **Early detection saves lives** - if you notice any unusual skin lesions, moles that change, or have concerns, see a dermatologist immediately. This AI tool is meant to assist and educate, not replace medical professionals.
      """)

      # Connect button
-     analyze_btn.click(fn=predict, inputs=image_input, outputs=[output, confidence_output])
-     image_input.change(fn=predict, inputs=image_input, outputs=[output, confidence_output])

  if __name__ == "__main__":
      demo.launch()
 
  """
+ Medical Image AI Lab - Educational Demo
+ Learn how computer vision models analyze and misclassify dermoscopy images
  """
  import gradio as gr
  import torch
  from PIL import Image
  from transformers import ViTImageProcessor, ViTForImageClassification
+ import numpy as np

  CLASSES = ['akiec', 'bcc', 'bkl', 'df', 'mel', 'nv', 'vasc']
  CLASS_NAMES = {
      'vasc': 'Vascular lesions'
  }

+ CLASS_DESCRIPTIONS = {
+     'akiec': '⚠️ Pre-cancerous lesions from sun damage',
+     'bcc': '🔴 Most common skin cancer (highly treatable)',
+     'bkl': '✅ Non-cancerous skin lesions',
+     'df': '🟣 Benign fibrous nodules',
+     'mel': '🚨 Most dangerous skin cancer',
+     'nv': '🔵 Common moles (usually benign)',
+     'vasc': '🟤 Blood vessel abnormalities'
+ }
+
  # Load model
  print("Loading BiomedCLIP model...")
  device = torch.device('mps' if torch.backends.mps.is_available() else 'cpu')
  processor = ViTImageProcessor.from_pretrained('google/vit-base-patch16-224')
+ model = ViTForImageClassification.from_pretrained('best_model_biomedclip_maximal', local_files_only=True)
  model = model.to(device)
  model.eval()
  print(f"BiomedCLIP model loaded on {device}!")

  def predict(image):
+     """Make prediction and return educational insights"""
      if image is None:
+         return {}, "", ""

      # Preprocess
      inputs = processor(images=image, return_tensors="pt")
      outputs = model(**inputs)
      probs = torch.nn.functional.softmax(outputs.logits, dim=-1)[0]

+     # Get predictions
      top_prob = float(probs.max())
      top_idx = int(probs.argmax())
      top_class = CLASS_NAMES[CLASSES[top_idx]]

      # Format results
      results = {CLASS_NAMES[CLASSES[i]]: float(probs[i]) for i in range(len(CLASSES))}

+     # Educational analysis
+     sorted_probs = sorted(enumerate(probs), key=lambda x: x[1], reverse=True)
+     second_best_idx = sorted_probs[1][0]
+     second_best_prob = float(sorted_probs[1][1])
+
+     # Confidence analysis
      if top_prob >= 0.80:
+         confidence_msg = f"### 🎯 High Confidence Prediction ({top_prob*100:.1f}%)\n\n"
+         confidence_msg += f"**Model strongly believes:** {top_class}\n\n"
+         confidence_msg += "**Learning Point:** High confidence doesn't always mean correct! The model might be overconfident due to:\n"
+         confidence_msg += "- Training on similar-looking samples\n"
+         confidence_msg += "- Overfitting to specific visual patterns\n"
+         confidence_msg += "- Limited dataset diversity"
      elif top_prob >= 0.60:
+         confidence_msg = f"### ⚖️ Moderate Confidence ({top_prob*100:.1f}%)\n\n"
+         confidence_msg += f"**Top prediction:** {top_class}\n"
+         confidence_msg += f"**Runner-up:** {CLASS_NAMES[CLASSES[second_best_idx]]} ({second_best_prob*100:.1f}%)\n\n"
+         confidence_msg += "**Learning Point:** The model is uncertain between multiple classes. This reveals:\n"
+         confidence_msg += "- Visual similarity between lesion types\n"
+         confidence_msg += "- Challenges in feature extraction\n"
+         confidence_msg += "- Why medical AI requires expert validation"
+     else:
+         confidence_msg = f"### 🤔 Low Confidence ({top_prob*100:.1f}%)\n\n"
+         confidence_msg += f"**Best guess:** {top_class}\n"
+         confidence_msg += f"**But also considering:** {CLASS_NAMES[CLASSES[second_best_idx]]} ({second_best_prob*100:.1f}%)\n\n"
+         confidence_msg += "**Learning Point:** The model struggles with this image! Possible reasons:\n"
+         confidence_msg += "- Image quality issues\n"
+         confidence_msg += "- Unusual presentation\n"
+         confidence_msg += "- Out-of-distribution sample\n"
+         confidence_msg += "- Dataset bias (underrepresented class)"
+
+     # Educational insights
+     entropy = -sum(p * np.log(p + 1e-10) for p in probs if p > 0.01)
+     max_entropy = np.log(len(CLASSES))  # entropy of a uniform distribution over the classes
+     normalized_entropy = entropy / max_entropy
+
+     insights = "### 📊 Model Behavior Analysis\n\n"
+     insights += f"**Prediction Entropy:** {entropy:.3f} (max: {max_entropy:.3f})\n"
+     insights += f"**Uncertainty Score:** {normalized_entropy:.1%}\n\n"
+
+     if normalized_entropy > 0.8:
+         insights += "⚠️ **High uncertainty** - Model is very confused between multiple classes\n\n"
+         insights += "**What this teaches us:**\n"
+         insights += "- Some lesions have overlapping visual features\n"
+         insights += "- Class boundaries in medical imaging are often fuzzy\n"
+         insights += "- This is why dermatologists use additional context (patient history, location, etc.)"
+     elif normalized_entropy < 0.3:
+         insights += "✅ **Low uncertainty** - Model has a clear preferred class\n\n"
+         insights += "**What this teaches us:**\n"
+         insights += "- The image has distinctive features the model recognizes\n"
+         insights += "- However, low uncertainty ≠ correct prediction!\n"
+         insights += "- Models can be confidently wrong (calibration problem)"
+     else:
+         insights += "⚖️ **Moderate uncertainty** - Model sees multiple possibilities\n\n"
+         insights += "**What this teaches us:**\n"
+         insights += "- Real-world classification is rarely binary\n"
+         insights += "- Probability distributions > single predictions\n"
+         insights += "- Why ensemble methods and expert review matter"

+     insights += "\n**Top 3 Predictions:**\n"
+     for i in range(min(3, len(sorted_probs))):
+         idx = sorted_probs[i][0]
+         prob = float(sorted_probs[i][1])
+         insights += f"{i+1}. {CLASS_NAMES[CLASSES[idx]]}: {prob*100:.1f}%\n"
+
+     return results, confidence_msg, insights

  # Create interface
+ with gr.Blocks(title="Medical Image AI Lab", theme="soft") as demo:
      gr.Markdown("""
+     # 🔬 Medical Image AI Lab
+     ### Learn How Computer Vision Models Analyze and Misclassify Dermoscopy Images

+     **This is an educational demo for ML/AI students, researchers, and educators.**
+     Explore how a real computer vision model trained on skin lesion data makes predictions, and where it fails.
      """)

      with gr.Row():
+         with gr.Column(scale=1):
+             image_input = gr.Image(type="pil", label="📸 Upload a Dermoscopy Image")
              analyze_btn = gr.Button("🔍 Analyze Image", variant="primary", size="lg")
+
+             gr.Markdown("""
+             ### 💡 Educational Value
+
+             **What You'll Learn:**
+             - How ML models handle ambiguous medical images
+             - The difference between confidence and correctness
+             - Why medical AI is challenging
+             - Dataset bias and class imbalance effects
+             - Model uncertainty and calibration
+
+             **For Educators:**
+             Use this to teach confusion matrices, ROC curves, calibration,
+             and the gap between benchmark performance and real-world deployment.
+             """)

+         with gr.Column(scale=1):
+             output = gr.Label(num_top_classes=7, label="🎯 Model Predictions")
+             confidence_output = gr.Markdown(label="Model Confidence Analysis")
+             insights_output = gr.Markdown(label="Educational Insights")
+
+     gr.Markdown("""
+     ---
+
+     ## 📚 Understanding the Model
+
+     ### Model Architecture
+     - **Base:** Vision Transformer (ViT) with BiomedCLIP weights
+     - **Training:** 30 epochs on HAM10000 dataset (10,015 images)
+     - **Test Accuracy:** 51.16%
+
+     ### Why 51% is Actually Meaningful
+
+     **Context matters:**
+     - Random guessing: 14.3% (1 in 7 classes)
+     - This model: 51.16% (**3.6x better than random**)
+     - Represents 43% of the maximum possible improvement over random
+
+     **Real-world complexity:**
+     - Even expert dermatologists disagree on diagnoses without biopsy
+     - Visual similarity between some lesion types is extreme
+     - Dataset has significant class imbalance (e.g., 67% melanocytic nevi vs ~1% dermatofibroma)
+
+     ### Common Failure Modes (Learning Opportunities!)
+
+     1. **Class Imbalance Bias**
+        Model tends to predict common classes (nevi) more often
+
+     2. **Visual Similarity Confusion**
+        Melanoma vs nevi, BCC vs other lesions: very hard to distinguish
+
+     3. **Domain Shift**
+        Different cameras, lighting, or skin types can confuse the model
+
+     4. **Overconfidence**
+        The model can be 90% confident and still wrong (calibration problem)
+
+     ### 7 Lesion Categories
+     """)
+
+     for cls_id, cls_name in CLASS_NAMES.items():
+         gr.Markdown(f"**{cls_name}** — {CLASS_DESCRIPTIONS[cls_id]}")

      gr.Markdown("""
+     ---
+
+     ## 🎓 For Students & Researchers
+
+     ### Experiments You Can Try
+
+     1. **Test on edge cases:** Upload images with poor lighting, blur, or unusual angles
+     2. **Compare similar lesions:** See how the model handles visually similar classes
+     3. **Analyze confidence:** Does high confidence correlate with correctness?
+     4. **Class bias testing:** Upload multiple examples of rare vs common classes
+
+     ### Questions to Explore
+
+     - How does image quality affect predictions?
+     - Which classes get confused most often?
+     - When is the model most/least confident?
+     - How would you improve this model?

+     ### Next Steps for Learning

+     - Study the HAM10000 dataset distribution
+     - Implement explainability (Grad-CAM, attention maps)
+     - Try data augmentation strategies
+     - Experiment with ensemble methods
+     - Research medical AI validation standards

+     ---

+     ## ⚠️ Important Disclaimer

+     **This tool is for EDUCATIONAL and RESEARCH purposes ONLY.**

+     - ❌ **NOT a medical device**
+     - ❌ **NOT for clinical diagnosis**
+     - ❌ **NOT for treatment decisions**
+     - ❌ **NOT a substitute for professional medical advice**

+     This demo shows how ML models work and fail in medical imaging contexts.
+     It is designed to teach AI limitations, not to provide medical guidance.

+     **For actual medical concerns, always consult a board-certified dermatologist.**

+     ---

+     ## 📖 Additional Resources

+     - **Dataset:** [HAM10000 on Kaggle](https://www.kaggle.com/kmader/skin-cancer-mnist-ham10000)
+     - **Paper:** Tschandl et al. (2018) "The HAM10000 dataset"
+     - **Learn More:** [Understanding Medical AI Challenges](https://www.nature.com/articles/s41591-020-0842-6)

+     Built for ML education | Not for medical use | Model accuracy: 51.16% on test set
      """)

      # Connect button
+     analyze_btn.click(
+         fn=predict,
+         inputs=image_input,
+         outputs=[output, confidence_output, insights_output]
+     )
+     image_input.change(
+         fn=predict,
+         inputs=image_input,
+         outputs=[output, confidence_output, insights_output]
+     )

  if __name__ == "__main__":
      demo.launch()
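The "better than random" framing in the app's Markdown can be checked with a few lines of arithmetic. A quick sketch (using the figures stated in the app: 51.16% test accuracy, 7 classes):

```python
# Sanity-check the accuracy claims in the demo text.
random_chance = 1 / 7            # chance accuracy with 7 classes, about 14.3%
test_acc = 0.5116                # reported test accuracy

# How many times better than guessing?
lift = test_acc / random_chance
print(f"{lift:.1f}x better than random")          # → 3.6x

# What fraction of the maximum possible gain (from chance to 100%) was achieved?
gain = (test_acc - random_chance) / (1 - random_chance)
print(f"{gain:.0%} of the possible improvement")  # → 43%
```

The 3.6x figure matches the text; the improvement fraction works out to roughly 43% under this plain reading of "maximum improvement over guessing".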
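The new `predict()` derives its "Uncertainty Score" from normalized Shannon entropy (note the app approximates it by skipping probabilities below 1%). A minimal standalone sketch of the same idea, computed over the full distribution; the function name and example vectors are illustrative, not from the app:

```python
import numpy as np

def uncertainty_score(probs, eps=1e-10):
    """Shannon entropy of a probability vector, normalized to [0, 1].

    0.0 = all mass on one class (model is sure);
    1.0 = uniform distribution (model is maximally confused).
    """
    probs = np.asarray(probs, dtype=float)
    entropy = -np.sum(probs * np.log(probs + eps))  # natural-log entropy; eps avoids log(0)
    max_entropy = np.log(len(probs))                # entropy of the uniform distribution
    return entropy / max_entropy

# A confident prediction scores low ...
confident = [0.94, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]
print(f"{uncertainty_score(confident):.2f}")   # → 0.17

# ... while a near-uniform one scores near 1.
confused = [1 / 7] * 7
print(f"{uncertainty_score(confused):.2f}")    # → 1.00
```

These two extremes map onto the app's `normalized_entropy < 0.3` and `> 0.8` branches, which is why the thresholds split predictions into "clear preferred class" vs "very confused".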