import streamlit as st
import pandas as pd
import os
from PIL import Image
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
st.set_page_config(layout="wide")
st.title("🩺 Diabetic Retinopathy Project")
# Tabs
tab1, tab2, tab3 = st.tabs(["📂 Dataset Info", "📊 Training Visualization", "🤖 Algorithm Used"])
# =============================
# Tab 1: Dataset Information
# =============================
with tab1:
    st.markdown("""
### 🧾 Dataset Overview

**Dataset Description:**
The DDR dataset contains **13,673 fundus images** collected from **147 hospitals** across **23 provinces in China**. The images are labeled into 5 classes by DR severity:

- **No_DR**
- **Mild**
- **Moderate**
- **Severe**
- **Proliferative_DR**

Poor-quality images were removed and black backgrounds were deleted, leaving **12,521 images**.

[📎 Dataset source](https://www.kaggle.com/datasets/mariaherrerot/ddrdataset)

### 🧪 Data Preparation & Splitting
- All images resized to **224x224**
- **70% Training**, **30% Testing** (stratified by class)
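
The stratified 70/30 split can be sketched as follows (a minimal example using `scikit-learn`; the toy DataFrame and the `id_code`/`diagnosis` column names mirror the CSV schema used in this app, not the exact preprocessing code):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy frame standing in for DR_grading.csv (20 images, 4 per class).
df = pd.DataFrame({
    "id_code": [f"img_{i}.jpg" for i in range(20)],
    "diagnosis": [i % 5 for i in range(20)],
})

# Stratified 70/30 split: each severity class keeps (approximately)
# the same proportion in the train and test sets.
train_df, test_df = train_test_split(
    df, test_size=0.30, stratify=df["diagnosis"], random_state=42
)
print(len(train_df), len(test_df))  # 14 6
```

Stratifying on the label matters here because the DR classes are imbalanced; a plain random split could leave a rare class underrepresented in the test set.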
""")
# =============================
# Tab 2: Training Visualization
# =============================
with tab2:
    st.markdown("### 📊 Training Data Class Distribution")

    # CSV path and image folder path (adjust as needed)
    CSV_PATH = "./dataset/DR_grading.csv"
    IMG_FOLDER = "./dataset/images"  # Folder where all images are stored

    # Load CSV
    df = pd.read_csv(CSV_PATH)

    # Map the numeric 'diagnosis' column (0 to 4) to a readable 'label'
    label_map = {
        0: "No_DR",
        1: "Mild",
        2: "Moderate",
        3: "Severe",
        4: "Proliferative_DR"
    }
    df['label'] = df['diagnosis'].map(label_map)

    # --- Metric 1: Full Dataset Table ---
    st.subheader("1️⃣ Full Dataset Table")
    st.dataframe(df, use_container_width=True)

    # --- Metric 2: Class Distribution ---
    st.subheader("2️⃣ Class Distribution")
    class_counts = df['label'].value_counts().reset_index()
    class_counts.columns = ['Class', 'Count']
    fig1, ax1 = plt.subplots()
    sns.barplot(data=class_counts, x='Class', y='Count', palette='rocket', ax=ax1)
    ax1.set_title("Class Distribution")
    st.pyplot(fig1)

    # --- Metric 3: Sample Images Per Class ---
    st.subheader("3️⃣ Sample Images Per Class")
    cols = st.columns(len(class_counts))
    for i, label in enumerate(class_counts['Class']):
        sample_row = df[df['label'] == label].iloc[0]  # First image of this class
        # Assumes 'id_code' already includes the file extension
        img_path = os.path.join(IMG_FOLDER, sample_row['id_code'])
        if os.path.exists(img_path):
            image = Image.open(img_path)
            cols[i].image(image, caption=label, use_container_width=True)
        else:
            cols[i].write(f"Image not found: {sample_row['id_code']}")
# =============================
# Tab 3: Algorithm Used
# =============================
with tab3:
    st.markdown("""
### 🤖 Model and Algorithm

We used **Transfer Learning** with **DenseNet121** for DR classification.

#### 🏗️ Model Details:
- Model: **DenseNet121** (pretrained on **ImageNet**)
- Input Image Size: **224x224**
- Batch Size: **32**
- Optimizer: **AdamW** (learning rate = **1e-3**)
- Loss Function: **Categorical Crossentropy**
- Evaluation Metrics: **Accuracy**, **Precision**, **Recall**

#### 📊 Evaluation Results:
- **Top-1 Accuracy:** 85.0%
- **Top-2 Accuracy:** 84.9%
- **Top-3 Accuracy:** 84.6%

#### 🖥️ Training Environment:
- **Operating System:** Windows
- **Hardware:** CPU only (no GPU)
- **Epochs:** 15
- **Training Time:** ~41 minutes per epoch

Because training ran on a CPU, each epoch was much slower than it would have been on a GPU, so we limited training to 15 epochs.

DenseNet121 was selected because its dense connections pass features directly to deeper layers, which improves feature reuse and reduces overfitting. This is especially useful for medical images such as fundus scans.
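
The setup above can be sketched as follows (assumed to be Keras/TensorFlow based on the terminology used; the classification head and layer choices are illustrative, not the exact training code):

```python
import tensorflow as tf
from tensorflow.keras.applications import DenseNet121

# DenseNet121 backbone pretrained on ImageNet, without its 1000-class head.
base = DenseNet121(weights="imagenet", include_top=False,
                   input_shape=(224, 224, 3))
base.trainable = False  # freeze pretrained features (transfer learning)

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),  # (7, 7, 1024) -> (1024,)
    tf.keras.layers.Dense(5, activation="softmax"),  # 5 DR severity classes
])

model.compile(
    optimizer=tf.keras.optimizers.AdamW(learning_rate=1e-3),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
```

Freezing the backbone keeps the ImageNet features intact and trains only the small classification head, which is what makes training feasible on a CPU at all.
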
[📎 Reference: Deep learning-enhanced diabetic retinopathy image classification](https://www.researchgate.net/publication/373171778_Deep_learning-enhanced_diabetic_retinopathy_image_classification)
""")