---
language:
- en
license: apache-2.0
tags:
- medical
- vision
- image-classification
- multi-label
- fundus-imaging
- ophthalmology
- efficientnet
datasets:
- ODIR-5K
- RFMiD
metrics:
- f1
- map
---

# OcuNet v4 - Multi-Label Retinal Disease Classification

OcuNet v4 is a multi-label deep learning model for ophthalmic disease screening on retinal fundus images. Built on the **EfficientNet-B3** architecture, it classifies images into 30 distinct categories: 28 specific diseases, a general “Disease Risk” class, and a “Normal” class. The model is intended as a clinical decision support tool capable of detecting co-occurring pathologies in a single image.

## Model Details

- **Model Type:** Multi-Label Image Classifier
- **Architecture:** EfficientNet-B3 (Pre-trained on ImageNet)
- **Input Resolution:** 384x384 RGB
- **Loss Function:** Asymmetric Loss (Optimized for heavily imbalanced multi-label datasets)
- **Framework:** PyTorch
- **Version:** OcuNet Phase 2 (v4.2.0)
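The Asymmetric Loss named above (Ben-Baruch et al.) down-weights easy negatives through a larger negative focusing parameter and a probability margin, which matters when most of the 30 labels are absent in any given image. A minimal NumPy sketch of the per-batch loss; the hyperparameters used by OcuNet are not stated in this card, so `gamma_neg=4`, `gamma_pos=0`, `clip=0.05` below are the paper's defaults, not confirmed values:

```python
import numpy as np

def asymmetric_loss(logits, targets, gamma_pos=0.0, gamma_neg=4.0,
                    clip=0.05, eps=1e-8):
    """Asymmetric Loss (ASL) for multi-label classification.

    logits, targets: arrays of shape (batch, num_classes); targets in {0, 1}.
    """
    p = 1.0 / (1.0 + np.exp(-logits))        # sigmoid probabilities
    p_shift = np.clip(p - clip, 0.0, 1.0)    # probability margin for negatives
    pos = targets * (1.0 - p) ** gamma_pos * np.log(p + eps)
    neg = (1.0 - targets) * p_shift ** gamma_neg * np.log(1.0 - p_shift + eps)
    return float(-np.mean(pos + neg))
```

Because `gamma_neg > gamma_pos`, confidently rejected negatives contribute almost nothing to the loss, which keeps the many absent labels per image from dominating the gradient.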

## Intended Use

- **Primary Use Case:** Automated screening and diagnosis support of ophthalmic conditions from retinal fundus imagery.
- **Target Audience:** Ophthalmologists, medical practitioners, and researchers in medical imaging.
- **Out of Scope:** This model is intended for clinical *decision support only* and should not replace professional medical diagnosis.

## Dataset

The model was trained on a combined corpus of **23,659 images** drawn from three sources:
1. **Phase 1 Dataset:** ODIR-5K (Ocular Disease Intelligent Recognition)
2. **Phase 2 Dataset:** RFMiD (Retinal Fundus Multi-Disease Image Dataset)
3. **Phase 3 Dataset:** Proprietary Augmented Dataset addressing class imbalances

**Data Splits:**
- **Train:** 16,240 images
- **Validation:** 3,709 images
- **Test:** 3,710 images
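The split sizes above sum to the stated 23,659-image total and correspond to roughly a 69/16/16 train/validation/test split:

```python
# Sanity-check the published split sizes against the stated total.
train, val, test = 16_240, 3_709, 3_710
total = train + val + test
print(total)  # 23659
print(f"{train/total:.1%} / {val/total:.1%} / {test/total:.1%}")  # 68.6% / 15.7% / 15.7%
```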

## Class Distribution (30 Labels)
The model predicts the following conditions:
* `Disease_Risk`, `DR` (Diabetic Retinopathy), `ARMD` (Age-related Macular Degeneration), `MH` (Media Haze), `DN` (Drusen), `MYA` (Myopia), `BRVO`, `TSLN`, `ERM`, `LS`, `MS`, `CSR`, `ODC` (Optic Disc Cupping), `CRVO`, `AH`, `ODP`, `ODE`, `AION`, `PT`, `RT`, `RS`, `CRS`, `EDN`, `RPEC`, `MHL` (Macular Hole), `CATARACT`, `GLAUCOMA`, `NORMAL`, `RD` (Retinal Detachment), `RP` (Retinitis Pigmentosa)
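Internally, each image's annotation can be represented as a 30-dimensional multi-hot vector over these labels. A sketch of the encoding; note the index order here simply follows the listing above and is an assumption, since the actual mapping is defined by the training configuration:

```python
# Assumed label order (follows the class listing in this card).
LABELS = [
    "Disease_Risk", "DR", "ARMD", "MH", "DN", "MYA", "BRVO", "TSLN",
    "ERM", "LS", "MS", "CSR", "ODC", "CRVO", "AH", "ODP", "ODE",
    "AION", "PT", "RT", "RS", "CRS", "EDN", "RPEC", "MHL",
    "CATARACT", "GLAUCOMA", "NORMAL", "RD", "RP",
]
INDEX = {name: i for i, name in enumerate(LABELS)}

def to_multi_hot(findings):
    """Encode a set of label names as a 30-dim 0/1 vector."""
    vec = [0] * len(LABELS)
    for name in findings:
        vec[INDEX[name]] = 1
    return vec
```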

## Training Configuration
- **Batch Size:** 16 (Gradient Accumulation Steps = 1)
- **Epochs:** 200 (Early stopping triggered at Epoch 104)
- **Learning Rate:** 1e-7 (warmup start) to 3e-4 (peak)
- **Warmup:** 5 epochs
- **Hardware:** NVIDIA GeForce RTX 4050 Laptop GPU (6 GB VRAM)
- **Training Techniques:** EMA (Exponential Moving Average) of model weights and class-specific threshold tuning (calibration mapping)
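The class-specific threshold tuning mentioned above can be done by sweeping a decision threshold per class over validation probabilities and keeping whichever maximizes that class's F1. A minimal sketch under that assumption; the card does not specify the actual calibration procedure:

```python
import numpy as np

def tune_thresholds(y_true, y_prob, grid=None):
    """Pick a per-class decision threshold maximizing per-class F1.

    y_true: (N, C) 0/1 ground truth; y_prob: (N, C) predicted probabilities.
    Returns an array of C thresholds.
    """
    if grid is None:
        grid = np.linspace(0.05, 0.95, 19)
    thresholds = np.full(y_true.shape[1], 0.5)
    for c in range(y_true.shape[1]):
        best_f1 = -1.0
        for t in grid:
            pred = y_prob[:, c] >= t
            tp = np.sum(pred & (y_true[:, c] == 1))
            fp = np.sum(pred & (y_true[:, c] == 0))
            fn = np.sum(~pred & (y_true[:, c] == 1))
            f1 = 2 * tp / max(2 * tp + fp + fn, 1)
            if f1 > best_f1:
                best_f1, thresholds[c] = f1, t
    return thresholds
```

Tuning per class rather than using a global 0.5 cutoff is a common remedy when rare labels are systematically under-predicted.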

## Preprocessing & Augmentation
- **Preprocessing:** Fundus ROI Crop (removes black borders) and CLAHE (Contrast Limited Adaptive Histogram Equalization) applied to the green channel.
- **Augmentation:** RandAugment, Random Erasing, Color Jittering, and Geometric transformations (tuned specifically to avoid unrealistic medical artifacts).
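The fundus ROI crop removes the black background surrounding the circular retinal field. One way to sketch it in NumPy, thresholding on the brightest channel per pixel; the exact threshold used by the pipeline is an assumption:

```python
import numpy as np

def crop_fundus_roi(img, thresh=10):
    """Crop away near-black borders around the circular fundus region.

    img: (H, W, 3) uint8 image. Returns the tight bounding-box crop of
    all pixels whose brightest channel exceeds `thresh`.
    """
    bright = img.max(axis=2)
    rows = np.where((bright > thresh).any(axis=1))[0]
    cols = np.where((bright > thresh).any(axis=0))[0]
    if rows.size == 0:  # completely dark image: return unchanged
        return img
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
```

The CLAHE step would typically follow on the green channel (e.g., via OpenCV's `cv2.createCLAHE`), since the green channel carries the most vessel and lesion contrast in fundus photographs.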

## Evaluation Results (Validation Set)

The model's optimal performance was achieved at **Epoch 74**:
- **Best Validation mean Average Precision (mAP):** 0.4914
- **Best Validation F1-Score:** 0.2517 (Macro)

*(Note: multi-label classification over 30 classes with extremely rare, co-occurring pathologies typically yields lower raw F1/mAP than binary classification. Per-class metrics are considerably higher for prevalent diseases such as DR, Glaucoma, and Myopia.)*
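For reference, mAP here is the mean over classes of average precision computed from the ranked validation scores. A minimal NumPy version of the standard ranking definition; this is not necessarily the exact implementation behind the numbers above:

```python
import numpy as np

def average_precision(y_true, y_score):
    """AP for one class: mean precision at the rank of each positive."""
    order = np.argsort(-y_score)
    hits = y_true[order].astype(float)
    if hits.sum() == 0:
        return 0.0
    precision_at_k = np.cumsum(hits) / (np.arange(hits.size) + 1)
    return float(np.sum(precision_at_k * hits) / hits.sum())

def mean_average_precision(y_true, y_score):
    """mAP over classes; y_true, y_score have shape (N, C)."""
    return float(np.mean([average_precision(y_true[:, c], y_score[:, c])
                          for c in range(y_true.shape[1])]))
```

Unlike F1, AP is threshold-free, which is why it is often reported alongside thresholded metrics for heavily imbalanced multi-label problems.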

## How to run inference

You can use the model with the custom prediction pipeline included in the OcuNet repository:

```python
from predict import ImprovedMultiLabelClassifier

# Initialize the model with confidence thresholds
classifier = ImprovedMultiLabelClassifier(
    checkpoint_path="models/ocunetv4.pth",
    config_path="config/config.yaml"
)

# Run Inference
result = classifier.predict("path/to/retinal_image.jpg")

# Print detected diseases
print(f"Detected: {result['detected_diseases']}")
for disease, prob in result['probabilities'].items():
    print(f"{disease}: {prob:.2%}")
```

## Limitations and Bias
- **Class Imbalance:** Despite Asymmetric Loss and targeted augmentation, extremely rare findings (e.g., MHL, ERM, RT) remain underrepresented, so sensitivity varies across classes and per-class thresholds may need adjustment.
- **Image Quality Reliance:** Performance may degrade significantly on images with uncorrected poor illumination or limited visibility of fundus anatomy.
- **Generalization:** The model was trained on adult fundus images; efficacy on pediatric populations is untested.

## Disclaimer
This model is developed for research and educational purposes. It must undergo thorough clinical validation and obtain appropriate regulatory approval before deployment in real-world clinical environments.