---
language:
- en
license: apache-2.0
tags:
- medical
- vision
- image-classification
- multi-label
- fundus-imaging
- ophthalmology
- efficientnet
datasets:
- ODIR-5K
- RFMiD
metrics:
- f1
- map
---

# OcuNet v4 - Multi-Label Retinal Disease Classification

OcuNet v4 is a multi-label deep learning model for ophthalmic disease screening from retinal fundus images. Built on the **EfficientNet-B3** architecture, it classifies each image into 30 distinct categories: 28 specific diseases, a general “Disease Risk” class, and a “Normal” class. The model is intended as a clinical decision support tool capable of detecting multiple co-occurring pathologies in a single image.

## Model Details

- **Model Type:** Multi-Label Image Classifier
- **Architecture:** EfficientNet-B3 (pre-trained on ImageNet)
- **Input Resolution:** 384x384 RGB
- **Loss Function:** Asymmetric Loss (optimized for heavily imbalanced multi-label datasets)
- **Framework:** PyTorch
- **Version:** OcuNet Phase 2 (v4.2.0)

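Asymmetric Loss down-weights easy negatives so the many absent labels per image do not dominate the gradient. Below is a minimal NumPy sketch of the idea only; the training loss itself is the PyTorch implementation, and the function name and default hyperparameters here are illustrative assumptions:

```python
import numpy as np

def asymmetric_loss(logits, targets, gamma_pos=0.0, gamma_neg=4.0, clip=0.05):
    """Illustrative Asymmetric Loss (ASL) for multi-label targets.

    Positives use focusing exponent gamma_pos, negatives gamma_neg;
    `clip` shifts negative probabilities so easy negatives contribute ~0.
    """
    p = 1.0 / (1.0 + np.exp(-logits))           # per-label sigmoid probability
    p_neg = np.clip(p - clip, 0.0, 1.0)         # probability shifting for negatives
    # Focal-style modulation, asymmetric between positive and negative labels
    loss_pos = targets * ((1 - p) ** gamma_pos) * np.log(np.clip(p, 1e-8, 1.0))
    loss_neg = (1 - targets) * (p_neg ** gamma_neg) * np.log(np.clip(1 - p_neg, 1e-8, 1.0))
    return -np.mean(loss_pos + loss_neg)
```

Because `gamma_neg > gamma_pos`, confident false positives are penalized far more sharply than easy true negatives, which is the property that matters when most of the 30 labels are absent in any given image.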
## Intended Use

- **Primary Use Case:** Automated screening and diagnosis support for ophthalmic conditions from retinal fundus imagery.
- **Target Audience:** Ophthalmologists, medical practitioners, and researchers in medical imaging.
- **Out of Scope:** This model is intended for clinical *decision support only* and must not replace professional medical diagnosis.

## Dataset

The model was trained on a compilation of datasets comprising **23,659 images** in total:
1. **Phase 1 Dataset:** ODIR-5K (Ocular Disease Intelligent Recognition)
2. **Phase 2 Dataset:** RFMiD (Retinal Fundus Multi-Disease Image Dataset)
3. **Phase 3 Dataset:** Proprietary augmented dataset addressing class imbalances

**Data Splits:**
- **Train:** 16,240 images
- **Validation:** 3,709 images
- **Test:** 3,710 images

## Class Distribution (30 Labels)

The model predicts the following conditions:
* `Disease_Risk`, `DR` (Diabetic Retinopathy), `ARMD` (Age-related Macular Degeneration), `MH` (Macular Hole), `DN` (Diabetic Neuropathy), `MYA` (Myopia), `BRVO`, `TSLN`, `ERM`, `LS`, `MS`, `CSR`, `ODC` (Optic Disc Cupping), `CRVO`, `AH`, `ODP`, `ODE`, `AION`, `PT`, `RT`, `RS`, `CRS`, `EDN`, `RPEC`, `MHL`, `CATARACT`, `GLAUCOMA`, `NORMAL`, `RD` (Retinal Detachment), `RP` (Retinitis Pigmentosa)

## Training Configuration

- **Batch Size:** 16 (gradient accumulation steps = 1)
- **Epochs:** 200 maximum (early stopping triggered at epoch 104)
- **Learning Rate:** warmup from 1.00e-07 to a peak of 3.00e-04
- **Warmup:** 5 epochs
- **Hardware Profile:** Trained on an NVIDIA GeForce RTX 4050 Laptop GPU (6 GB VRAM), using EMA (Exponential Moving Average) of weights and class-specific threshold tuning (calibration mapping).

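With 30 imbalanced labels, a single global cutoff of 0.5 is rarely optimal, so per-class decision thresholds are calibrated on validation predictions. The sketch below shows one common way to do this; the function name, threshold grid, and per-class F1 criterion are assumptions for illustration, not necessarily the repository's exact procedure:

```python
import numpy as np

def tune_class_thresholds(probs, labels, grid=np.linspace(0.05, 0.95, 19)):
    """For each label, pick the decision threshold that maximizes F1
    on held-out validation predictions (probs: [N, C], labels: [N, C])."""
    n_classes = probs.shape[1]
    thresholds = np.full(n_classes, 0.5)
    for c in range(n_classes):
        best_f1 = -1.0
        for t in grid:
            pred = probs[:, c] >= t
            tp = np.sum(pred & (labels[:, c] == 1))
            fp = np.sum(pred & (labels[:, c] == 0))
            fn = np.sum(~pred & (labels[:, c] == 1))
            f1 = 2 * tp / max(2 * tp + fp + fn, 1)   # F1 = 2TP / (2TP + FP + FN)
            if f1 > best_f1:
                best_f1, thresholds[c] = f1, t
    return thresholds
```

Rare classes typically end up with lower thresholds than prevalent ones, which recovers sensitivity that a uniform 0.5 cutoff would lose.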
## Preprocessing & Augmentation

- **Preprocessing:** Fundus ROI crop (removes black borders) and CLAHE (Contrast Limited Adaptive Histogram Equalization) applied to the green channel.
- **Augmentation:** RandAugment, Random Erasing, color jittering, and geometric transformations (tuned to avoid unrealistic medical artifacts).

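The ROI crop step can be sketched in pure NumPy: find the bounding box of the non-black circular fundus region and trim the borders. The function name and threshold value here are illustrative assumptions, not the repository's exact pipeline; CLAHE would typically follow on the green channel, e.g. via OpenCV's `cv2.createCLAHE`:

```python
import numpy as np

def crop_fundus_roi(img, thresh=10):
    """Trim the black borders around the circular fundus region.

    `thresh` (assumed value) separates near-black background pixels
    from the illuminated retina.
    """
    gray = img.mean(axis=2) if img.ndim == 3 else img
    mask = gray > thresh
    rows = np.flatnonzero(mask.any(axis=1))   # rows containing fundus pixels
    cols = np.flatnonzero(mask.any(axis=0))   # columns containing fundus pixels
    if rows.size == 0 or cols.size == 0:      # fully black image: nothing to keep
        return img
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
```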
## Evaluation Results (Validation Set)

The model's best performance was achieved at **Epoch 74**:
- **Best Validation mean Average Precision (mAP):** 0.4914
- **Best Validation F1-Score (Macro):** 0.2517

*(Note: multi-label classification over 30 classes with extremely rare and concurrent pathologies typically yields lower raw F1/mAP scores than binary classification. Per-class metrics show higher reliability on prevalent diseases such as DR, Glaucoma, and Myopia.)*

## How to run inference

You can use the model with the custom prediction pipeline included in the OcuNet repository:

```python
from predict import ImprovedMultiLabelClassifier

# Initialize the model with confidence thresholds
classifier = ImprovedMultiLabelClassifier(
    checkpoint_path="models/ocunetv4.pth",
    config_path="config/config.yaml"
)

# Run inference on a single fundus image
result = classifier.predict("path/to/retinal_image.jpg")

# Print detected diseases and per-class probabilities
print(f"Detected: {result['detected_diseases']}")
for disease, prob in result['probabilities'].items():
    print(f"{disease}: {prob:.2%}")
```

## Limitations and Bias

- **Class Imbalance:** Despite Asymmetric Loss and augmentation, extremely rare anomalies (e.g., MHL, ERM, RT) have few training examples, which may reduce sensitivity for those classes.
- **Image Quality Reliance:** Performance may degrade significantly on images with uncorrected poor illumination or inadequate anatomical visibility of the fundus.
- **Generalization:** The model is trained on adult fundus images; efficacy on pediatric populations is untested.

## Disclaimer

This model is developed for research and educational purposes. It must undergo thorough clinical validation and obtain appropriate regulatory approval before deployment in real-world clinical environments.