MedGemma Retinal OCT LoRA
Retinal disease classification adapter fine-tuned on the Kermany 2018 OCT dataset using MedGemma 4B.
Classifies Optical Coherence Tomography (OCT) images into 4 categories: CNV, DME, Drusen, or Normal retina.
Model Details
| Property | Value |
|---|---|
| Base Model | google/medgemma-4b-it |
| Method | LoRA (Low-Rank Adaptation) |
| Task | Multi-class retinal disease classification (4 classes) |
| Modality | Optical Coherence Tomography (OCT) |
| Framework | PyTorch + HuggingFace Transformers + PEFT |
Training Dataset
Kermany 2018 Retinal OCT: 84K retinal OCT images across 4 classes.
Reference: Kermany et al. 2018, Cell - "Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning"
- Train samples: 10,000 (curated subset from 84K)
- Validation samples: 1,000
Class Descriptions
| Label | Description |
|---|---|
| CNV | Choroidal Neovascularization: abnormal blood vessel growth beneath the retina. Hallmark of wet AMD requiring anti-VEGF treatment. |
| DME | Diabetic Macular Edema: fluid accumulation in the macula from leaking retinal vessels. Shows retinal thickening and cystoid spaces. |
| DRUSEN | Drusen: yellow deposits beneath the RPE. Hallmark of dry age-related macular degeneration. |
| NORMAL | Normal retina: well-defined retinal layers, intact foveal contour, no fluid or pathology. |
Training Configuration
LoRA Parameters
| Parameter | Value |
|---|---|
| Rank (r) | 16 |
| Alpha | 32 |
| Dropout | 0.05 |
| Target Modules | all-linear |
| Task Type | CAUSAL_LM |
| Trainable Params | 1.38B / 5.68B (24.3%) |
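As a rough illustration of what the rank and alpha values above imply, the sketch below computes the LoRA scaling factor (alpha / r), the number of adapter weights added to one linear layer, and the trainable share reported in the table. The 2560 x 2560 layer shape is a hypothetical example chosen only to show the arithmetic; it is not taken from this adapter.

```python
def lora_scaling(alpha: float, rank: int) -> float:
    """Standard LoRA scaling applied to the low-rank update: alpha / r."""
    return alpha / rank


def lora_params_per_linear(d_in: int, d_out: int, rank: int) -> int:
    """Adapter weights for one linear layer: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def trainable_share(trainable: float, total: float) -> float:
    """Percentage of parameters that receive gradients."""
    return 100.0 * trainable / total


RANK, ALPHA = 16, 32  # values from the table above

scaling = lora_scaling(ALPHA, RANK)
# Hypothetical 2560 x 2560 projection, used only to illustrate the formula.
example_params = lora_params_per_linear(2560, 2560, RANK)
# Totals as reported in the table, in billions of parameters.
share = trainable_share(1.38, 5.68)

print(f"scaling = {scaling}")
print(f"adapter weights for the example layer = {example_params:,}")
print(f"trainable share = {share:.1f}%")
```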
Hyperparameters
| Parameter | Value |
|---|---|
| Epochs | 1 |
| Per-device Batch Size | 1 |
| Gradient Accumulation Steps | 8 (effective batch = 8) |
| Learning Rate | 2e-4 |
| LR Scheduler | Linear with warmup |
| Warmup Ratio | 0.03 |
| Max Grad Norm | 0.3 |
| Precision | bfloat16 |
| Gradient Checkpointing | Enabled |
| Seed | 42 |
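The schedule implied by these settings can be sketched in a few lines. This is an illustrative calculation, not the training script: it assumes a single GPU, the 10,000 training samples listed earlier, and that the warmup ratio is converted to steps by rounding up.

```python
import math

TRAIN_SAMPLES = 10_000  # from the Training Dataset section
PER_DEVICE_BATCH = 1
GRAD_ACCUM_STEPS = 8
EPOCHS = 1
BASE_LR = 2e-4
WARMUP_RATIO = 0.03
NUM_GPUS = 1            # assumption: one device


def total_optimizer_steps() -> int:
    """Optimizer updates in the whole run (one update per effective batch)."""
    effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_GPUS
    return math.ceil(TRAIN_SAMPLES / effective_batch) * EPOCHS


def lr_at(step: int, total_steps: int, warmup_steps: int) -> float:
    """Linear warmup from 0 to BASE_LR, then linear decay back to 0."""
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / max(1, total_steps - warmup_steps)


total_steps = total_optimizer_steps()
# Assumption: the ratio is rounded up to a whole number of steps.
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)

print(f"optimizer steps: {total_steps}, warmup steps: {warmup_steps}")
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(f"step {step:>4}: lr = {lr_at(step, total_steps, warmup_steps):.2e}")
```

Under these assumptions the learning rate peaks at 2e-4 once warmup ends and reaches zero on the final step.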
Infrastructure
| Property | Value |
|---|---|
| GPU | NVIDIA L4 (24 GB VRAM) |
| Cloud Platform | Modal serverless GPU |
| Training Time | ~60-90 minutes |
Prompt Format
Input:
Analyze this retinal OCT scan and classify the finding.
Output:
This retinal OCT scan shows Diabetic Macular Edema.
Diabetic Macular Edema (DME). Fluid accumulation in the macula due to leaking retinal blood vessels in diabetic retinopathy. OCT shows retinal thickening and intraretinal cystoid spaces.
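Because the adapter answers in free text, downstream code usually needs to map the response back to one of the four labels. The helper below is a minimal sketch of one way to do that. `parse_label` and `LABEL_PHRASES` are hypothetical names, and the sketch assumes that every class follows the same "This retinal OCT scan shows ..." pattern as the DME example above, which the card only shows for DME.

```python
import re
from typing import Optional

# Phrases are assumed to match the model's wording for each class.
LABEL_PHRASES = {
    "CNV": ("choroidal neovascularization", "cnv"),
    "DME": ("diabetic macular edema", "dme"),
    "DRUSEN": ("drusen",),
    "NORMAL": ("normal",),
}


def parse_label(generated_text: str) -> Optional[str]:
    """Return the class label named in the first non-empty line, or None."""
    lines = [line.strip() for line in generated_text.splitlines() if line.strip()]
    if not lines:
        return None
    first_line = lines[0].lower()
    for label, phrases in LABEL_PHRASES.items():
        for phrase in phrases:
            # Word boundaries keep "abnormal" from matching "normal".
            if re.search(r"\b" + re.escape(phrase) + r"\b", first_line):
                return label
    return None


sample = (
    "This retinal OCT scan shows Diabetic Macular Edema.\n"
    "Diabetic Macular Edema (DME). Fluid accumulation in the macula."
)
print(parse_label(sample))
```

Only the first line is inspected because the explanatory second line can mention terms that belong to other classes.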
Usage
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText
from peft import PeftModel
from PIL import Image

base_model_id = "google/medgemma-4b-it"
adapter_id = "efecelik/medgemma-retinal-oct-lora"

# Load the base model in bfloat16, then attach the LoRA adapter.
processor = AutoProcessor.from_pretrained(base_model_id)
model = AutoModelForImageTextToText.from_pretrained(
    base_model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

image = Image.open("retinal_oct.jpg").convert("RGB")
messages = [
    {"role": "user", "content": [
        # Pass the PIL image inside the message so the chat template can pick it up.
        {"type": "image", "image": image},
        {"type": "text", "text": "Analyze this retinal OCT scan and classify the finding."}
    ]}
]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt"
).to(model.device, dtype=torch.bfloat16)

with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, not the echoed prompt.
prompt_len = inputs["input_ids"].shape[-1]
print(processor.decode(output[0][prompt_len:], skip_special_tokens=True))
Intended Use
This adapter is part of the MedVision AI platform built for the MedGemma Impact Challenge. It is designed for:
- Medical education: Helping students learn OCT interpretation and retinal pathology recognition
- Clinical decision support research: Exploring how such a model could assist ophthalmologists with retinal disease screening
- Research: Exploring fine-tuned medical VLMs for ophthalmic imaging
Limitations
- Not for clinical diagnosis. This model is for educational and research purposes only.
- Limited pathologies: Only 4 categories. Many retinal conditions (glaucoma, retinal detachment, vein occlusion) are not covered.
- Curated subset: Trained on 10K of 84K available images for training efficiency.
- Single epoch: Trained for 1 epoch; further training may improve performance.
Citation
@article{kermany2018identifying,
  title={Identifying medical diagnoses and treatable diseases by image-based deep learning},
  author={Kermany, Daniel S and Goldbaum, Michael and Cai, Wenjia and others},
  journal={Cell},
  volume={172},
  number={5},
  pages={1122--1131},
  year={2018},
  publisher={Elsevier}
}
Disclaimer
This model is for educational and research purposes only. It is NOT intended for clinical diagnosis or patient care decisions. Always consult qualified medical professionals for medical advice.