MedGemma Retinal OCT LoRA

Retinal disease classification adapter fine-tuned on the Kermany 2018 OCT dataset using MedGemma 4B.

Classifies Optical Coherence Tomography (OCT) images into 4 categories: CNV, DME, Drusen, or Normal retina.

Model Details

| Property | Value |
| --- | --- |
| Base Model | google/medgemma-4b-it |
| Method | LoRA (Low-Rank Adaptation) |
| Task | Multi-class retinal disease classification (4 classes) |
| Modality | Optical Coherence Tomography (OCT) |
| Framework | PyTorch + HuggingFace Transformers + PEFT |

Training Dataset

Kermany 2018 Retinal OCT: 84K retinal OCT images across 4 classes.

Reference: Kermany et al. 2018, Cell - "Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning"

  • Train samples: 10,000 (curated subset from 84K)
  • Validation samples: 1,000
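
With a held-out validation split of this size, per-class metrics are more informative than a single accuracy number, since the four classes differ clinically. A minimal sketch of how accuracy and per-class recall could be computed from (true label, predicted label) pairs follows. The `per_class_recall` helper and the toy pairs are illustrative assumptions, not results from this adapter.

```python
from collections import Counter

LABELS = ["CNV", "DME", "DRUSEN", "NORMAL"]

def per_class_recall(pairs):
    """Return (accuracy, {label: recall}) for (true, predicted) label pairs."""
    total = Counter(true for true, _ in pairs)
    correct = Counter(true for true, pred in pairs if true == pred)
    accuracy = sum(correct.values()) / len(pairs)
    # Skip labels that never occur as ground truth to avoid dividing by zero.
    recall = {label: correct[label] / total[label] for label in LABELS if total[label]}
    return accuracy, recall

# Toy pairs, made up purely to exercise the function.
toy_pairs = [
    ("CNV", "CNV"), ("CNV", "DME"),
    ("DME", "DME"),
    ("DRUSEN", "CNV"),
    ("NORMAL", "NORMAL"),
]
accuracy, recall = per_class_recall(toy_pairs)
print(f"accuracy={accuracy:.2f}")
print(recall)
```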

Class Descriptions

| Label | Description |
| --- | --- |
| CNV | Choroidal Neovascularization: abnormal blood vessel growth beneath the retina. Hallmark of wet AMD requiring anti-VEGF treatment. |
| DME | Diabetic Macular Edema: fluid accumulation in the macula from leaking retinal vessels. Shows retinal thickening and cystoid spaces. |
| DRUSEN | Drusen: yellow deposits beneath the RPE. Hallmark of dry age-related macular degeneration. |
| NORMAL | Normal retina: well-defined retinal layers, intact foveal contour, no fluid or pathology. |

Training Configuration

LoRA Parameters

| Parameter | Value |
| --- | --- |
| Rank (r) | 16 |
| Alpha | 32 |
| Dropout | 0.05 |
| Target Modules | all-linear |
| Task Type | CAUSAL_LM |
| Trainable Params | 1.38B / 5.68B (24.3%) |
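
To make these values concrete: standard LoRA scales its low-rank update by alpha / r, and each adapted linear layer of shape d_in x d_out gains r * (d_in + d_out) adapter parameters. The sketch below collects the table values under the argument names PEFT's `LoraConfig` uses and works through that arithmetic. The 2560 x 2560 layer shape is a hypothetical example, not a dimension read from the model, and the helper names are illustrative.

```python
# Values from the LoRA Parameters table, keyed by PEFT's LoraConfig argument names.
lora_kwargs = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": "all-linear",
    "task_type": "CAUSAL_LM",
}

def lora_scaling(r: int, lora_alpha: int) -> float:
    """Factor applied to the low-rank update B @ A in standard LoRA."""
    return lora_alpha / r

def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Adapter parameters added to one linear layer: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)

scaling = lora_scaling(lora_kwargs["r"], lora_kwargs["lora_alpha"])
# Hypothetical square projection; real layer shapes vary by module.
extra = lora_param_count(d_in=2560, d_out=2560, r=lora_kwargs["r"])
print(scaling)  # 2.0
print(extra)    # 81920
```

If PEFT is installed, `LoraConfig(**lora_kwargs)` would build the corresponding config object.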

Hyperparameters

| Parameter | Value |
| --- | --- |
| Epochs | 1 |
| Per-device Batch Size | 1 |
| Gradient Accumulation Steps | 8 (effective batch = 8) |
| Learning Rate | 2e-4 |
| LR Scheduler | Linear with warmup |
| Warmup Ratio | 0.03 |
| Max Grad Norm | 0.3 |
| Precision | bfloat16 |
| Gradient Checkpointing | Enabled |
| Seed | 42 |
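
These settings imply a short schedule. As a rough sketch, assuming a single GPU and that warmup steps are rounded up from `warmup_ratio * total_steps` (the convention in the Transformers `Trainer`, taken here as an assumption about how this run was configured), the step counts work out as follows. The 10,000 training samples come from the Training Dataset section.

```python
import math

train_samples = 10_000
per_device_batch = 1
grad_accum = 8
num_gpus = 1  # assumption: a single GPU
epochs = 1
warmup_ratio = 0.03

# One optimizer step consumes per_device_batch * grad_accum * num_gpus samples.
effective_batch = per_device_batch * grad_accum * num_gpus
steps_per_epoch = train_samples // effective_batch
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(effective_batch, total_steps, warmup_steps)  # 8 1250 38
```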

Infrastructure

| Property | Value |
| --- | --- |
| GPU | NVIDIA L4 (24 GB VRAM) |
| Cloud Platform | Modal serverless GPU |
| Training Time | ~60-90 minutes |

Prompt Format

Input:

Analyze this retinal OCT scan and classify the finding.

Output:

This retinal OCT scan shows Diabetic Macular Edema.

Diabetic Macular Edema (DME). Fluid accumulation in the macula due to leaking retinal blood vessels in diabetic retinopathy. OCT shows retinal thickening and intraretinal cystoid spaces.
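
Because the adapter answers in free text rather than emitting a class index, downstream code needs to map the response back to one of the four labels. The following is a minimal sketch, assuming every class follows the same "This retinal OCT scan shows ..." sentence pattern as the DME example. Only that example is documented here, so the other phrasings and the `parse_label` helper are assumptions.

```python
import re
from typing import Optional

# Phrases assumed from the class table; only the DME wording is documented.
LABEL_PATTERNS = {
    "CNV": r"\b(?:choroidal neovascularization|cnv)\b",
    "DME": r"\b(?:diabetic macular edema|dme)\b",
    "DRUSEN": r"\bdrusen\b",
    "NORMAL": r"\bnormal\b",
}

def parse_label(generated: str) -> Optional[str]:
    """Map a free-text response to a class label, or None if unclear."""
    # Look only at the finding named in the first "... shows <finding>." sentence.
    match = re.search(r"\bshows\s+([^.]*)\.", generated, re.IGNORECASE)
    if match is None:
        return None
    finding = match.group(1)
    hits = [
        label
        for label, pattern in LABEL_PATTERNS.items()
        if re.search(pattern, finding, re.IGNORECASE)
    ]
    # Refuse to guess when zero or several labels match.
    return hits[0] if len(hits) == 1 else None

example = (
    "This retinal OCT scan shows Diabetic Macular Edema.\n\n"
    "Diabetic Macular Edema (DME). Fluid accumulation in the macula."
)
print(parse_label(example))  # DME
```

Returning `None` instead of a best guess keeps malformed or ambiguous generations visible, so they can be counted separately rather than silently scored as a class.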

Usage

import torch
from transformers import AutoProcessor, AutoModelForImageTextToText
from peft import PeftModel
from PIL import Image

base_model_id = "google/medgemma-4b-it"
adapter_id = "efecelik/medgemma-retinal-oct-lora"

# Load the base model in bfloat16, then attach the LoRA adapter on top of it.
processor = AutoProcessor.from_pretrained(base_model_id)
model = AutoModelForImageTextToText.from_pretrained(
    base_model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

image = Image.open("retinal_oct.jpg").convert("RGB")
# Pass the image inside the message so the chat template can process it
# together with the text prompt.
messages = [
    {"role": "user", "content": [
        {"type": "image", "image": image},
        {"type": "text", "text": "Analyze this retinal OCT scan and classify the finding."}
    ]}
]

inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt"
).to(model.device, dtype=torch.bfloat16)

# Generate, then decode only the newly generated tokens (not the prompt).
input_len = inputs["input_ids"].shape[-1]
with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=256)
print(processor.decode(output[0][input_len:], skip_special_tokens=True))

Intended Use

This adapter is part of the MedVision AI platform built for the MedGemma Impact Challenge. It is designed for:

  • Medical education: Helping students learn OCT interpretation and retinal pathology recognition
  • Clinical decision support research: Prototyping tools that could assist ophthalmologists with retinal disease screening
  • Research: Exploring fine-tuned medical VLMs for ophthalmic imaging

Limitations

  • Not for clinical diagnosis. This model is for educational and research purposes only.
  • Limited pathologies: Only 4 categories. Many other eye conditions (e.g., glaucoma, retinal detachment, retinal vein occlusion) are not covered.
  • Curated subset: Trained on 10K of 84K available images for training efficiency.
  • Single epoch: Trained for 1 epoch; further training may improve performance.

Citation

@article{kermany2018identifying,
  title={Identifying medical diagnoses and treatable diseases by image-based deep learning},
  author={Kermany, Daniel S and Goldbaum, Michael and Cai, Wenjia and others},
  journal={Cell},
  volume={172},
  number={5},
  pages={1122--1131},
  year={2018},
  publisher={Elsevier}
}

Disclaimer

This model is for educational and research purposes only. It is NOT intended for clinical diagnosis or patient care decisions. Always consult qualified medical professionals for medical advice.
