MedGemma Abdominal CT LoRA
Abdominal organ classification adapter fine-tuned on OrganAMNIST (MedMNIST) using MedGemma 4B.
Identifies the primary organ or anatomical structure visible in abdominal CT axial slices across 11 classes.
Model Details
| Property | Value |
|---|---|
| Base Model | google/medgemma-4b-it |
| Method | LoRA (Low-Rank Adaptation) |
| Task | Multi-class organ classification (11 classes) |
| Modality | Abdominal CT (axial 2D slices) |
| Framework | PyTorch + HuggingFace Transformers + PEFT |
Training Dataset
OrganAMNIST from the MedMNIST v2 benchmark — standardized 2D axial CT slices for organ classification.
Reference: Yang et al. 2023, Scientific Data - "MedMNIST v2: A Large-Scale Lightweight Benchmark for 2D and 3D Biomedical Image Classification"
- Original dataset: ~58,850 images
- Train samples: 10,000 (curated subset)
- Validation samples: 1,000
- Image size: 28x28 pixels (MedMNIST standard, resized by processor)
Class Labels
| ID | Organ | Description |
|---|---|---|
| 0 | Bladder | Urinary bladder in the pelvis |
| 1 | Femur (left) | Proximal left femur and femoral head |
| 2 | Femur (right) | Proximal right femur and femoral head |
| 3 | Heart | Cardiac silhouette with chambers and great vessels |
| 4 | Kidney (left) | Left kidney with cortex and medulla |
| 5 | Kidney (right) | Right kidney (slightly lower due to liver) |
| 6 | Liver | Largest solid abdominal organ, right upper quadrant |
| 7 | Lung (left) | Left hemithorax pulmonary tissue |
| 8 | Lung (right) | Right hemithorax, three lobes |
| 9 | Spleen | Left upper quadrant, posterior to stomach |
| 10 | Pancreas | Retroperitoneal organ crossing midline |
Training Configuration
LoRA Parameters
| Parameter | Value |
|---|---|
| Rank (r) | 16 |
| Alpha | 32 |
| Dropout | 0.05 |
| Target Modules | all-linear |
| Task Type | CAUSAL_LM |
| Trainable Params | 1.38B / 5.68B (24.3%) |
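For intuition about the rank and alpha values above, the following sketch computes how many parameters a single LoRA adapter adds to one linear layer and the scaling applied to its update. It assumes the standard LoRA formulation (update scaled by alpha / r). The layer shape is hypothetical and not taken from MedGemma's configuration, and the helper names are illustrative only.

```python
def lora_added_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


def lora_scaling(alpha: float, r: int) -> float:
    """Standard LoRA multiplies the low-rank update B @ A by alpha / r."""
    return alpha / r


# Values from the LoRA parameter table in this card.
RANK, ALPHA = 16, 32

# Hypothetical layer shape, chosen only for illustration.
D_IN, D_OUT = 2560, 2560

added = lora_added_params(D_IN, D_OUT, RANK)
dense = D_IN * D_OUT
print(f"adapter params: {added:,} vs dense weight: {dense:,} ({added / dense:.2%})")
print(f"update scaling alpha / r = {lora_scaling(ALPHA, RANK)}")

# Sanity check of the trainable fraction reported in the table.
print(f"trainable fraction: {1.38 / 5.68:.1%}")
```

The reported trainable fraction is larger than rank-16 adapters alone would usually contribute. One possible explanation, which the card does not confirm, is that additional modules such as the embeddings were trained alongside the adapters.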
Hyperparameters
| Parameter | Value |
|---|---|
| Epochs | 1 |
| Per-device Batch Size | 1 |
| Gradient Accumulation Steps | 8 (effective batch = 8) |
| Learning Rate | 2e-4 |
| LR Scheduler | Linear with warmup |
| Warmup Ratio | 0.03 |
| Max Grad Norm | 0.3 |
| Precision | bfloat16 |
| Gradient Checkpointing | Enabled |
| Seed | 42 |
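The hyperparameters above imply a specific number of optimizer steps and warmup steps. The sketch below works through that arithmetic and a linear warmup and decay schedule. It assumes a single GPU and that warmup steps are rounded up; the actual trainer may round differently, and `linear_warmup_lr` is an illustrative helper, not the training code used for this adapter.

```python
import math

# Values from the tables in this card.
TRAIN_SAMPLES = 10_000
PER_DEVICE_BATCH = 1
GRAD_ACCUM_STEPS = 8
EPOCHS = 1
WARMUP_RATIO = 0.03
PEAK_LR = 2e-4
NUM_DEVICES = 1  # assumption: a single GPU

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES
steps_per_epoch = TRAIN_SAMPLES // effective_batch
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)  # assumption: rounded up


def linear_warmup_lr(step: int) -> float:
    """Linear warmup to PEAK_LR, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / max(1, total_steps - warmup_steps)


print(f"effective batch: {effective_batch}")
print(f"optimizer steps: {total_steps}, warmup steps: {warmup_steps}")
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(f"step {step:>4}: lr = {linear_warmup_lr(step):.2e}")
```

With one epoch over 10,000 samples, the schedule spends only a few dozen optimizer steps in warmup, so most of training runs on the decaying part of the schedule.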
Infrastructure
| Property | Value |
|---|---|
| GPU | NVIDIA L4 (24 GB VRAM) |
| Cloud Platform | Modal serverless GPU |
| Training Time | ~45-60 minutes |
Prompt Format
Input:
Identify the primary organ or structure visible in this abdominal CT slice.
Output:
This abdominal CT slice primarily shows the Liver.
Liver (largest solid organ in the abdomen, occupying the right upper quadrant with homogeneous parenchymal density).
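Because the adapter answers in free text, turning its output back into a class ID takes a small parsing step. The sketch below shows one way to do that, using the ID-to-organ mapping exactly as listed in this card's class table. The helper names are hypothetical, and the parser assumes that every class follows the "shows the <organ>." pattern of the Liver example, which the card does not confirm for the other classes.

```python
import re
from typing import Optional

# Class IDs and names as listed in the class table of this card.
ID_TO_ORGAN = {
    0: "Bladder",
    1: "Femur (left)",
    2: "Femur (right)",
    3: "Heart",
    4: "Kidney (left)",
    5: "Kidney (right)",
    6: "Liver",
    7: "Lung (left)",
    8: "Lung (right)",
    9: "Spleen",
    10: "Pancreas",
}


def build_target(class_id: int) -> str:
    """Hypothetical helper: first sentence of the expected answer for a class ID."""
    return f"This abdominal CT slice primarily shows the {ID_TO_ORGAN[class_id]}."


def parse_class_id(generated: str) -> Optional[int]:
    """Return the class ID named in the first 'shows the <organ>.' sentence, or None."""
    match = re.search(r"shows the (.+?)\.", generated)
    if match is None:
        return None
    name = match.group(1).strip().lower()
    for class_id, organ in ID_TO_ORGAN.items():
        if organ.lower() == name:
            return class_id
    return None


example = (
    "This abdominal CT slice primarily shows the Liver.\n"
    "Liver (largest solid organ in the abdomen, occupying the right upper "
    "quadrant with homogeneous parenchymal density)."
)
print(parse_class_id(example))
```

Returning `None` for unrecognized text keeps malformed generations visible during evaluation instead of silently assigning them to a class.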
Usage
```python
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText
from peft import PeftModel
from PIL import Image

base_model_id = "google/medgemma-4b-it"
adapter_id = "efecelik/medgemma-abdominal-ct-lora"

# Load the processor and base model, then attach the LoRA adapter.
processor = AutoProcessor.from_pretrained(base_model_id)
model = AutoModelForImageTextToText.from_pretrained(
    base_model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

image = Image.open("abdominal_ct.jpg").convert("RGB")

# Pass the image inside the message so the chat template pairs it with the prompt.
messages = [
    {"role": "user", "content": [
        {"type": "image", "image": image},
        {"type": "text", "text": "Identify the primary organ or structure visible in this abdominal CT slice."},
    ]}
]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device, dtype=torch.bfloat16)
input_len = inputs["input_ids"].shape[-1]

# Greedy decoding keeps the predicted label deterministic.
with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, not the echoed prompt.
print(processor.decode(output[0][input_len:], skip_special_tokens=True))
```
Intended Use
This adapter is part of the MedVision AI platform built for the MedGemma Impact Challenge. It is designed for:
- Medical education: Helping students learn abdominal CT anatomy and organ identification
- Clinical decision support: Assisting radiologists with organ identification
- Research: Exploring fine-tuned medical VLMs for abdominal imaging
Limitations
- Not for clinical diagnosis. This model is for educational and research purposes only.
- Organ identification only: Classifies the visible organ; it does not detect pathology within organs.
- Low resolution source: MedMNIST images are 28x28 pixels, limiting fine structural detail.
- Normal anatomy only: Trained on healthy organ appearances, not pathological variants.
- Single epoch: Trained for 1 epoch; further training may improve performance.
Citation
@article{yang2023medmnist,
title={MedMNIST v2-A large-scale lightweight benchmark for 2D and 3D biomedical image classification},
author={Yang, Jiancheng and Shi, Rui and Wei, Donglai and Liu, Zequan and Zhao, Lin and Ke, Bilian and Pfister, Hanspeter and Ni, Bingbing},
journal={Scientific Data},
volume={10},
number={1},
pages={41},
year={2023},
publisher={Nature Publishing Group UK London}
}
Disclaimer
This model is for educational and research purposes only. It is NOT intended for clinical diagnosis or patient care decisions. Always consult qualified medical professionals for medical advice.