MedGemma Chest X-Ray LoRA
Thoracic disease classification adapter fine-tuned on NIH ChestX-ray14 using MedGemma 4B.
Identifies 14 thoracic pathologies plus "No Finding" from frontal chest radiographs using multi-label classification.
Model Details
| Property | Value |
|---|---|
| Base Model | google/medgemma-4b-it |
| Method | LoRA (Low-Rank Adaptation) |
| Task | Multi-label thoracic disease classification (15 labels) |
| Modality | Chest X-ray (frontal view: PA or AP) |
| Framework | PyTorch + HuggingFace Transformers + PEFT |
Training Dataset
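NIH ChestX-ray14: 112,120 frontal chest X-rays from 30,805 patients, annotated with 14 disease labels.
Reference: Wang et al., CVPR 2017, "ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases"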
- Train samples: 10,000 (curated subset from 112K)
- Validation samples: 1,000
Pathology Labels (15 classes)
| Label | Description |
|---|---|
| No Finding | No acute cardiopulmonary abnormality |
| Atelectasis | Partial or complete lung collapse |
| Cardiomegaly | Enlarged heart (cardiothoracic ratio > 0.5) |
| Effusion | Fluid in the pleural space |
| Infiltration | Opacity suggesting infection or inflammation |
| Mass | Solid lesion > 3 cm, requires malignancy evaluation |
| Nodule | Focal opacity < 3 cm |
| Pneumonia | Infectious consolidation with air bronchograms |
| Pneumothorax | Air in pleural space causing lung collapse |
| Consolidation | Dense opacification replacing air with fluid/pus/cells |
| Edema | Fluid in lung interstitium/alveoli (often from heart failure) |
| Emphysema | Hyperinflation with flattened diaphragms |
| Fibrosis | Reticular opacities with volume loss |
| Pleural_Thickening | Increased pleural surface density |
| Hernia | Abdominal contents in thoracic cavity |
Note: This is a multi-label task, so several pathologies can co-occur in a single image. "No Finding" is the exception: it applies only when none of the 14 pathologies is labeled.
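To make the multi-label setup concrete, here is a minimal sketch of how one image's labels could be turned into a 15-element multi-hot vector. It assumes labels arrive as a pipe-separated string, as in the "Finding Labels" column of the NIH metadata CSV. The label order and the `encode_findings` helper are illustrative and not taken from this adapter's training code.

```python
# Label order follows the table above; the order itself is an assumption.
LABELS = [
    "No Finding", "Atelectasis", "Cardiomegaly", "Effusion", "Infiltration",
    "Mass", "Nodule", "Pneumonia", "Pneumothorax", "Consolidation",
    "Edema", "Emphysema", "Fibrosis", "Pleural_Thickening", "Hernia",
]
LABEL_INDEX = {name: i for i, name in enumerate(LABELS)}


def encode_findings(finding_labels: str) -> list[int]:
    """Turn a pipe-separated label string into a 15-element multi-hot vector."""
    vector = [0] * len(LABELS)
    for name in finding_labels.split("|"):
        name = name.strip()
        if name not in LABEL_INDEX:
            raise ValueError(f"Unknown label: {name!r}")
        vector[LABEL_INDEX[name]] = 1
    return vector


# Two co-occurring pathologies set two positions in the vector.
print(encode_findings("Cardiomegaly|Effusion"))
```

An image with co-occurring findings gets several ones in the vector, while a "No Finding" image gets a single one in the first position.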
Training Configuration
LoRA Parameters
| Parameter | Value |
|---|---|
| Rank (r) | 16 |
| Alpha | 32 |
| Dropout | 0.05 |
| Target Modules | all-linear |
| Task Type | CAUSAL_LM |
| Trainable Params | 1.38B / 5.68B (24.3%) |
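The table values map directly onto PEFT's `LoraConfig` arguments. The sketch below keeps them in a plain dictionary so it runs without PEFT installed, and works through two small calculations: the LoRA scaling factor (alpha / r) and the trainable share reported above. The `lora_kwargs` name, the `lora_params` helper, and the 1024 x 1024 layer size are illustrative assumptions, not details of this adapter.

```python
# Values copied from the table above; the dict name is illustrative.
lora_kwargs = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": "all-linear",
    "task_type": "CAUSAL_LM",
}

# Standard LoRA scales the low-rank update by alpha / r.
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]


def lora_params(in_features: int, out_features: int, r: int) -> int:
    """Parameters LoRA adds to one linear layer: A is (r x in), B is (out x r)."""
    return r * (in_features + out_features)


# Hypothetical 1024 x 1024 layer, only to show the order of magnitude.
example_layer = lora_params(1024, 1024, lora_kwargs["r"])

# Trainable share as reported in the table (values in billions).
trainable_share = 1.38 / 5.68

print(scaling, example_layer, round(trainable_share * 100, 1))

# With PEFT installed, the same values could be passed as:
#   from peft import LoraConfig
#   config = LoraConfig(**lora_kwargs)
```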
Hyperparameters
| Parameter | Value |
|---|---|
| Epochs | 1 |
| Per-device Batch Size | 1 |
| Gradient Accumulation Steps | 8 (effective batch = 8) |
| Learning Rate | 2e-4 |
| LR Scheduler | Linear with warmup |
| Warmup Ratio | 0.03 |
| Max Grad Norm | 0.3 |
| Precision | bfloat16 |
| Gradient Checkpointing | Enabled |
| Seed | 42 |
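The hyperparameters above imply a fairly short run. The following sketch derives the effective batch size, the number of optimizer steps, and the warmup length, then evaluates a linear warmup and decay schedule. It assumes a single GPU and the 10,000 training samples listed earlier. The exact step count in a real trainer may differ slightly, and the `lr_at` helper is illustrative.

```python
import math

TRAIN_SAMPLES = 10_000
PER_DEVICE_BATCH = 1
GRAD_ACCUM = 8
NUM_GPUS = 1  # assumption: single-GPU training
EPOCHS = 1
PEAK_LR = 2e-4
WARMUP_RATIO = 0.03

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM * NUM_GPUS
total_steps = math.ceil(TRAIN_SAMPLES / effective_batch) * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Linear warmup from 0 to PEAK_LR, then linear decay back to 0."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / (total_steps - warmup_steps)


print(effective_batch, total_steps, warmup_steps)
print(lr_at(0), lr_at(warmup_steps), lr_at(total_steps))
```

Under these assumptions the run takes 1,250 optimizer steps, of which the first 38 are warmup.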
Infrastructure
| Property | Value |
|---|---|
| GPU | NVIDIA L4 (24 GB VRAM) |
| Cloud Platform | Modal serverless GPU |
| Training Time | ~60-90 minutes |
Prompt Format
Input:
Analyze this chest X-ray and identify any findings.
Output:
This chest X-ray shows Pneumonia.
Pneumonia (infectious consolidation of lung parenchyma with air bronchograms).
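Because the model answers in free text, downstream code needs to map the response back to label names. The sketch below is one possible approach, assuming the first line of the response names the findings as in the sample above. It deliberately ignores later lines, since the descriptive text can mention other label words (the Pneumonia description contains "consolidation"). The `extract_labels` helper is hypothetical and not part of this adapter.

```python
import re

LABELS = [
    "No Finding", "Atelectasis", "Cardiomegaly", "Effusion", "Infiltration",
    "Mass", "Nodule", "Pneumonia", "Pneumothorax", "Consolidation",
    "Edema", "Emphysema", "Fibrosis", "Pleural_Thickening", "Hernia",
]


def extract_labels(text: str) -> list[str]:
    """Return the known labels named in the first line of a model response."""
    lines = text.strip().splitlines()
    first_line = lines[0] if lines else ""
    found = []
    for label in LABELS:
        # Accept either "_" or " " between words, and match whole words only
        # so that "Mass" does not fire on "massive".
        words = [re.escape(w) for w in re.split(r"[_ ]", label)]
        pattern = r"\b" + r"[_ ]".join(words) + r"\b"
        if re.search(pattern, first_line, flags=re.IGNORECASE):
            found.append(label)
    return found


response = (
    "This chest X-ray shows Pneumonia.\n"
    "Pneumonia (infectious consolidation of lung parenchyma with air bronchograms)."
)
print(extract_labels(response))
```

Keyword matching is brittle. A response that negates a finding ("no pneumothorax") would still match, so a production parser would need stricter handling.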
Usage
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText
from peft import PeftModel
from PIL import Image

base_model_id = "google/medgemma-4b-it"
adapter_id = "efecelik/medgemma-chest-xray-lora"

# Load the base model, then attach the LoRA adapter on top of it.
processor = AutoProcessor.from_pretrained(base_model_id)
model = AutoModelForImageTextToText.from_pretrained(
    base_model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

image = Image.open("chest_xray.jpg").convert("RGB")
# Pass the image inside the message so the chat template pairs it with the prompt.
messages = [
    {"role": "user", "content": [
        {"type": "image", "image": image},
        {"type": "text", "text": "Analyze this chest X-ray and identify any findings."}
    ]}
]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt"
).to(model.device, dtype=torch.bfloat16)
input_len = inputs["input_ids"].shape[-1]

with torch.inference_mode():
    output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, not the echoed prompt.
print(processor.decode(output[0][input_len:], skip_special_tokens=True))
Intended Use
This adapter is part of the MedVision AI platform built for the MedGemma Impact Challenge. It is designed for:
- Medical education: Helping students learn systematic chest X-ray interpretation
- Clinical decision support research: Prototyping tools that could assist radiologists with thoracic disease screening (not for use in patient care)
- Research: Exploring fine-tuned medical VLMs for chest radiography
Limitations
- Not for clinical diagnosis. This model is for educational and research purposes only.
- Label noise: NIH ChestX-ray14 labels were NLP-extracted from reports and contain noise (~10-30% error rate depending on pathology).
- Curated subset: Trained on 10K of 112K available images.
- Single view: Trained on frontal views only. Lateral views not included.
- Single epoch: Trained for 1 epoch; further training may improve performance.
Citation
@inproceedings{wang2017chestx,
title={Chestx-ray8: Hospital-scale chest x-ray database and benchmarks},
author={Wang, Xiaosong and Peng, Yifan and Lu, Le and Lu, Zhiyong and Bagheri, Mohammadhadi and Summers, Ronald M},
booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
pages={2097--2106},
year={2017}
}
Disclaimer
This model is for educational and research purposes only. It is NOT intended for clinical diagnosis or patient care decisions. Always consult qualified medical professionals for medical advice.