Libraries

How to use DeepRadiology/medgemma1.5-CXR with PEFT:

from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("google/medgemma-1.5-4b-it")
model = PeftModel.from_pretrained(base_model, "DeepRadiology/medgemma1.5-CXR")

Transformers

How to use DeepRadiology/medgemma1.5-CXR with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="DeepRadiology/medgemma1.5-CXR")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("DeepRadiology/medgemma1.5-CXR", dtype="auto")

Local Apps

vLLM

How to use DeepRadiology/medgemma1.5-CXR with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "DeepRadiology/medgemma1.5-CXR"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DeepRadiology/medgemma1.5-CXR",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
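
The curl call above points at a public image URL. For a local chest x-ray, one option is to inline the file as a base64 data URL in the same OpenAI-style request. The sketch below uses only the Python standard library; `build_payload` and `send` are hypothetical helper names, and it assumes the server accepts `data:` URLs in the `image_url` field and is listening on the port used above.

```python
import base64
import json
import urllib.request

MODEL_ID = "DeepRadiology/medgemma1.5-CXR"


def build_payload(image_bytes: bytes, prompt: str, mime: str = "image/jpeg") -> dict:
    """Build an OpenAI-style chat request with the image inlined as a data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": MODEL_ID,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:{mime};base64,{encoded}"},
                    },
                ],
            }
        ],
    }


def send(payload: dict, url: str = "http://localhost:8000/v1/chat/completions") -> dict:
    """POST the payload to a running server and return the parsed JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a server running, `send(build_payload(open("cxr.jpeg", "rb").read(), "Describe this image in one sentence."))` would issue the request; `build_payload` alone can be inspected offline.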

Use Docker

docker model run hf.co/DeepRadiology/medgemma1.5-CXR

SGLang

How to use DeepRadiology/medgemma1.5-CXR with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "DeepRadiology/medgemma1.5-CXR" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DeepRadiology/medgemma1.5-CXR",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "DeepRadiology/medgemma1.5-CXR" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DeepRadiology/medgemma1.5-CXR",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Docker Model Runner
How to use DeepRadiology/medgemma1.5-CXR with Docker Model Runner:
```
docker model run hf.co/DeepRadiology/medgemma1.5-CXR
```

Model Card for medgemma1.5-CXR

medgemma1.5-CXR is my second attempt at fine-tuning an open-weights vision-language model for chest X-ray structured report generation (for the first attempt, please see Llama-3.2-11B-CXR). The model has been fine-tuned to generate radiological reports in a structured JSON format:

{
"Support devices": "None.",
"Cardiomediastinum": "Within normal limits.",
"Lungs": "Lungs are clear.",
"Pleura": "No pleural effusion or pneumothorax.",
"Skeleton": "No acute findings.",
"Upper abdomen": "No acute findings."
}
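
Because the report is plain JSON with a fixed set of keys, downstream code can check it before use. The following is a minimal sketch, not part of the model or its tooling; `validate_report` is a hypothetical helper, and the assumption that every field must be a non-empty string is inferred from the example above.

```python
import json

# The six fields shown in the example report above.
REQUIRED_FIELDS = (
    "Support devices",
    "Cardiomediastinum",
    "Lungs",
    "Pleura",
    "Skeleton",
    "Upper abdomen",
)


def validate_report(raw: str) -> dict:
    """Parse a JSON report and check it has exactly the six expected fields."""
    report = json.loads(raw)
    if not isinstance(report, dict):
        raise ValueError("report must be a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in report]
    extra = [name for name in report if name not in REQUIRED_FIELDS]
    if missing or extra:
        raise ValueError(f"missing fields: {missing}; unexpected fields: {extra}")
    for name, value in report.items():
        # Assumption: each field holds free text, never null or a nested object.
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"field {name!r} must be a non-empty string")
    return report
```

A report that fails validation is a candidate for regeneration or manual review rather than silent repair.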

Model Details

Model Description

These are adapters for google/medgemma-1.5-4b-it, obtained through supervised fine-tuning (SFT) with low-rank adapters (LoRA) using a custom subset of publicly available frontal chest x-rays from the romprr/CXR_BioXAi_Hackathon_2024 dataset.

Developed, funded and shared by: Nakul Gupta
Model type: Multi-modal Large Language Model
Language(s) (NLP): SFT was done in English, although the base model supports additional languages.
License: The use of MedGemma is governed by the Health AI Developer Foundations terms of use.
Finetuned from model: google/medgemma-1.5-4b-it

Uses

This model is SOLELY intended for research and development purposes. It is by no means ready or meant for clinical use, nor has it been validated in a clinical setting.

Out-of-Scope Use

This model has NOT been validated for clinical use or evaluated by any regulatory bodies, and it may hallucinate findings as well as miss them. It is intended for research and developmental use ONLY. The model's outputs are not intended to directly inform clinical diagnosis, patient management decisions, treatment recommendations, or any other direct clinical practice applications. All model outputs require independent verification and further investigation through established scientific research and development methodologies.

Bias, Risks, and Limitations

Results and model outputs are heavily dependent upon the specific prompt/instruction as well as the inference settings (temperature, top_p, min_p, etc.). The model has been optimized only for single-turn, single-image evaluation. The model may also suffer from data contamination/leakage: it may have been exposed to evaluation data during pre-training or fine-tuning, which may lead to overestimation of its true capabilities. Therefore, the model requires validation on datasets specific to each individual's or institution's use case.
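
To make the effect of these sampling settings concrete, here is a small, library-free sketch of temperature scaling followed by min-p filtering: tokens whose probability falls below `min_p` times the top token's probability are dropped and the rest are renormalized. `min_p_filter` is a hypothetical name, and applying temperature before the filter is an assumption about ordering; an inference library may differ in detail.

```python
import math


def min_p_filter(logits, temperature=0.7, min_p=0.1):
    """Return renormalized probabilities after temperature scaling and min-p filtering."""
    # Lower temperatures sharpen the distribution; higher ones flatten it.
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    # Subtract the peak before exponentiating for numerical stability.
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    probs = [value / total for value in exps]
    # Keep only tokens at least min_p times as likely as the most likely token.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    kept_total = sum(kept)
    return [p / kept_total for p in kept]
```

Calling `min_p_filter([2.0, 1.0, -5.0])` removes the very unlikely third token while leaving the first two in their original order of likelihood, which is why changing `min_p` or `temperature` can change which findings the model ends up reporting.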

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

How to Get Started with the Model

Use the code below to get started with the model.

import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForImageTextToText

base_model = "google/medgemma-1.5-4b-it"
adapter_id = "DeepRadiology/medgemma1.5-CXR"

# Load the base model in bfloat16 and let Accelerate place it on the available device(s).
model = AutoModelForImageTextToText.from_pretrained(
    base_model,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Attach the LoRA adapter to the base model (requires the `peft` package).
model.load_adapter(adapter_id)
processor = AutoProcessor.from_pretrained(base_model)
image = Image.open("cxr.jpeg")  # replace with your own example image

instruction = """You are an expert chest radiologist. Describe accurately what you see in this image. Use a \
structured report template with fields for: Support devices, Cardiomediastinum, Lungs, Pleura, Skeleton, and Upper \
abdomen. If there are no support devices, then report "None." for that field, if there are no pertinent \
Cardiomediastinal findings, report "Within normal limits." for that field. If there are no abnormal lung findings \
report "Lungs are clear." If there are no pertinent pleural findings, report "No pleural effusion or pneumothorax." \
For all other fields, if there are no pertinent findings, report "No acute findings." You must always generate a report\
 with the required fields."""

# Single-turn, single-image prompt: one image placeholder followed by the instruction.
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]
input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(
    images=image,
    text=input_text,
    add_special_tokens=False,  # the chat template already inserts the special tokens
    return_tensors="pt"
).to(model.device, dtype=torch.bfloat16)

# do_sample=True is needed for temperature and min_p to take effect.
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7, min_p=0.1)
# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = output[0][inputs["input_ids"].shape[-1]:]
print(processor.decode(new_tokens, skip_special_tokens=True))
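
Depending on how the output is decoded, the printed text may still contain the echoed prompt, special tokens, or a Markdown fence around the report. A small post-processing step can pull out the JSON object itself. This is a sketch under those assumptions; `extract_report` is a hypothetical helper, not part of Transformers.

```python
import json


def extract_report(decoded: str) -> dict:
    """Return the last JSON object found anywhere in the decoded text."""
    decoder = json.JSONDecoder()
    found = None
    index = decoded.find("{")
    while index != -1:
        try:
            # raw_decode parses one JSON value starting at `index` and
            # reports where it ended, ignoring any trailing text.
            candidate, end = decoder.raw_decode(decoded, index)
        except json.JSONDecodeError:
            index = decoded.find("{", index + 1)
            continue
        if isinstance(candidate, dict):
            found = candidate
        index = decoded.find("{", end)
    if found is None:
        raise ValueError("no JSON object found in model output")
    return found
```

Taking the last object rather than the first is a deliberate choice: if the prompt itself ever contains a JSON example, the generated report still comes after it.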

Training Details

Training Data

romprr/CXR_BioXAi_Hackathon_2024.

Training Procedure

Preprocessing

Dataset was filtered using meta-llama/Llama-3.3-70B-Instruct to remove reports with references to priors (although this was not 100% successful). The remaining free-text reports were then converted into a structured report format, again using meta-llama/Llama-3.3-70B-Instruct. The final training set was approximately 33k x-rays.
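
Since the LLM-based filter did not catch every reference to prior studies, a cheap keyword check can flag leftovers for review. To be clear, this is not the method used to build the training set (that was meta-llama/Llama-3.3-70B-Instruct); it is a simple regex heuristic, and both the pattern list and the `mentions_prior` name are illustrative assumptions.

```python
import re

# Illustrative, non-exhaustive phrases that usually imply a comparison study.
PRIOR_PATTERN = re.compile(
    r"\b(prior|previous(ly)?|compar(ed|ison)|unchanged|interval|again (seen|noted))\b",
    re.IGNORECASE,
)


def mentions_prior(report_text: str) -> bool:
    """Return True if the text contains a phrase suggesting a prior study."""
    return PRIOR_PATTERN.search(report_text) is not None
```

For a structured report, the same check can be run per field, for example `any(mentions_prior(value) for value in report.values())`. A heuristic like this will have both false positives and false negatives, so it is better suited to surfacing candidates than to automatic removal.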

Training Hyperparameters

Training regime: [More Information Needed]

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Testing Data, Factors & Metrics

Testing Data

Evaluation was performed using the publicly available IU-Xray and MIMIC-CXR datasets, using the 'test' splits and frontal x-rays only, as defined by RexRank.

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

Presenting medgemma1.5-CXR, a multi-modal open-weights vision-language model (VLM) fine-tuned for chest x-ray report generation! The primary goal of this exercise was to demonstrate the potential for general-purpose VLMs to be repurposed for medical imaging tasks on consumer-grade hardware with publicly available datasets.

Data Citations

MIMIC-CXR:

Johnson, A., Pollard, T., Mark, R., Berkowitz, S., & Horng, S. (2019). MIMIC-CXR Database (version 2.0.0). PhysioNet. RRID:SCR_007345. https://doi.org/10.13026/C2JT1Q

Johnson, A.E.W., Pollard, T.J., Berkowitz, S.J. et al. MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports. Sci Data 6, 317 (2019). https://doi.org/10.1038/s41597-019-0322-0

IU-Xray:

Demner-Fushman D, Kohli MD, Rosenman MB, Shooshan SE, Rodriguez L, Antani S, Thoma GR, McDonald CJ. Preparing a collection of radiology examinations for distribution and retrieval. J Am Med Inform Assoc. 2016 Mar;23(2):304-10. doi: 10.1093/jamia/ocv080. Epub 2015 Jul 1. PMID: 26133894; PMCID: PMC5009925.

Model Card Contact

Nakul Gupta
