Model Card for medgemma1.5-CXR
medgemma1.5-CXR is my second attempt at fine-tuning an open-weights vision-language model for chest X-ray structured report generation (for the first attempt, please see Llama-3.2-11B-CXR). The model has been fine-tuned to generate radiological reports in a structured JSON format:
{
"Support devices": "None.",
"Cardiomediastinum": "Within normal limits.",
"Lungs": "Lungs are clear.",
"Pleura": "No pleural effusion or pneumothorax.",
"Skeleton": "No acute findings.",
"Upper abdomen": "No acute findings."
}
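Because downstream code usually depends on every field being present, it can help to validate a generated report before using it. The following is a minimal sketch using only the standard library; the field names come from the example above, while the `validate_report` helper and its error messages are illustrative assumptions, not part of the model or its tooling.

```python
import json

# Field names taken from the example report above.
REQUIRED_FIELDS = (
    "Support devices",
    "Cardiomediastinum",
    "Lungs",
    "Pleura",
    "Skeleton",
    "Upper abdomen",
)


def validate_report(raw: str) -> dict:
    """Parse a JSON report and check that each expected field is a non-empty string."""
    report = json.loads(raw)
    if not isinstance(report, dict):
        raise ValueError("report must be a JSON object")
    missing = [f for f in REQUIRED_FIELDS if f not in report]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    empty = [
        f for f in REQUIRED_FIELDS
        if not isinstance(report[f], str) or not report[f].strip()
    ]
    if empty:
        raise ValueError(f"empty or non-string fields: {empty}")
    return report
```

A report that fails validation can simply be regenerated or flagged for manual review, depending on the research workflow.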
Model Details
Model Description
These are adapters for google/medgemma-1.5-4b-it, obtained through supervised fine-tuning (SFT) with low-rank adapters (LoRA) using a custom subset of publicly available frontal chest x-rays from the romprr/CXR_BioXAi_Hackathon_2024 dataset.
- Developed, funded and shared by: Nakul Gupta
- Model type: Multi-modal Large Language Model
- Language(s) (NLP): SFT was done in English, although the base model supports additional languages.
- License: The use of MedGemma is governed by the Health AI Developer Foundations terms of use.
- Finetuned from model: google/medgemma-1.5-4b-it
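To give a sense of why LoRA adapters are small relative to the base model, the sketch below counts the trainable parameters that a rank-r adapter adds to one linear layer and compares this with the size of the full weight matrix. The layer dimensions and rank used here are hypothetical values chosen for illustration; the actual rank and target modules used for this model are not stated in this card.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two low-rank factors: A (rank x d_in) and B (d_out x rank)."""
    return rank * d_in + d_out * rank


def full_params(d_in: int, d_out: int) -> int:
    """Parameters in the frozen full weight matrix W (d_out x d_in)."""
    return d_in * d_out


# Hypothetical layer size and rank, for illustration only.
d_in, d_out, rank = 2560, 2560, 16

adapter = lora_trainable_params(d_in, d_out, rank)
frozen = full_params(d_in, d_out)
print(f"adapter params: {adapter}, full layer params: {frozen}")
print(f"trainable fraction for this layer: {adapter / frozen:.4f}")
```

The adapter factors grow linearly with the layer width, while the full matrix grows quadratically, which is what makes fine-tuning of this kind feasible on modest hardware.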
Uses
This model is SOLELY intended for research and development purposes. It is by no means ready or meant for clinical use, nor has it been validated in a clinical setting.
Out-of-Scope Use
This model has NOT been validated for clinical use or evaluated by any regulatory bodies and may experience hallucinations as well as missed findings. It is intended for research and developmental use ONLY. The model's outputs are not intended to directly inform clinical diagnosis, patient management decisions, treatment recommendations, or any other direct clinical practice applications. All model outputs require independent verification and further investigation through established scientific research and development methodologies.
Bias, Risks, and Limitations
Results and model outputs are heavily dependent upon the specific prompt/instruction as well as inference settings (temperature, top_p, min_p, etc.). The model has been optimized only for single-turn, single-image evaluation. The model may also suffer from data contamination/leakage, where it may have been exposed to evaluation data during pre-training or fine-tuning, which may lead to overestimation of its true capabilities. Therefore, the model requires validation on datasets specific to each individual's or institution's use case.
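To illustrate how the sampling settings mentioned above interact, here is a small standard-library sketch of temperature scaling followed by min_p filtering: tokens are kept only if their probability is at least min_p times the probability of the most likely token. The toy logits are made up, and the function names are illustrative; this is not the code path any particular inference library uses.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_filter(probs, min_p):
    """Zero out tokens below min_p * (top probability), then renormalize."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


# Made-up logits for three candidate tokens.
logits = [2.0, 1.0, -3.0]

sharp = min_p_filter(softmax(logits, temperature=0.7), min_p=0.1)
flat = min_p_filter(softmax(logits, temperature=3.0), min_p=0.1)
print("temperature 0.7:", [round(p, 3) for p in sharp])
print("temperature 3.0:", [round(p, 3) for p in flat])
```

With these toy values, the lowest-scoring token is filtered out at temperature 0.7 but survives at temperature 3.0, which shows why the same prompt can produce noticeably different reports under different settings.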
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
How to Get Started with the Model
Use the code below to get started with the model.
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq
base_model = "google/medgemma-1.5-4b-it"
adapter_id = "DeepRadiology/medgemma1.5-CXR"  # LoRA adapter repository for this model card
model = AutoModelForVision2Seq.from_pretrained(
    base_model,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
# Attach the LoRA adapter weights to the base model
model.load_adapter(adapter_id)
processor = AutoProcessor.from_pretrained(base_model)
image = Image.open("cxr.jpeg") # replace with your own example image
instruction = """You are an expert chest radiologist. Describe accurately what you see in this image. Use a \
structured report template with fields for: Support devices, Cardiomediastinum, Lungs, Pleura, Skeleton, and Upper \
abdomen. If there are no support devices, then report "None." for that field, if there are no pertinent \
Cardiomediastinal findings, report "Within normal limits." for that field. If there are no abnormal lung findings \
report "Lungs are clear." If there are no pertinent pleural findings, report "No pleural effusion or pneumothorax." \
For all other fields, if there are no pertinent findings, report "No acute findings." You must always generate a report\
with the required fields."""
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]
input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(
    image,
    input_text,
    add_special_tokens=False,
    return_tensors="pt"
).to(model.device)
# temperature and min_p only take effect when sampling is enabled
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7, min_p=0.1)
print(processor.decode(output[0]))
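The decoded string printed above contains the prompt and chat-template tokens as well as the report, so the JSON object usually has to be pulled out before it can be used. The following is a minimal standard-library sketch that returns the last JSON object found in a string; `extract_last_json_object` is an illustrative helper, and the sample string is a made-up stand-in for real model output.

```python
import json


def extract_last_json_object(text: str):
    """Return the last JSON object embedded in text, or None if nothing parses."""
    decoder = json.JSONDecoder()
    found = None
    pos = text.find("{")
    while pos != -1:
        try:
            obj, end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            # Not a valid object at this brace; try the next one.
            pos = text.find("{", pos + 1)
            continue
        if isinstance(obj, dict):
            found = obj
        pos = text.find("{", end)
    return found


# Made-up example of decoded output, for illustration only.
decoded = (
    "<start_of_turn>model\n"
    '{"Lungs": "Lungs are clear.", "Pleura": "No pleural effusion or pneumothorax."}'
    "<end_of_turn>"
)
report = extract_last_json_object(decoded)
print(report)
```

Taking the last object rather than the first is a design choice that guards against braces appearing earlier in the prompt; if the model emits no valid JSON, the helper returns None so the caller can decide how to handle it.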
Training Details
Training Data
romprr/CXR_BioXAi_Hackathon_2024.
Training Procedure
Preprocessing
The dataset was filtered using meta-llama/Llama-3.3-70B-Instruct to remove reports with references to prior studies (although this was not 100% successful). The remaining free-text reports were then converted into a structured report format, again using meta-llama/Llama-3.3-70B-Instruct. The final training set contained approximately 33k x-rays.
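As a rough illustration of what removing reports that reference priors means, the sketch below uses a simple keyword heuristic. This is explicitly not the LLM-based filter described above: the phrase list and function names are hypothetical, and a keyword match will produce both false positives and false negatives that an LLM pass may handle better.

```python
import re

# Hypothetical phrase list; the author used an LLM, not keyword matching.
PRIOR_PATTERN = re.compile(
    r"\b(prior|previous(ly)?|compared (to|with)|comparison|interval|"
    r"unchanged|stable|again (seen|noted)|since)\b",
    re.IGNORECASE,
)


def mentions_prior(report: str) -> bool:
    """Return True if the report text contains a phrase suggesting a prior study."""
    return PRIOR_PATTERN.search(report) is not None


def drop_reports_with_priors(reports):
    """Keep only the reports that do not appear to reference a prior study."""
    return [r for r in reports if not mentions_prior(r)]


# Made-up report snippets, for illustration only.
samples = [
    "Lungs are clear. No pleural effusion.",
    "Stable cardiomegaly compared to prior study.",
    "Interval removal of the endotracheal tube.",
]
print(drop_reports_with_priors(samples))
```

A heuristic like this could also serve as a cheap sanity check on the output of an LLM-based filter, by flagging retained reports that still contain comparison language.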
Training Hyperparameters
- Training regime: [More Information Needed]
Speeds, Sizes, Times [optional]
[More Information Needed]
Evaluation
Testing Data, Factors & Metrics
Testing Data
Evaluation was performed using the publicly available IU-Xray and MIMIC-CXR datasets, using 'test' splits and frontal x-rays only, as defined by RexRank.
Metrics
[More Information Needed]
Results
[More Information Needed]
Summary
Presenting medgemma1.5-CXR, a multi-modal open-weights vision-language model (VLM) fine-tuned for chest x-ray report generation! The primary goal of this exercise was to demonstrate the potential for general-purpose VLMs to be repurposed for medical imaging tasks on consumer-grade hardware with publicly available datasets.
Data Citations
MIMIC-CXR:
Johnson, A., Pollard, T., Mark, R., Berkowitz, S., & Horng, S. (2019). MIMIC-CXR Database (version 2.0.0). PhysioNet. RRID:SCR_007345. https://doi.org/10.13026/C2JT1Q
Johnson, A.E.W., Pollard, T.J., Berkowitz, S.J. et al. MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports. Sci Data 6, 317 (2019). https://doi.org/10.1038/s41597-019-0322-0
IU-Xray:
Demner-Fushman D, Kohli MD, Rosenman MB, Shooshan SE, Rodriguez L, Antani S, Thoma GR, McDonald CJ. Preparing a collection of radiology examinations for distribution and retrieval. J Am Med Inform Assoc. 2016 Mar;23(2):304-10. doi: 10.1093/jamia/ocv080. Epub 2015 Jul 1. PMID: 26133894; PMCID: PMC5009925.
Model Card Contact