Instructions to use convaiinnovations/medgemma-4b-ecginstruct with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use convaiinnovations/medgemma-4b-ecginstruct with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="convaiinnovations/medgemma-4b-ecginstruct")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("convaiinnovations/medgemma-4b-ecginstruct")
model = AutoModelForImageTextToText.from_pretrained("convaiinnovations/medgemma-4b-ecginstruct")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Notebooks
Google Colab
Kaggle
Local Apps

vLLM

How to use convaiinnovations/medgemma-4b-ecginstruct with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "convaiinnovations/medgemma-4b-ecginstruct"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "convaiinnovations/medgemma-4b-ecginstruct",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker

docker model run hf.co/convaiinnovations/medgemma-4b-ecginstruct

SGLang

How to use convaiinnovations/medgemma-4b-ecginstruct with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "convaiinnovations/medgemma-4b-ecginstruct" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "convaiinnovations/medgemma-4b-ecginstruct",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "convaiinnovations/medgemma-4b-ecginstruct" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "convaiinnovations/medgemma-4b-ecginstruct",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Docker Model Runner
How to use convaiinnovations/medgemma-4b-ecginstruct with Docker Model Runner:
```
docker model run hf.co/convaiinnovations/medgemma-4b-ecginstruct
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

MedGemma-4B ECGInstruct

Fine-tuned version of Google's MedGemma-4B-it model on the ECGInstruct dataset for automated ECG interpretation.

Model Description

This is a fully merged fine-tuned version of google/medgemma-4b-it trained on the PULSE-ECG/ECGInstruct dataset containing 1.15M ECG instruction-following examples. The LoRA adapter has been merged into the base model for easier deployment.

Developed by: ConvAI Innovations
Base Model: google/medgemma-4b-it
Training Infrastructure: AIRAWAT (C-DAC) - 8x NVIDIA A100 40GB GPUs
Training Duration: 72 hours (3 days)
Final Token Accuracy: 86.83%
Final Training Loss: 0.6188
GPU-Hours: 576
Model Size: ~8.5 GB

Training Details

Training Data

Dataset: PULSE-ECG/ECGInstruct (1.15M samples)
Samples: 1,156,110 ECG image-text pairs
Image Sources: MIMIC-IV-ECG (~800K), PTB-XL (22K), CODE-15% (346K), ChapmanShaoxing
Task: Vision-language instruction following for ECG interpretation
Demographics: Age range 0-95 years, 52% male / 48% female
Disease Classes: 5 superclasses (NORM, MI, STTC, CD, HYP), 24 subclasses

Training Procedure

Hardware:

8x NVIDIA A100 40GB GPUs (AIRAWAT supercomputer)
Distributed training with PyTorch DDP

Training Configuration:

Fine-tuning method: LoRA (r=32, alpha=64, dropout=0.05)
Target modules: all-linear (including vision encoder)
Learning rate: 1.2e-5 with cosine decay
Batch size: 192 effective (4 per GPU × 8 GPUs × 6 gradient accumulation)
Optimizer: AdamW (fused)
Precision: bfloat16
Gradient checkpointing: Enabled
Max sequence length: 2048 tokens
Max new tokens: 512

Training Metrics:

Final training loss: 0.6188
Mean token accuracy: 86.83%
Training throughput: ~9.67 samples/sec
Total tokens processed: 103M+

Usage

Installation

pip install transformers pillow torch

Loading the Model

from transformers import AutoModelForImageTextToText, AutoProcessor
from PIL import Image

# Load model and processor
model_id = "convaiinnovations/medgemma-4b-ecginstruct"
model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

Inference Example

# Load ECG image
image = Image.open("ecg_image.png").convert("RGB")

# Prepare prompt using chat template
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Analyze this ECG and provide a detailed interpretation."}
        ]
    }
]

# Process input
text = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
inputs = processor(text=[text], images=[[image]], return_tensors="pt", padding=True)
inputs = {k: v.to(model.device) for k, v in inputs.items()}

# Generate interpretation
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=False
)

# Decode and print
response = processor.decode(outputs[0], skip_special_tokens=True)
print(response)

Example Prompts

# Detailed interpretation
"Analyze this ECG and provide a detailed interpretation."

# Specific abnormality detection
"What abnormalities are present in this ECG?"

# Diagnosis suggestion
"Based on this ECG, what is the most likely diagnosis?"

# Question answering
"Does this ECG show signs of atrial fibrillation?"

# Rate and rhythm
"What is the heart rate and rhythm in this ECG?"

Model Capabilities

This model can:

✅ Interpret 12-lead ECG images
✅ Identify cardiac abnormalities (arrhythmias, ischemia, hypertrophy, conduction blocks, etc.)
✅ Generate detailed clinical reports
✅ Answer specific questions about ECG findings
✅ Provide diagnostic suggestions
✅ Assess heart rate, rhythm, and axis
✅ Detect ST-segment changes and T-wave abnormalities

Performance

Training Metrics:

Metric	Value
Token Accuracy	86.83%
Final Loss	0.6188
Training Time	72 hours
GPU-Hours	576

Inference Metrics (A100 GPU):

Metric	Value
TTFT (Time to First Token)	~150ms
ISL (Input Sequence Length)	2048 tokens
OSL (Output Sequence Length)	512 tokens
End-to-End Latency	2-3 seconds
Throughput	~45 tokens/sec

Limitations

Trained primarily on adult ECG data
Performance may vary on pediatric ECGs
Should not replace professional medical diagnosis
Requires high-quality ECG images for optimal results
May struggle with very rare or unusual ECG patterns
Limited to English language outputs

Ethical Considerations

⚠️ MEDICAL DISCLAIMER

This model is for RESEARCH AND EDUCATIONAL PURPOSES ONLY.

❌ NOT validated for clinical use

❌ NOT FDA/CE approved

❌ NOT a substitute for professional medical diagnosis

❌ Should NOT be used for patient care decisions

Always consult qualified healthcare professionals for medical decisions.

Important Notes:

This is an AI model and can make mistakes
ECG interpretation requires clinical context
Model outputs should be verified by trained clinicians
Not approved for clinical use or diagnostic purposes
Use responsibly and within appropriate medical oversight
Has not been tested on external clinical datasets

Intended Use

Appropriate Uses:

Research in medical AI and computer vision
Educational demonstrations of ECG interpretation
Development of clinical decision support prototypes
Benchmarking ECG analysis algorithms

Inappropriate Uses:

Direct patient diagnosis without physician review
Replacement of trained medical professionals
Use in emergency or critical care settings without oversight
Commercial deployment without proper validation

Citation

If you use this model in your research, please cite:

@misc{medgemma-ecginstruct,
  author = {convaiinnovations},
  title = {MedGemma-4B ECGInstruct: Fine-tuned ECG Interpretation Model},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/convaiinnovations/medgemma-4b-ecginstruct}}
}

Acknowledgments

Base Model: Google's MedGemma team for the foundation model
Dataset: PULSE-ECG team for the ECGInstruct dataset
Infrastructure: AIRAWAT AI Innovation Challenge by C-DAC (Centre for Development of Advanced Computing)
Frameworks: HuggingFace Transformers, PEFT, TRL, PyTorch

Related Resources

LoRA Adapter Version: convaiinnovations/medgemma-4b-ecginstruct-lora
Base Model: google/medgemma-4b-it
Training Dataset: PULSE-ECG/ECGInstruct

License

Apache 2.0 (following base model license)

Contact

For questions or issues, please open an issue on the model repository or contact the maintainers.

Downloads last month: 21

Safetensors

Model size

4B params

Tensor type

BF16

Model tree for convaiinnovations/medgemma-4b-ecginstruct

Base model

google/gemma-3-4b-pt

Finetuned

google/medgemma-4b-pt

Finetuned

google/medgemma-4b-it

Finetuned

(600)

this model

Quantizations

1 model

convaiinnovations
/

medgemma-4b-ecginstruct

MedGemma-4B ECGInstruct

Model Description

Training Details

Training Data

Training Procedure

Usage

Installation

Loading the Model

Inference Example

Example Prompts

Model Capabilities

Performance

Limitations

Ethical Considerations

Intended Use

Citation

Acknowledgments

Related Resources

License

Contact

Model tree for convaiinnovations/medgemma-4b-ecginstruct

Dataset used to train convaiinnovations/medgemma-4b-ecginstruct