# MedGemma 4B GGUF - Quantized for African Healthcare
Quantized versions of `google/medgemma-1.5-4b-it`, optimized for on-device medical AI in resource-constrained settings.
## Available Models
| File | Quantization | Size | RAM | Use Case |
|---|---|---|---|---|
| `medgemma-4b-iq2_xs.gguf` | IQ2_XS + medical imatrix | ~0.9 GB | ~2 GB | Budget phones (<$80) |
| `medgemma-4b-q2_k.gguf` | Q2_K (2-bit) | ~1.4 GB | ~2.5 GB | Standard budget phones |
| `medgemma-4b-q4_k_m.gguf` | Q4_K_M (4-bit) | ~2.4 GB | ~4 GB | Mid-range phones |
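Assuming the files live in the `wredd/medgemma-4b-gguf` repository (as this page indicates), a single quantization can be fetched with the Hugging Face CLI rather than cloning the whole repo:

```bash
# Download only the mid-range (Q4_K_M) file into the current directory
huggingface-cli download wredd/medgemma-4b-gguf \
  medgemma-4b-q4_k_m.gguf --local-dir .
```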
## Medical Importance Matrix
The IQ2_XS model was quantized using a custom importance matrix (imatrix) calibrated on:
- African primary care scenarios (malaria, typhoid, cholera, respiratory infections)
- Maternal and child health (pregnancy complications, childhood diarrhea, nutrition)
- Emergency triage (snake bites, severe dehydration, trauma)
- Multi-language symptoms (Twi, Hausa, Yoruba, English)
This preserves medical diagnostic accuracy while aggressively compressing general knowledge.
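The calibration-and-quantization pipeline described above can be sketched with llama.cpp's own tools. The file names (`calibration-medical.txt`, the f16 source GGUF) are illustrative, not the actual artifacts used for this release:

```bash
# 1. Build an importance matrix from a medical calibration corpus
./llama-imatrix -m medgemma-4b-f16.gguf \
  -f calibration-medical.txt -o imatrix-medical.dat

# 2. Quantize to IQ2_XS, weighting important tensors via the imatrix
./llama-quantize --imatrix imatrix-medical.dat \
  medgemma-4b-f16.gguf medgemma-4b-iq2_xs.gguf IQ2_XS
```

The imatrix biases the quantizer toward preserving weights that matter most on the calibration distribution, which is why domain-specific calibration data helps at aggressive 2-bit levels.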
## Usage with llama.cpp
```bash
./llama-cli -m medgemma-4b-iq2_xs.gguf -p "Patient has fever, chills, and headache for 3 days. What could this be?"
```
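For app backends (e.g. a clinic-facing mobile or web frontend), the same file can be served over an OpenAI-compatible HTTP API with llama.cpp's `llama-server`; the port here is arbitrary:

```bash
# Serve the model locally; POST chat requests to http://localhost:8080
./llama-server -m medgemma-4b-q4_k_m.gguf --port 8080
```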
## License
This model is subject to the Gemma Terms of Use.
## Part of the Nku Project
Built for the Google MedGemma Impact Challenge - bringing AI-powered healthcare to underserved African communities.