MedGemma Mobile (Kaggle Edition) 🔥📱
Optimized Medical VLM for Mobile Devices
This repository contains the TensorFlow Lite (TFLite) quantized models for running MedGemma on mobile devices (Android/iOS). These models are optimized for efficiency while retaining medical reasoning capabilities.
🔗 GitHub Repository: medgemma-mobile (demo and source code)
📦 Models Included
| Component | Filename | Description | Size |
|---|---|---|---|
| Text Model | `models/text/medgemma_4b_mobile_int8_q8_ekv2048.tflite` | 4B-parameter LLM, INT8/INT4 quantization, 2048-token context window | ~3.9 GB |
| Vision | `models/vision/medsiglip_vision_448.tflite` | SigLIP vision encoder (448×448 input) | ~1.6 GB |
| Projector | `models/projector/multimodal_projector_448.tflite` | Linear projection layer | ~11 MB |
| Tokenizer | `tokenizers/medgemma_4b/` | Tokenizer files (SentencePiece) | – |
🚀 Quick Start (Python)
To run these models, clone the GitHub repository and use the provided demo script.
```bash
# Clone the demo code
git clone https://github.com/itikelabhaskar/medgemma_kaggle
cd medgemma_kaggle

# Install dependencies
pip install -r requirements.txt

# Run the demo (download the models from this repo first!)
python demo/run_demo.py --image demo/images/image1.jpg
```
📱 Android Integration
These models are compatible with ai-edge-litert (TensorFlow Lite Runtime) for Android.
- Vision Model: input `[1, 3, 448, 448]`, output `[1, 1024, 1152]`
- Projector: input `[1, 1024, 1152]`, output `[1, 1024, 2560]`
- Text Model: input `[1, seq, 2560]` (embeddings) or token IDs
See the GitHub repo for implementation details.
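The tensor shapes above describe how the three stages chain together: the vision encoder turns a 448×448 image into 1024 patch embeddings of width 1152, and the projector maps those to the text model's 2560-dimensional embedding space. A minimal NumPy sketch of that contract (placeholder zero tensors and weights, not the real models) is:

```python
import numpy as np

# Illustrative sketch of the tensor contract between the three TFLite stages.
# Shapes come from this model card; all values below are placeholders,
# NOT the real model parameters.

# Vision encoder: image [1, 3, 448, 448] -> patch embeddings [1, 1024, 1152]
image = np.zeros((1, 3, 448, 448), dtype=np.float32)
vision_out = np.zeros((1, 1024, 1152), dtype=np.float32)  # stand-in encoder output

# Projector: linear map 1152 -> 2560 so vision tokens match the LLM embedding width
W = np.zeros((1152, 2560), dtype=np.float32)  # placeholder projection weights
projected = vision_out @ W                    # [1, 1024, 2560]

# The text model then consumes these embeddings as [1, seq, 2560],
# interleaved with embedded token IDs from the SentencePiece tokenizer.
assert projected.shape == (1, 1024, 2560)
```

In the real pipeline each stage would be a separate `tf.lite.Interpreter` (or LiteRT interpreter on Android) whose output tensor is fed as the next stage's input.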
⚠️ Requirements
- RAM: at least 8 GB (unified) memory recommended for the 4B text model.
- OS: Linux/WSL recommended; the Windows TFLite runtime has file-mapping limits for files larger than 2 GB.
Credits
- Based on Google Gemma 2 and SigLIP.
- Fine-tuned for medical VQA tasks.