Update README.md

README.md (CHANGED)

@@ -19,26 +19,32 @@ pipeline_tag: image-text-to-text

# MedGemma-4B ECGInstruct

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/19VGxD03skunSLLRe7gIMs_zHMj9_TolQ?usp=sharing)

A fine-tuned version of Google's MedGemma-4B-it model, trained on the ECGInstruct dataset for automated ECG interpretation.

## Model Description

This is a fully merged fine-tuned version of [google/medgemma-4b-it](https://huggingface.co/google/medgemma-4b-it) trained on the [PULSE-ECG/ECGInstruct](https://huggingface.co/datasets/PULSE-ECG/ECGInstruct) dataset of 1.15M ECG instruction-following examples. The LoRA adapter has been merged into the base model for easier deployment.

**Developed by:** ConvAI Innovations
**Base Model:** google/medgemma-4b-it
**Training Infrastructure:** AIRAWAT (C-DAC), 8x NVIDIA A100 40GB GPUs
**Training Duration:** 72 hours (3 days)
**Final Token Accuracy:** 86.83%
**Final Training Loss:** 0.6188
**GPU-Hours:** 576
**Model Size:** ~8.5 GB

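The compute figures above are mutually consistent; as a quick arithmetic check (assuming the stated 8 GPUs ran for the full 72-hour wall clock):

```python
# GPU-hours = number of GPUs x wall-clock hours of the run.
num_gpus = 8           # 8x NVIDIA A100 40GB
wall_clock_hours = 72  # 72-hour (3-day) training run
gpu_hours = num_gpus * wall_clock_hours
print(gpu_hours)  # 576, matching the GPU-Hours figure above
```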
## Training Details

### Training Data

- **Dataset:** PULSE-ECG/ECGInstruct (1.15M samples)
- **Samples:** 1,156,110 ECG image-text pairs
- **Image Sources:** MIMIC-IV-ECG (~800K), PTB-XL (22K), CODE-15% (346K), Chapman-Shaoxing
- **Task:** Vision-language instruction following for ECG interpretation
- **Demographics:** Age range 0-95 years; 52% male / 48% female
- **Disease Classes:** 5 superclasses (NORM, MI, STTC, CD, HYP), 24 subclasses

### Training Procedure

@@ -48,17 +54,20 @@ This is a fully merged fine-tuned version of [google/medgemma-4b-it](https://hug

**Training Configuration:**
- Fine-tuning method: LoRA (r=32, alpha=64, dropout=0.05)
- Target modules: all-linear (including the vision encoder)
- Learning rate: 1.2e-5 with cosine decay
- Batch size: 192 effective (4 per GPU × 8 GPUs × 6 gradient-accumulation steps)
- Optimizer: AdamW (fused)
- Precision: bfloat16
- Gradient checkpointing: enabled
- Max sequence length: 2048 tokens
- Max new tokens: 512

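The effective batch size in the configuration above decomposes exactly as stated; a short sketch (the per-epoch step count is derived from the 1,156,110-sample figure above, not reported in the card):

```python
# Effective (global) batch size = micro-batch per GPU x #GPUs x gradient-accumulation steps.
per_gpu_batch = 4
num_gpus = 8
grad_accum_steps = 6
effective_batch = per_gpu_batch * num_gpus * grad_accum_steps
print(effective_batch)  # 192

# Implied optimizer steps per pass over the dataset (derived, not reported).
dataset_size = 1_156_110
steps_per_epoch = dataset_size // effective_batch
print(steps_per_epoch)  # ~6,021 steps per epoch
```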
**Training Metrics:**
- Final training loss: 0.6188
- Mean token accuracy: 86.83%
- Training throughput: ~9.67 samples/sec
- Total tokens processed: 103M+

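Combined with the 72-hour duration above, the reported throughput implies roughly two passes over the dataset. This is a back-of-envelope estimate that assumes the ~9.67 samples/sec mean rate was sustained for the whole run:

```python
# Rough epoch count implied by mean throughput x wall-clock time.
throughput = 9.67          # samples/sec (reported mean)
duration_s = 72 * 3600     # 72-hour run in seconds
dataset_size = 1_156_110   # ECGInstruct samples
samples_seen = throughput * duration_s
epochs = samples_seen / dataset_size
print(round(epochs, 2))  # roughly 2.2 epochs
```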
## Usage

@@ -150,9 +159,22 @@ This model can:

## Performance

**Training Metrics:**

| Metric | Value |
|--------|-------|
| Token Accuracy | 86.83% |
| Final Loss | 0.6188 |
| Training Time | 72 hours |
| GPU-Hours | 576 |

**Inference Metrics (A100 GPU):**

| Metric | Value |
|--------|-------|
| TTFT (Time to First Token) | ~150 ms |
| ISL (Input Sequence Length) | 2048 tokens |
| OSL (Output Sequence Length) | 512 tokens |
| End-to-End Latency | 2-3 seconds |
| Throughput | ~45 tokens/sec |

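The latency figures in the table can be related by a simple model: end-to-end latency ≈ TTFT + output tokens ÷ decode throughput. Under that model, the reported 2-3 s range corresponds to typical outputs of roughly 80-130 tokens; generating the full 512-token maximum would take closer to 11-12 s. A back-of-envelope sketch, not a measured breakdown:

```python
TTFT_S = 0.150     # ~150 ms time to first token (from the table)
DECODE_TPS = 45.0  # ~45 tokens/sec sustained decode (from the table)

def estimated_latency_s(output_tokens: int) -> float:
    """End-to-end latency estimate: TTFT plus steady-state decoding time."""
    return TTFT_S + output_tokens / DECODE_TPS

print(round(estimated_latency_s(100), 2))  # ~2.37 s for a typical ~100-token report
print(round(estimated_latency_s(512), 1))  # ~11.5 s at the 512-token maximum
```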
## Limitations

@@ -165,7 +187,16 @@ This model can:

## Ethical Considerations

> ⚠️ **MEDICAL DISCLAIMER**
>
> **This model is for RESEARCH AND EDUCATIONAL PURPOSES ONLY.**
>
> - ❌ NOT validated for clinical use
> - ❌ NOT FDA/CE approved
> - ❌ NOT a substitute for professional medical diagnosis
> - ❌ Should NOT be used for patient care decisions
>
> **Always consult qualified healthcare professionals for medical decisions.**

**Important Notes:**
- This is an AI model and can make mistakes

@@ -173,6 +204,7 @@ This model can:

- Model outputs should be verified by trained clinicians
- Not approved for clinical use or diagnostic purposes
- Use responsibly and within appropriate medical oversight
- Has not been tested on external clinical datasets

## Intended Use