---
base_model: google/medgemma-1.5-4b-it
library_name: peft
tags:
- base_model:adapter:google/medgemma-1.5-4b-it
- lora
- transformers
- medical
- dermatology
- multimodal
- vision-language
- change-detection
- temporal-analysis
license: other
datasets:
- dunktra/dermacheck-temporal-pairs
language:
- en
metrics:
- f1
- precision
- accuracy
- recall
pipeline_tag: image-text-to-text
---
# MedGemma Temporal Change Detection (LoRA Adapter)
This repository provides **LoRA adapters** fine-tuned on top of **google/medgemma-1.5-4b-it** for exploring **temporal change detection in dermatoscopic image pairs**.
The project investigates whether lightweight parameter-efficient fine-tuning can adapt a multimodal medical foundation model to a **novel temporal reasoning task**.
## Model Details
### Model Description
This repository contains LoRA adapters only, not a full model checkpoint.
- **Developed and shared by:** Dung Claire Tran ([@dunktra](https://huggingface.co/dunktra))
- **Base Model:** [google/medgemma-1.5-4b-it](https://huggingface.co/google/medgemma-1.5-4b-it)
- **Fine-Tuning Method:** LoRA (Low-Rank Adaptation, PEFT)
- **Model type:** Vision–Language Model (VLM) adapter
- **Task:** Binary classification of temporal change in skin lesion image pairs
- **Dataset:** [dunktra/dermacheck-temporal-pairs](https://huggingface.co/datasets/dunktra/dermacheck-temporal-pairs) (synthetic temporal pairs)
- **Language(s) (NLP):** English
- **License:** Inherits license from google/medgemma-1.5-4b-it
### Model Sources
- **Repository:** [Kaggle notebook (training & evaluation)](https://www.kaggle.com/code/dungclairetran/dermacheck-medgemma-lora-fine-tuning)
## Uses
### Direct Use
- Research and experimentation with **temporal reasoning in medical imaging**
- Evaluation of **LoRA fine-tuning feasibility** on multimodal medical foundation models
- Educational and benchmarking purposes
### Out-of-Scope Use
- Clinical diagnosis or medical decision-making
- Deployment in real-world healthcare settings without clinical validation
This model is **not a medical device**.
## Limitations
- Fine-tuning effects may not surface when using **keyword-based label extraction**
- Binary classification may mask improvements in:
  - reasoning structure
  - explanatory language
  - uncertainty expression
- Synthetic temporal data limits real-world generalization
- Inherits all limitations of the base MedGemma model
Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model before using its outputs in any workflow.
## How to Get Started with the Model
Use the code below to get started with the model.
```python
from transformers import AutoModelForVision2Seq, AutoProcessor
from peft import PeftModel
import torch

# Load the frozen base model.
base_model = AutoModelForVision2Seq.from_pretrained(
    "google/medgemma-1.5-4b-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach the LoRA adapter weights on top of the base model.
model = PeftModel.from_pretrained(
    base_model,
    "dunktra/medgemma-temporal-lora",
)

processor = AutoProcessor.from_pretrained(
    "dunktra/medgemma-temporal-lora"
)
```
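The card does not publish the inference prompt. The sketch below shows one plausible way to structure a two-image change query using the standard transformers chat-message format; the question wording and the commented generation call are assumptions, not the prompt used in training.

```python
# Hypothetical prompt construction for a before/after dermatoscopic pair.
# The question text is an assumption, not the training prompt.
def build_change_messages(question="Has the lesion changed between these two images?"):
    return [{
        "role": "user",
        "content": [
            {"type": "image"},  # earlier image
            {"type": "image"},  # later image
            {"type": "text", "text": question},
        ],
    }]

messages = build_change_messages()

# With `model` and `processor` loaded as in the quickstart above, generation
# would look roughly like this (untested sketch):
# inputs = processor.apply_chat_template(
#     messages, add_generation_prompt=True, tokenize=True,
#     return_dict=True, return_tensors="pt",
# ).to(model.device)
# with torch.inference_mode():
#     output = model.generate(**inputs, max_new_tokens=128)
# print(processor.decode(output[0], skip_special_tokens=True))
```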
## Training Details
### Training Data
- **Source:** [dunktra/dermacheck-temporal-pairs](https://huggingface.co/datasets/dunktra/dermacheck-temporal-pairs)
- **Description:** Synthetic before/after dermatoscopic image pairs labeled for temporal change
- **Splits:**
  - **Training:** ~630 pairs
  - **Validation:** ~135 pairs
  - **Test:** 135 pairs
**Note:** *The dataset consists of **synthetic temporal pairs**, not real longitudinal patient data.*
### Training Configuration
- **LoRA Rank (r):** 8
- **LoRA Alpha:** 16
- **Target Modules:** q_proj, k_proj, v_proj, o_proj
- **LoRA Dropout:** 0.05
- **Epochs:** 3
- **Effective Batch Size:** 16
- **Learning Rate:** 2e-4
- **Precision:** bfloat16
- **Frameworks:** Transformers + PEFT
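The hyperparameters above correspond to a PEFT `LoraConfig` along these lines. This is a sketch: `bias` and `task_type` are not stated in the card and are assumptions.

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,                     # LoRA rank, as listed above
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",             # assumption: PEFT default, not stated in the card
    task_type="CAUSAL_LM",   # assumption for a generative VLM
)
```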
## Evaluation
### Metrics
- Precision
- Recall
- F1 score (binary classification)
### Results (Test Set: 135 temporal pairs)
| Metric | Base MedGemma | Fine-Tuned (LoRA) | Change |
|------------|---------------|-------------------|--------|
| F1 Score | 0.8797 | 0.8797 | +0.00% |
| Precision | 0.7852 | 0.7852 | +0.00% |
| Recall | 1.0000 | 1.0000 | +0.00% |
LoRA fine-tuning **did not** yield measurable improvements under the current evaluation protocol.
**Note:** Although LoRA fine-tuning did not improve aggregate F1 on the held-out test set, analysis revealed that both the base and fine-tuned models collapsed to a high-recall regime, predicting “change” for all examples. This indicates that the primary performance bottleneck lies in task framing and decision extraction rather than model capacity. The experiment demonstrates stable LoRA adaptation without regression and highlights the importance of evaluation design in generative medical VLMs.
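The reported scores are consistent with the collapse described above. As a worked check: if a model predicts "change" for all 135 test pairs and roughly 106 of them are truly changed (a count inferred from the metrics, not published with the card), the resulting precision, recall, and F1 reproduce the table.

```python
# All 135 pairs predicted "change"; ~106 assumed truly changed.
true_positives = 106   # changed pairs, correctly predicted "change"
false_positives = 29   # unchanged pairs, also predicted "change"
false_negatives = 0    # "no change" is never predicted

precision = true_positives / (true_positives + false_positives)
recall = true_positives / (true_positives + false_negatives)
f1 = 2 * precision * recall / (precision + recall)

print(round(precision, 4), round(recall, 4), round(f1, 4))  # 0.7852 1.0 0.8797
```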
### Qualitative Analysis
- No test cases were found where the fine-tuned model corrected errors made by the base model.
- Fine-tuning did not alter binary decision outcomes given the current response-parsing heuristic.
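The response-parsing heuristic itself is not published with the card. A minimal sketch of what such a keyword-based decision-extraction step might look like follows; the specific keywords and the conservative default are assumptions for illustration only.

```python
def extract_decision(response: str) -> str:
    """Map a free-text model response to a binary label.

    Hypothetical keyword heuristic: negated-change phrases are checked
    first so that e.g. "unchanged" is not matched as "change".
    """
    text = response.lower()
    negations = ("no change", "unchanged", "no significant change", "stable")
    if any(phrase in text for phrase in negations):
        return "no_change"
    if "change" in text or "evolv" in text or "progress" in text:
        return "change"
    return "no_change"  # conservative default when nothing matches

print(extract_decision("The lesion appears stable."))                # no_change
print(extract_decision("There is visible change in pigmentation."))  # change
```

Because a heuristic like this discards everything except the final keyword match, it can hide fine-tuning effects on reasoning structure or uncertainty expression, as noted in the Limitations section.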
## License
- This adapter inherits the license and usage restrictions of:
  - **google/medgemma-1.5-4b-it**
  - underlying datasets used by the base model
- Non-commercial research use only.
## Acknowledgements
- Google MedGemma team
- PEFT / Hugging Face ecosystem
*Created for the **MedGemma Impact Challenge 2026 – Novel Task Exploration**.*
## Model Card Contact
[dunktra](https://huggingface.co/dunktra)
### Framework versions
- PEFT 0.18.1