---
base_model: google/medgemma-1.5-4b-it
library_name: peft
tags:
- base_model:adapter:google/medgemma-1.5-4b-it
- lora
- transformers
- medical
- dermatology
- multimodal
- vision-language
- change-detection
- temporal-analysis
license: other
datasets:
- dunktra/dermacheck-temporal-pairs
language:
- en
metrics:
- f1
- precision
- accuracy
- recall
pipeline_tag: image-text-to-text
---

# MedGemma Temporal Change Detection (LoRA Adapter)

This repository provides **LoRA adapters** fine-tuned on top of **google/medgemma-1.5-4b-it** for exploring **temporal change detection in dermatoscopic image pairs**.
The project investigates whether lightweight parameter-efficient fine-tuning can adapt a multimodal medical foundation model to a **novel temporal reasoning task**.



## Model Details

### Model Description

This repository contains LoRA adapters only, not a full model checkpoint.

- **Developed and shared by:** Dung Claire Tran ([@dunktra](https://huggingface.co/dunktra))
- **Base Model:** [google/medgemma-1.5-4b-it](https://huggingface.co/google/medgemma-1.5-4b-it)
- **Fine-Tuning Method:** LoRA (Low-Rank Adaptation, PEFT)
- **Model type:** Vision–Language Model (VLM) adapter
- **Task:** Binary classification of temporal change in skin lesion image pairs
- **Dataset:** [dunktra/dermacheck-temporal-pairs](https://huggingface.co/datasets/dunktra/dermacheck-temporal-pairs) (synthetic temporal pairs)
- **Language(s) (NLP):** English
- **License:** Inherits license from google/medgemma-1.5-4b-it 

### Model Sources

- **Repository:** [Kaggle notebook (training & evaluation)](https://www.kaggle.com/code/dungclairetran/dermacheck-medgemma-lora-fine-tuning)


## Uses


### Direct Use

- Research and experimentation with **temporal reasoning in medical imaging**
- Evaluation of **LoRA fine-tuning feasibility** on multimodal medical foundation models
- Educational and benchmarking purposes


### Out-of-Scope Use

- Clinical diagnosis or medical decision-making
- Deployment in real-world healthcare settings without clinical validation

This model is **not a medical device**.

## Limitations

- Fine-tuning effects may not surface when using **keyword-based label extraction**
- Binary classification may mask improvements in:
  - reasoning structure
  - explanatory language
  - uncertainty expression
- Synthetic temporal data limits real-world generalization
- Inherits all limitations of the base MedGemma model

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model before relying on its outputs.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoModelForVision2Seq, AutoProcessor
from peft import PeftModel
import torch

base_model = AutoModelForVision2Seq.from_pretrained(
    "google/medgemma-1.5-4b-it",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

model = PeftModel.from_pretrained(
    base_model,
    "dunktra/medgemma-temporal-lora"
)

processor = AutoProcessor.from_pretrained(
    "dunktra/medgemma-temporal-lora"
)
```
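Once the adapter is loaded, inference on a before/after pair can be sketched as follows. The prompt wording and the `build_messages` / `detect_change` helpers are illustrative assumptions, not the exact prompt used in training; consult the Kaggle notebook for the training-time formatting.

```python
from PIL import Image
import torch

def build_messages(before_image, after_image):
    # Hypothetical prompt: the actual training prompt may differ.
    prompt = (
        "These are two dermatoscopic images of the same skin lesion taken at "
        "different times. Has the lesion changed? Answer 'change' or 'no change'."
    )
    return [{
        "role": "user",
        "content": [
            {"type": "image", "image": before_image},
            {"type": "image", "image": after_image},
            {"type": "text", "text": prompt},
        ],
    }]

def detect_change(model, processor, before_path, after_path):
    """Run the adapted model on a before/after image pair and return its answer."""
    messages = build_messages(
        Image.open(before_path).convert("RGB"),
        Image.open(after_path).convert("RGB"),
    )
    inputs = processor.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    with torch.inference_mode():
        generated = model.generate(**inputs, max_new_tokens=128)
    # Decode only the newly generated tokens, not the prompt.
    return processor.decode(
        generated[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )

# Example usage (with the model and processor loaded as above):
# answer = detect_change(model, processor, "lesion_before.png", "lesion_after.png")
```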

## Training Details

### Training Data

- **Source:** [dunktra/dermacheck-temporal-pairs](https://huggingface.co/datasets/dunktra/dermacheck-temporal-pairs)
- **Description:** Synthetic before/after dermatoscopic image pairs labeled for temporal change
- **Splits:**
  - **Training:** ~630 pairs
  - **Validation:** ~135 pairs
  - **Test:** 135 pairs

**Note:** *The dataset consists of **synthetic temporal pairs**, not real longitudinal patient data.*

### Training Configuration

- **LoRA Rank (r):** 8
- **LoRA Alpha:** 16
- **Target Modules:** q_proj, k_proj, v_proj, o_proj
- **LoRA Dropout:** 0.05
- **Epochs:** 3
- **Effective Batch Size:** 16
- **Learning Rate:** 2e-4
- **Precision:** bfloat16
- **Frameworks:** Transformers + PEFT

## Evaluation

### Metrics

- Precision
- Recall
- F1 score (binary classification)

### Results (Test Set: 135 temporal pairs)

| Metric     | Base MedGemma | Fine-Tuned (LoRA) | Change |
|------------|---------------|-------------------|--------|
| F1 Score   | 0.8797        | 0.8797            | +0.00% |
| Precision  | 0.7852        | 0.7852            | +0.00% |
| Recall     | 1.0000        | 1.0000            | +0.00% |

LoRA fine-tuning **did not** yield measurable improvements under the current evaluation protocol.

**Note:** Although LoRA fine-tuning did not improve aggregate F1 on the held-out test set, analysis revealed that both the base and fine-tuned models collapsed to a high-recall regime, predicting “change” for all examples. This indicates that the primary performance bottleneck lies in task framing and decision extraction rather than model capacity. The experiment demonstrates stable LoRA adaptation without regression and highlights the importance of evaluation design in generative medical VLMs.

### Qualitative Analysis

- No test cases were found where the fine-tuned model corrected errors made by the base model.
- Fine-tuning did not alter binary decision outcomes given the current response-parsing heuristic.
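The keyword-based decision extraction discussed above can be sketched as follows. The keyword lists and the default fallback are illustrative assumptions; the actual heuristic in the evaluation notebook may differ.

```python
def extract_change_label(response: str) -> int:
    """Map a free-text model response to a binary label: 1 = change, 0 = no change.

    Keyword lists are illustrative, not the exact heuristic used in evaluation.
    """
    text = response.lower()
    # Check negative phrases first: "no change" contains the substring "change".
    negative = ["no change", "unchanged", "stable", "no significant change"]
    if any(phrase in text for phrase in negative):
        return 0
    positive = ["change", "progression", "growth", "evolved"]
    if any(phrase in text for phrase in positive):
        return 1
    # Default to "change", mirroring the high-recall regime observed above.
    return 1
```

A heuristic like this illustrates why fine-tuning gains in reasoning structure or uncertainty expression would not register: any response mentioning "change" collapses to the same binary label.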
  

## License

- This adapter inherits the license and usage restrictions of:
  - **google/medgemma-1.5-4b-it**
  - Underlying datasets used by the base model
- Non-commercial research use only.

## Acknowledgements

- Google MedGemma team
- PEFT / Hugging Face ecosystem

*Created for the **MedGemma Impact Challenge 2026 – Novel Task Exploration**.*

## Model Card Contact

[dunktra](https://huggingface.co/dunktra)

### Framework versions

- PEFT 0.18.1