---
license: apache-2.0
base_model: google/gemma-2-2b-it
tags:
- medical
- clinical-notes
- patient-communication
- lora
- peft
- medgemma
language:
- en
library_name: peft
---
# gemma-2b-distilled

Gemma-2B fine-tuned via knowledge distillation from a Gemma-9B-DPO teacher for clinical note simplification.
## Model Details

- Base model: google/gemma-2-2b-it
- Training method: SFT distillation from a 9B-DPO teacher (600 samples)
- Task: clinical note simplification for patient communication
- License: Apache 2.0
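The card does not publish the adapter's training hyperparameters. For orientation, a minimal sketch of a PEFT LoRA configuration for this kind of SFT distillation run is shown below; every value (rank, alpha, target modules, dropout) is an illustrative assumption, not the settings actually used for this adapter.

```python
from peft import LoraConfig

# Illustrative LoRA settings only -- the actual hyperparameters
# used to train this adapter are not published in this card.
lora_config = LoraConfig(
    r=16,                                             # assumed adapter rank
    lora_alpha=32,                                    # assumed scaling factor
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed targets
    lora_dropout=0.05,                                # assumed dropout
    task_type="CAUSAL_LM",
)
```

A config like this would typically be passed to `get_peft_model` together with the base model before supervised fine-tuning on the teacher's outputs.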
## Performance

| Metric | Score |
|---|---|
| Overall | 70% |
| Accuracy | 73% |
| Patient-centered | 76% |

Evaluated by MedGemma-27B (`google/medgemma-27b-text-it`) on 50 held-out clinical notes.
## Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, "dejori/note-explain-gemma-2b-distilled")

# Generate a simplified version of a clinical note
prompt = "Simplify this clinical note for a patient:\n\n[your clinical note]\n\nSimplified version:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
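The prompt template above can be factored into a small helper so the exact same format is applied to every note (the `build_prompt` name and the example note are ours, not part of the released code):

```python
def build_prompt(note: str) -> str:
    """Wrap a clinical note in the prompt template shown in the usage example."""
    return (
        "Simplify this clinical note for a patient:\n\n"
        f"{note}\n\n"
        "Simplified version:"
    )

# Hypothetical note for illustration; pass the result to tokenizer(...)
prompt = build_prompt("Pt presents w/ acute dyspnea; SpO2 91% on RA.")
print(prompt)
```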
## Training Data

Trained on the `dejori/note-explain-clinical` dataset.
## Citation

```bibtex
@misc{noteexplain2026,
  title={NoteExplain: Privacy-First Clinical Note Simplification},
  author={Dejori, Mathaeus},
  year={2026},
  publisher={HuggingFace}
}
```
## Related

- Dataset: `dejori/note-explain-clinical`
- Project: MedGemma Impact Challenge submission