---
license: apache-2.0
base_model: google/gemma-2-2b-it
tags:
- medical
- clinical-notes
- patient-communication
- lora
- peft
- medgemma
language:
- en
library_name: peft
---
# gemma-2b-distilled

Gemma-2B fine-tuned via knowledge distillation from a Gemma-9B-DPO teacher for clinical note simplification.
## Model Details

- Base model: google/gemma-2-2b-it
- Training method: SFT distillation from a 9B-DPO teacher (600 samples)
- Task: clinical note simplification for patient communication
- License: Apache 2.0
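The card does not publish the adapter's training hyperparameters. For orientation, a minimal sketch of a PEFT LoRA configuration for this kind of SFT distillation run is shown below; every value (rank, alpha, target modules, dropout) is an illustrative assumption, not the settings actually used for this adapter.

```python
from peft import LoraConfig

# Illustrative LoRA settings only -- the actual hyperparameters
# used to train this adapter are not published in this card.
lora_config = LoraConfig(
    r=16,                                             # assumed adapter rank
    lora_alpha=32,                                    # assumed scaling factor
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed targets
    lora_dropout=0.05,                                # assumed dropout
    task_type="CAUSAL_LM",
)
```

A config like this would typically be passed to `get_peft_model` together with the base model before supervised fine-tuning on the teacher's outputs.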
## Performance

| Metric | Score |
|---|---|
| Overall | 70% |
| Accuracy | 73% |
| Patient-centered | 76% |

Evaluated by MedGemma-27B (`google/medgemma-27b-text-it`) on 50 held-out clinical notes.
## Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, "dejori/note-explain-gemma-2b-distilled")

# Generate a simplified version of a clinical note
prompt = "Simplify this clinical note for a patient:\n\n[your clinical note]\n\nSimplified version:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
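The prompt template above can be factored into a small helper so the exact same format is applied to every note (the `build_prompt` name and the example note are ours, not part of the released code):

```python
def build_prompt(note: str) -> str:
    """Wrap a clinical note in the prompt template shown in the usage example."""
    return (
        "Simplify this clinical note for a patient:\n\n"
        f"{note}\n\n"
        "Simplified version:"
    )

# Hypothetical note for illustration; pass the result to tokenizer(...)
prompt = build_prompt("Pt presents w/ acute dyspnea; SpO2 91% on RA.")
print(prompt)
```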
## Training Data

Trained on the `dejori/note-explain-clinical` dataset.
## Citation

```bibtex
@misc{noteexplain2026,
  title={NoteExplain: Privacy-First Clinical Note Simplification},
  author={Dejori, Mathaeus},
  year={2026},
  publisher={HuggingFace}
}
```
## Related

- Dataset: `dejori/note-explain-clinical`
- Project: MedGemma Impact Challenge submission