---
base_model: google/medgemma-4b-it
library_name: peft
pipeline_tag: text-generation
license: mit
language:
- en
tags:
- lora
- transformers
- medical
- clinical-documentation
- soap-notes
- medgemma
- hai-def
- medgemma-impact-challenge
---
# MedScribe SOAP LoRA: Concise Clinical Note Generation

LoRA adapter for google/medgemma-4b-it that generates concise, clinician-ready SOAP notes from medical encounter transcripts.

Built for the Google MedGemma Impact Challenge 2026.

## What This Model Does

Converts medical encounter transcripts into structured SOAP (Subjective, Objective, Assessment, Plan) notes written in the concise shorthand that clinicians actually use, rather than the verbose textbook prose that base models default to.

**Example:**

| | |
|---|---|
| Input transcript | "54-year-old female presenting with shortness of breath. CT chest shows filling defects in segmental branches of right lower lobe..." |
| Base MedGemma | ~200 words, textbook prose, over-specified plan with 6-8 items |
| This adapter | ~104 words, clinical shorthand ("54 yo F c/o SOB"), focused 2-4 item plan |

## Key Metrics

| Metric | Base MedGemma | With This Adapter |
|---|---|---|
| Avg word count | ~200+ | 104 |
| Section completeness (S/O/A/P) | 85-95% | 100% |
| Hallucinated findings | 5-10% | 0% |
| WNL shortcuts | Present | 0% |
| Clinical style | Textbook verbose | Shorthand |
| PLAN items | 4-8 | 2-4 (focused) |
| Quality score | N/A | 90/100 |
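The section-completeness figure above can be reproduced with a simple header check. This is an illustrative sketch, not the card's actual evaluation script (which is not published here); the function name and regex are assumptions:

```python
import re

SECTIONS = ("SUBJECTIVE", "OBJECTIVE", "ASSESSMENT", "PLAN")

def section_completeness(note: str) -> float:
    """Fraction of the four SOAP section headers present at line starts."""
    present = sum(
        1 for s in SECTIONS
        if re.search(rf"^\s*{s}\b", note, re.MULTILINE | re.IGNORECASE)
    )
    return present / len(SECTIONS)

note = """SUBJECTIVE: 54 yo F c/o SOB.
OBJECTIVE: CT chest: filling defects, segmental branches RLL.
ASSESSMENT: Acute PE.
PLAN: Start anticoagulation; monitor O2 sat."""
print(section_completeness(note))  # 1.0
```

Anchoring on line starts avoids counting a section name that merely appears mid-sentence.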

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load base model (4-bit quantized)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "google/medgemma-4b-it",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("google/medgemma-4b-it")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "Tushar-9802/medscribe-soap-lora")
model.eval()

# Generate SOAP note
prompt = """You are a clinical documentation assistant. Convert the following medical
text into a structured SOAP note.

MEDICAL TEXT:
{your_transcript_here}

Generate a SOAP note with these sections:
- SUBJECTIVE: Patient-reported symptoms and history
- OBJECTIVE: Physical exam findings and vital signs
- ASSESSMENT: Clinical impressions and diagnoses
- PLAN: Diagnostic tests, treatments, and follow-up

Write a complete PLAN (treatments, monitoring, follow-up). End with a full sentence.
SOAP NOTE:"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.inference_mode():
    outputs = model.generate(
        **inputs,
        max_new_tokens=400,
        min_new_tokens=150,
        do_sample=False,
        use_cache=True,
    )

# Decode only the newly generated tokens, skipping the prompt
result = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(result)
```
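Note that `{your_transcript_here}` in the prompt above is a literal placeholder; substitute a real transcript before tokenizing. A minimal sketch using `str.format` on an abbreviated version of the template (the shortened `template` string here is illustrative, not the full prompt):

```python
# Abbreviated template; in practice use the full prompt shown above.
template = (
    "You are a clinical documentation assistant.\n\n"
    "MEDICAL TEXT:\n{your_transcript_here}\n\n"
    "SOAP NOTE:"
)
transcript = "54-year-old female presenting with shortness of breath."
prompt = template.format(your_transcript_here=transcript)
print(transcript in prompt)  # True
```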

## Training Details

### Training Data

712 curated transcript-SOAP pairs generated via GPT-4o Mini API ($1.28 total). Dataset: Tushar-9802/medscribe-soap-712

Each sample enforces:

  • "Not documented in source" for any finding absent from the input transcript
  • Zero WNL (Within Normal Limits) shortcuts β€” every finding explicitly stated
  • Concise clinical shorthand style
  • PLAN with specific, actionable items
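Rules like these can be enforced mechanically during dataset curation. A sketch of a hypothetical validator (function and pattern names are assumptions, not the card's actual curation code):

```python
import re

# Flags both the abbreviation and the spelled-out phrase.
WNL_RE = re.compile(r"\b(?:WNL|within normal limits)\b", re.IGNORECASE)

def sample_violations(note: str) -> list[str]:
    """Return rule violations for a candidate training-sample SOAP note."""
    issues = []
    if WNL_RE.search(note):
        issues.append("WNL shortcut")
    for section in ("SUBJECTIVE:", "OBJECTIVE:", "ASSESSMENT:", "PLAN:"):
        if section not in note.upper():
            issues.append(f"missing {section.rstrip(':')}")
    return issues

bad = "SUBJECTIVE: c/o SOB.\nOBJECTIVE: Lungs WNL.\nASSESSMENT: PE.\nPLAN: CTA."
print(sample_violations(bad))  # ['WNL shortcut']
```

Samples with any violation would be regenerated or dropped before training.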

### Training Configuration

| Parameter | Value |
|---|---|
| Base model | google/medgemma-4b-it |
| Method | LoRA |
| Rank | 16 |
| Alpha | 32 |
| Dropout | 0.1 |
| Target modules | All attention layers |
| Trainable parameters | ~4.2M (0.1% of 4B base) |
| Batch size | 2 (× 8 gradient accumulation = effective 16) |
| Learning rate | 2e-5 |
| Epochs | 5 (early stopping patience: 2) |
| Precision | BFloat16 |
| Quantization | 4-bit NF4 during training |
| Hardware | NVIDIA RTX 5070 Ti (16GB VRAM) |
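The trainable-parameter count follows from the LoRA construction: each adapted weight matrix gains two low-rank factors, A (rank × d_in) and B (d_out × rank), so it contributes rank × (d_in + d_out) extra parameters. A sketch with a deliberately toy shape (the real MedGemma-4B projection dimensions differ and yield the ~4.2M total above):

```python
def lora_param_count(rank: int, shapes: list[tuple[int, int]]) -> int:
    """Extra parameters from LoRA factors A (rank x d_in) and B (d_out x rank)."""
    return sum(rank * (d_in + d_out) for d_in, d_out in shapes)

# Toy example: a single 1024x1024 attention projection at rank 16.
print(lora_param_count(16, [(1024, 1024)]))  # 32768
```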

### Training Results

| Metric | Value |
|---|---|
| Training loss | 0.828 |
| Validation loss | 0.782 |
| Overfitting | None (val < train) |

## Anti-Hallucination Behavior

The adapter was specifically trained to avoid clinical hallucination. When the input transcript does not contain information for a SOAP section, the model outputs "Not documented in source" rather than fabricating findings. This is critical for clinical safety: a missing field that is explicitly marked as missing is far safer than a plausible-sounding fabrication.
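Because the marker phrase is fixed, downstream code can detect which sections the model declined to fill. A hypothetical post-processing check (names are illustrative, not part of this adapter's API):

```python
SECTIONS = {"SUBJECTIVE", "OBJECTIVE", "ASSESSMENT", "PLAN"}

def undocumented_sections(note: str) -> list[str]:
    """Sections the model explicitly marked as absent from the transcript."""
    flagged = []
    for line in note.splitlines():
        head, _, body = line.partition(":")
        if head.strip().upper() in SECTIONS and "not documented in source" in body.lower():
            flagged.append(head.strip().upper())
    return flagged

note = ("SUBJECTIVE: 54 yo F c/o SOB.\n"
        "OBJECTIVE: Not documented in source.\n"
        "ASSESSMENT: Suspected PE.\n"
        "PLAN: CTA chest.")
print(undocumented_sections(note))  # ['OBJECTIVE']
```

A front-end could surface these flagged sections to the reviewing clinician rather than leaving them blank.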

## Intended Use

- Converting medical encounter transcripts to structured SOAP notes
- Clinical documentation assistance (with physician review)
- Research and demonstration of efficient medical LLM fine-tuning

## Limitations

- **English only**
- **Research prototype**: not validated for clinical use in any jurisdiction
- **Synthetic training data**: 712 samples generated by GPT-4o Mini, not from real clinical encounters
- **Requires physician review**: all generated notes must be reviewed and approved by a licensed clinician before use in patient care
- **Inference speed**: ~25 seconds per note on an RTX 5070 Ti with 4-bit quantization

## Part Of

This adapter is one component of MedScribe, a clinical documentation workstation that combines MedASR (speech recognition), this fine-tuned MedGemma adapter (SOAP generation), and base MedGemma (clinical intelligence tools) into a single offline pipeline.

## Framework Versions

- PEFT 0.18.1
- Transformers 4.52+
- PyTorch 2.8+ (nightly for Blackwell/SM 12.0)
- bitsandbytes 0.45+

## Citation

```bibtex
@misc{medscribe2026,
  author = {Tushar},
  title = {MedScribe: Concise Clinical Documentation via Fine-tuned MedGemma},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/Tushar-9802/medscribe-soap-lora}
}
```

## Contact

GitHub: @Tushar-9802
