---
base_model: unsloth/gemma-3-4b-it-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- gemma3
- medical
- clinical-nlp
- soap-notes
license: apache-2.0
language:
- en
---

# SOAP_SFT_V1 — Medical SOAP Note Generator

**SOAP_SFT_V1** is a fine-tuned version of [Gemma 3 4B Instruct](https://huggingface.co/unsloth/gemma-3-4b-it-unsloth-bnb-4bit), trained to generate structured clinical **SOAP notes** (Subjective, Objective, Assessment, Plan) from doctor–patient dialogues.

Trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library on an H100 GPU.

---

## Model Details

| Property | Value |
|---|---|
| **Developed by** | Edifon |
| **Base model** | `unsloth/gemma-3-4b-it-unsloth-bnb-4bit` |
| **Model type** | Causal Language Model (fine-tuned) |
| **Language** | English |
| **License** | Apache 2.0 |
| **Fine-tuning method** | Supervised Fine-Tuning (SFT) with LoRA |
| **Training hardware** | Google Colab H100 |

---

## Intended Use

This model is designed to assist healthcare professionals and clinical NLP researchers by automatically converting clinical consultation transcripts into structured SOAP notes.

**SOAP format:**

- **S (Subjective):** Patient-reported symptoms, history, and complaints
- **O (Objective):** Observable/measurable clinical findings and planned investigations
- **A (Assessment):** Differential diagnosis and clinical reasoning
- **P (Plan):** Treatment plan, referrals, and follow-up instructions

> ⚠️ **Disclaimer:** This model is intended as a research and assistive tool only. It is **not** a substitute for professional medical judgment or a licensed clinician's evaluation.
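Because the model is prompted to emit plain `S:`, `O:`, `A:`, `P:` prefixes without markdown, downstream code can split a generated note into its four sections with a small parser. A minimal sketch (the `parse_soap` helper is illustrative, not part of this repository), assuming each section label starts a new line:

```python
import re

def parse_soap(note: str) -> dict:
    """Split a generated note into its S/O/A/P sections.

    Assumes the model followed the system prompt: plain 'S:', 'O:',
    'A:', 'P:' labels at the start of a line, no markdown headings.
    """
    sections = {}
    # Each label captures lazily up to the next label (or end of note).
    pattern = r"^([SOAP]):\s*(.*?)(?=^\s*[SOAP]:|\Z)"
    for label, body in re.findall(pattern, note, flags=re.MULTILINE | re.DOTALL):
        sections[label] = body.strip()
    return sections

note = (
    "S: Headaches for two weeks.\n"
    "O: Vitals normal.\n"
    "A: Tension headache.\n"
    "P: Ibuprofen; follow up in one week."
)
print(parse_soap(note))  # dict with keys 'S', 'O', 'A', 'P'
```

If a generation omits a section, the corresponding key is simply absent, which makes malformed outputs easy to detect and retry.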
---

## Training Details

### Dataset

- **Dataset:** [`syafiqassegaf/soap-dataset`](https://www.kaggle.com/datasets/syafiqassegaf/soap-dataset) (Kaggle)
- **Total examples:** 9,250
- **Train / Eval split:** 90% / 10% → 8,325 train | 925 eval
- **Features:** `dialogue`, `soap`, `prompt`, `messages`

### LoRA Configuration

| Parameter | Value |
|---|---|
| Rank (`r`) | 8 |
| Alpha (`lora_alpha`) | 8 |
| Dropout | 0 |
| Bias | none |
| Target modules | `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj` |
| Trainable parameters | 16,394,240 / 4,316,473,712 (**0.38%**) |
| Vision layers finetuned | No |
| Language layers finetuned | Yes |

### Training Hyperparameters

| Parameter | Value |
|---|---|
| Epochs | 5 |
| Per-device batch size | 2 |
| Gradient accumulation steps | 4 (effective batch size = 8) |
| Learning rate | 2e-5 |
| LR scheduler | Linear |
| Optimizer | AdamW 8-bit |
| Weight decay | 0.001 |
| Warmup steps | 5 |
| Max sequence length | 2048 |
| Seed | 3407 |
| Total steps | 5,205 |

Training used `train_on_responses_only`: only the model's responses contributed to the loss, not the user instructions.

---

## How to Use

### With `transformers` (Standard)

```python
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("Edifon/SOAP_SFT_V1")
model = AutoModelForImageTextToText.from_pretrained("Edifon/SOAP_SFT_V1", device_map="auto")

messages = [
    {
        "role": "system",
        "content": [{"type": "text", "text": (
            "You are an expert medical professor assisting in the creation of medically accurate SOAP summaries. "
            "Please ensure the response follows the structured format: S:, O:, A:, P: without using markdown or special formatting."
        )}],
    },
    {
        "role": "user",
        "content": [{"type": "text", "text": """Create a medical SOAP summary of this dialogue.

### Dialogue:
Doctor: Hello, what brings you in today?
Patient: I've been having severe headaches for the past few weeks...
[rest of dialogue]
"""}],
    },
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

from transformers import TextStreamer

_ = model.generate(
    **inputs,
    max_new_tokens=2048,
    streamer=TextStreamer(processor, skip_prompt=True),
)
```

### With Unsloth (Faster Inference)

```python
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    model_name="Edifon/SOAP_SFT_V1",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastModel.for_inference(model)
```

---

## Example Output

**Input dialogue (excerpt):**

> Patient reports photopsia in the left eye for ten days, including flashes of light and a dark spot on the nasal side. Had influenza-like symptoms two weeks prior. No history of eye disease.

**Model output:**

```
S: Patient reports experiencing photopsia in the left eye for ten days, describing flashes of light and a dark spot on the nasal side. History of influenza-like symptoms two weeks prior. No prior eye disease, operations, or treatments.
O: Patient presented with photopsia and a dark spot in the left eye. Comprehensive eye examination planned (visual acuity, slit-lamp, fundus examination).
A: Differential includes post-infectious transient optic neuropathy or acute ocular involvement secondary to influenza. Absence of prior eye disease supports opportunistic onset.
P: Order comprehensive eye examination. Schedule follow-up to review results and determine treatment or referral plan. Encourage prompt completion of planned examination.
```

---

## Training Curve

| Metric | Value |
|---|---|
| Initial loss (step 100) | 0.941 |
| Final loss (step 5200) | 0.482 |
| Total reduction | ~48.8% |

The model converged stably over 5 epochs / 5,205 steps.
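The reported step count is consistent with the dataset split and the effective batch size of 8 (per-device batch 2 × gradient accumulation 4). A quick arithmetic check:

```python
import math

train_examples = 8_325       # 90% of 9,250
effective_batch = 2 * 4      # per-device batch size x gradient accumulation steps
epochs = 5

# Each epoch needs ceil(8325 / 8) = 1041 optimizer steps; over 5 epochs
# that gives the 5,205 total steps reported above.
steps_per_epoch = math.ceil(train_examples / effective_batch)
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)  # 1041 5205
```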
Loss dropped sharply in the first ~300 steps as the model learned the SOAP output format, then decayed gradually through step ~2,000, before plateauing in the 0.48–0.52 range for the final two epochs, with no significant overfitting observed.

![Training Loss Curve](training_loss.png)

---

## Limitations

- Trained exclusively on English-language dialogues
- Performance may degrade on highly specialized subspecialty consultations underrepresented in the training data
- Should not be used for clinical decision-making without expert oversight
- Outputs may occasionally include disclaimers or formatting inconsistencies

---

## Citation

If you use this model in your research, please cite it as follows (and consider also citing the base model and dataset):

```bibtex
@misc{soap_sft_v1,
  author    = {Edifon},
  title     = {SOAP\_SFT\_V1: Medical SOAP Note Generator},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/Edifon/SOAP_SFT_V1}
}
```