SOAP_SFT_V1: Medical SOAP Note Generator

SOAP_SFT_V1 is a fine-tuned version of Gemma 3 4B Instruct, trained to generate structured clinical SOAP notes (Subjective, Objective, Assessment, Plan) from doctor–patient dialogues.

Trained 2x faster with Unsloth and Hugging Face's TRL library on an H100 GPU.


Model Details

Property             Value
Developed by         Edifon
Base model           unsloth/gemma-3-4b-it-unsloth-bnb-4bit
Model type           Causal Language Model (fine-tuned)
Language             English
License              Apache 2.0
Fine-tuning method   Supervised Fine-Tuning (SFT) with LoRA
Training hardware    Google Colab H100

Intended Use

This model is designed to assist healthcare professionals and clinical NLP researchers by automatically converting clinical consultation transcripts into structured SOAP notes.

SOAP format:

  • S (Subjective): Patient-reported symptoms, history, and complaints
  • O (Objective): Observable/measurable clinical findings and planned investigations
  • A (Assessment): Differential diagnosis and clinical reasoning
  • P (Plan): Treatment plan, referrals, and follow-up instructions

โš ๏ธ Disclaimer: This model is intended as a research and assistive tool only. It is not a substitute for professional medical judgment or a licensed clinician's evaluation.


Training Details

Dataset

  • Dataset: syafiqassegaf/soap-dataset (Kaggle)
  • Total examples: 9,250
  • Train / Eval split: 90% / 10% → 8,325 train | 925 eval
  • Features: dialogue, soap, prompt, messages
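The split counts follow directly from the 90/10 ratio; a quick arithmetic check (the integer-division rounding is an assumption about how the split was taken):

```python
total = 9_250
eval_n = total // 10       # 10% held out for evaluation
train_n = total - eval_n   # remaining 90% for training
print(train_n, eval_n)     # 8325 925
```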

LoRA Configuration

Parameter                   Value
Rank (r)                    8
Alpha (lora_alpha)          8
Dropout                     0
Bias                        none
Target modules              q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Trainable parameters        16,394,240 / 4,316,473,712 (0.38%)
Vision layers finetuned     No
Language layers finetuned   Yes
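For reference, the table maps onto the usual PEFT-style keyword arguments. The dict below is a sketch of what would be passed to `LoraConfig` (or Unsloth's `get_peft_model`); the exact call used in training is not shown in this card. The fraction check reproduces the reported 0.38%:

```python
# Sketch: LoRA settings from the table, expressed as keyword arguments.
lora_kwargs = dict(
    r=8,
    lora_alpha=8,
    lora_dropout=0.0,
    bias="none",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)

# Reported trainable fraction: 16,394,240 of 4,316,473,712 parameters.
fraction = 100 * 16_394_240 / 4_316_473_712
print(f"{fraction:.2f}%")  # 0.38%
```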

Training Hyperparameters

Parameter                     Value
Epochs                        5
Per-device batch size         2
Gradient accumulation steps   4 (effective batch size = 8)
Learning rate                 2e-5
LR scheduler                  Linear
Optimizer                     AdamW 8-bit
Weight decay                  0.001
Warmup steps                  5
Max sequence length           2048
Seed                          3407
Total steps                   5,205
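The total step count is consistent with the dataset split and batch settings; a quick check (assuming partial batches are kept rather than dropped):

```python
import math

train_examples = 8_325
effective_bs = 2 * 4  # per-device batch size x gradient accumulation steps
steps_per_epoch = math.ceil(train_examples / effective_bs)
total_steps = steps_per_epoch * 5  # 5 epochs
print(steps_per_epoch, total_steps)  # 1041 5205
```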

Training used train_on_responses_only: only the model's responses contributed to the loss computation; the user instructions were masked out.
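The idea behind response-only training is to set the label of every prompt token to -100, the index that PyTorch's cross-entropy loss ignores. A toy sketch of the masking (the token ids and split point below are made up, and Unsloth locates the split via the chat template rather than a fixed index):

```python
IGNORE_INDEX = -100  # label value skipped by the cross-entropy loss

def mask_prompt(input_ids, response_start):
    """Labels are a copy of input_ids with prompt positions masked out."""
    return [
        IGNORE_INDEX if i < response_start else tok
        for i, tok in enumerate(input_ids)
    ]

ids = [11, 12, 13, 14, 15]                 # 3 prompt tokens + 2 response tokens
print(mask_prompt(ids, response_start=3))  # [-100, -100, -100, 14, 15]
```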


How to Use

With transformers (Standard)

from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("Edifon/SOAP_SFT_V1")
model = AutoModelForImageTextToText.from_pretrained("Edifon/SOAP_SFT_V1", device_map="auto")

messages = [
    {
        "role": "system",
        "content": [{"type": "text", "text": (
            "You are an expert medical professor assisting in the creation of medically accurate SOAP summaries. "
            "Please ensure the response follows the structured format: S:, O:, A:, P: without using markdown or special formatting."
        )}],
    },
    {
        "role": "user",
        "content": [{"type": "text", "text": """Create a medical SOAP summary of this dialogue.

### Dialogue:
Doctor: Hello, what brings you in today?
Patient: I've been having severe headaches for the past few weeks...
[rest of dialogue]
"""}],
    },
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

from transformers import TextStreamer
_ = model.generate(
    **inputs,
    max_new_tokens=2048,
    streamer=TextStreamer(processor.tokenizer, skip_prompt=True),
)

With Unsloth (Faster Inference)

from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    model_name="Edifon/SOAP_SFT_V1",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastModel.for_inference(model)

# Generate with the same chat messages as in the transformers example:
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)
_ = model.generate(**inputs, max_new_tokens=2048)

Example Output

Input dialogue (excerpt):

Patient reports photopsia in the left eye for ten days, including flashes of light and a dark spot on the nasal side. Had influenza-like symptoms two weeks prior. No history of eye disease.

Model output:

S: Patient reports experiencing photopsia in the left eye for ten days, describing flashes of light
   and a dark spot on the nasal side. History of influenza-like symptoms two weeks prior.
   No prior eye disease, operations, or treatments.

O: Patient presented with photopsia and a dark spot in the left eye. Comprehensive eye examination
   planned (visual acuity, slit-lamp, fundus examination).

A: Differential includes post-infectious transient optic neuropathy or acute ocular involvement
   secondary to influenza. Absence of prior eye disease supports opportunistic onset.

P: Order comprehensive eye examination. Schedule follow-up to review results and determine
   treatment or referral plan. Encourage prompt completion of planned examination.

Training Curve

Metric                    Value
Initial loss (step 100)   0.941
Final loss (step 5200)    0.482
Total reduction           ~48.8%

The model converged stably over 5 epochs / 5,205 steps. Loss dropped sharply in the first ~300 steps as the model learned the SOAP output format, decayed gradually through step ~2,000, then plateaued in the 0.48–0.52 range for the final two epochs with no significant overfitting observed.
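The reduction figure is straightforward to reproduce from the two endpoints in the table:

```python
initial, final = 0.941, 0.482
reduction = 100 * (initial - final) / initial
print(f"{reduction:.1f}%")  # 48.8%
```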

Training Loss Curve


Limitations

  • Trained exclusively on English-language dialogues
  • Performance may degrade on highly specialized subspecialty consultations underrepresented in the training data
  • Should not be used for clinical decision-making without expert oversight
  • Outputs may occasionally include disclaimers or formatting inconsistencies

Citation

If you use this model in your research, please cite the base model and dataset:

@misc{soap_sft_v1,
  author       = {Edifon},
  title        = {SOAP\_SFT\_V1: Medical SOAP Note Generator},
  year         = {2025},
  publisher    = {Hugging Face},
  url          = {https://huggingface.co/Edifon/SOAP_SFT_V1}
}