README.md · Johnyquest7/Med_soap_llama321


	---
	language:
	- en
	license: other
	tags:
	- llama-3.2
	- medical
	- clinical-notes
	- soap-notes
	- sft
	- lora
	- peft
	- tinker
	- instruction-following
	base_model: meta-llama/Llama-3.2-1B
	library_name: transformers
	datasets:
	- Johnyquest7/med_struct_data
	model-index:
	- name: Med_Soap_llama321
	  results: []
	pipeline_tag: text-generation
	---

	# Med_Soap_llama321

	Med_Soap_llama321 is a fine-tuned derivative of `meta-llama/Llama-3.2-1B` trained to convert medical visit transcripts into structured SOAP-style clinical notes.
	Training used LoRA adapters with Tinker (training SDK and cookbook); the trained adapters were then merged into the base model so the checkpoint can be loaded on its own.

	> Intended use: assistive drafting of structured notes from clinician–patient transcripts. Outputs should be reviewed and edited by qualified clinicians before use in any clinical workflow.

	---

	## Quick start (🤗 Transformers)

	```python
	from transformers import AutoTokenizer, AutoModelForCausalLM
	import torch

	MODEL_ID = "johnyquest7/Med_soap_llama321_tinker"

	tok = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
	model = AutoModelForCausalLM.from_pretrained(
	    MODEL_ID,
	    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
	    device_map="auto",
	)

	# Minimal prompt — the model was trained on transcripts whose first line begins with:
	# "Please convert the following medical transcript into a structured medical note."
	prompt = """Please convert the following medical transcript into a structured medical note.

	Doctor: Hi there, good to see you again. How have you been feeling?
	Patient: I've been more tired and a bit dizzy...
	"""

	inputs = tok([prompt], return_tensors="pt").to(model.device)
	with torch.no_grad():
	    out = model.generate(
	        **inputs,
	        max_new_tokens=512,
	        do_sample=True,
	        temperature=0.2,
	        top_p=0.95,
	        eos_token_id=tok.eos_token_id,
	    )
	print(tok.decode(out[0], skip_special_tokens=True))

	```

	## Training summary

	- Base model: `meta-llama/Llama-3.2-1B`
	- Task: supervised fine-tuning on pairs (transcript → structured note)
	- Data: `Johnyquest7/med_struct_data` (95% train / 5% eval)
	- Formatting: chat-style conversations with a single user turn (transcript) and single assistant turn (note); the user message includes the instruction line: "Please convert the following medical transcript into a structured medical note."
	- Frameworks: Tinker (trainer/cookbook) + PEFT/LoRA; final weights merged for HF usage.
	- Typical knobs: LoRA rank 32, max seq length ~8k, linear LR schedule, batch ~16.
	- Renderer: Tinker recommended renderer for Llama 3.2 ("role_colon" template)
	- Train objective: Cross-entropy on assistant turns (`ALL_ASSISTANT_MESSAGES`)
	- Logging: JSONL metrics (train/eval NLL); optional W&B
	- Checkpointing: periodic state saves; final merge via `peft.merge_and_unload()`
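
The final merge step can be sketched with PEFT roughly as follows. This is a minimal sketch, not the author's actual script: it assumes the Tinker-trained LoRA weights have already been exported to a PEFT-compatible adapter directory, and the function name and both path arguments are hypothetical.

```python
def merge_lora_adapter(adapter_dir: str, output_dir: str,
                       base_model_id: str = "meta-llama/Llama-3.2-1B") -> None:
    """Merge a PEFT LoRA adapter into its base model and save standalone weights.

    adapter_dir and output_dir are hypothetical local paths (assumption).
    """
    # Heavy imports live inside the function so that defining it stays cheap.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(base_model_id, torch_dtype=torch.bfloat16)
    # Attach the adapter weights on top of the base model.
    peft_model = PeftModel.from_pretrained(base, adapter_dir)
    # Fold the low-rank updates into the base weights and drop the PEFT wrappers.
    merged = peft_model.merge_and_unload()
    # Save in safetensors format, together with the tokenizer files.
    merged.save_pretrained(output_dir, safe_serialization=True)
    AutoTokenizer.from_pretrained(base_model_id).save_pretrained(output_dir)
```

After merging, the output directory can be loaded with `AutoModelForCausalLM.from_pretrained` without PEFT installed.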

	## Inference prompt tips

	- Keep the opening instruction line exactly as seen during training (above).
	- Provide the verbatim transcript (doctor/patient turns) below the instruction.
	- For longer visits, raise `max_new_tokens` (e.g., 768–1024).
	- For more deterministic outputs, lower `temperature` (0.1–0.3).
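
These tips can be collected into two small helpers. This is a sketch under stated assumptions: the helper names are hypothetical, the blank line between instruction and transcript mirrors the quick-start prompt, and the `greedy` option (which sets `do_sample=False`) is an addition that goes one step further than lowering the temperature.

```python
INSTRUCTION = "Please convert the following medical transcript into a structured medical note."

def build_prompt(transcript: str) -> str:
    """Put the training-time instruction line first, then the verbatim transcript."""
    return f"{INSTRUCTION}\n\n{transcript.strip()}\n"

def generation_kwargs(long_visit: bool = False, greedy: bool = False) -> dict:
    """Return keyword arguments for model.generate() that follow the tips above."""
    kwargs = {"max_new_tokens": 1024 if long_visit else 512}
    if greedy:
        # Greedy decoding is fully deterministic; sampling parameters are not needed.
        kwargs["do_sample"] = False
    else:
        kwargs.update(do_sample=True, temperature=0.2, top_p=0.95)
    return kwargs
```

With the quick-start objects, a call would look like `model.generate(**tok([build_prompt(text)], return_tensors="pt").to(model.device), **generation_kwargs(long_visit=True))`.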

	## Evaluation

	During training we tracked negative log-likelihood (NLL) on train and a 5% eval split.
	For downstream quality checks, we recommend:

	- ROUGE-L / BLEU vs. reference notes (style similarity)
	- Section presence (Subjective, Objective, Assessment, Plan)
	- Clinical validity spot checks by a clinician (e.g., vitals, meds, labs copied correctly)
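
The first two checks are easy to automate. The sketch below is illustrative only and was not part of the training pipeline: `missing_sections` assumes each section name starts a line (optionally after markdown markers), and `rouge_l_f1` is a simplified ROUGE-L F1 over lowercased whitespace tokens, so its scores may differ from those of a dedicated ROUGE package.

```python
import re

SECTIONS = ("Subjective", "Objective", "Assessment", "Plan")

def missing_sections(note: str) -> list:
    """Return the SOAP section names that never start a line in the note."""
    missing = []
    for name in SECTIONS:
        # Allow optional markdown markers before the name, e.g. "## Plan" or "**Plan**".
        pattern = rf"^[\s#*\-]*{name}\b"
        if not re.search(pattern, note, flags=re.IGNORECASE | re.MULTILINE):
            missing.append(name)
    return missing

def rouge_l_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-L F1 based on the longest common subsequence of tokens."""
    c, r = candidate.lower().split(), reference.lower().split()
    if not c or not r:
        return 0.0
    # Dynamic-programming LCS, keeping only the previous row of the table.
    prev = [0] * (len(r) + 1)
    for tok in c:
        cur = [0]
        for j, ref_tok in enumerate(r, start=1):
            cur.append(prev[j - 1] + 1 if tok == ref_tok else max(prev[j], cur[j - 1]))
        prev = cur
    lcs = prev[-1]
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)
```

Neither check says anything about clinical correctness; they only flag structural gaps and low overlap with a reference note.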

	## Training log

	![image](https://cdn-uploads.huggingface.co/production/uploads/64f953dd83807928d272c944/3EczRoKpCFZ6xG_u9LjHu.png)

	## Limitations & risks

	- May hallucinate facts not stated in the transcript or omit pertinent positives/negatives.
	- Outputs can reflect biases and errors present in training data.
	- Not a medical device; requires human review. Do not use for autonomous clinical decisions.
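
One cheap aid for the human review is to flag numeric values in a generated note that never occur in the transcript. The helper below is a hypothetical sketch, not a validated safety check: it only compares digit strings, so values spoken as words in the transcript will be flagged even when the note is correct, and non-numeric hallucinations are not detected at all.

```python
import re

NUMBER = re.compile(r"\d+(?:\.\d+)?")

def unsupported_numbers(transcript: str, note: str) -> list:
    """Return numbers that appear in the note but nowhere in the transcript."""
    seen = set(NUMBER.findall(transcript))
    flagged = []
    for value in NUMBER.findall(note):
        # Report each unsupported value once, in order of first appearance.
        if value not in seen and value not in flagged:
            flagged.append(value)
    return flagged
```

A non-empty result is a prompt for the reviewing clinician to look at those values first; an empty result does not mean the note is correct.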

	## How this model was built

	1. Prepare a conversations JSONL file in which each line is one object of the following shape (wrapped here for readability):

	   {"messages":[
	     {"role":"user","content":"Please convert... <transcript>"},
	     {"role":"assistant","content":"<structured note>"}
	   ]}

	2. Run supervised fine-tuning with Tinker (LoRA adapters), with the renderer set to the recommended Llama-3.2 chat template.

	3. Merge the adapters into the base model with `peft.merge_and_unload()` and save in safetensors format for HF.
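
The data preparation step, together with the 95% / 5% split mentioned in the training summary, could look roughly like this. It is a minimal sketch under assumptions: the function names, the seeded shuffle, and the blank line between the instruction and the transcript are illustrative choices, not the author's confirmed procedure.

```python
import json
import random

INSTRUCTION = "Please convert the following medical transcript into a structured medical note."

def to_record(transcript: str, note: str) -> dict:
    """Build one single-turn conversation in the messages format."""
    return {"messages": [
        # Assumption: instruction and transcript are separated by a blank line.
        {"role": "user", "content": f"{INSTRUCTION}\n\n{transcript}"},
        {"role": "assistant", "content": note},
    ]}

def split_records(records: list, eval_fraction: float = 0.05, seed: int = 0):
    """Shuffle deterministically and hold out a fraction for evaluation."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_eval = max(1, round(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]

def write_jsonl(records: list, path: str) -> None:
    """Write one JSON object per line, as JSONL requires."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

For example, `train, held_out = split_records([to_record(t, n) for t, n in pairs])` followed by two `write_jsonl` calls yields the train and eval files, where `pairs` is a hypothetical list of (transcript, note) tuples.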

	## Citation

	If you found this model helpful, please cite:

	- Base model: Meta Llama 3.2
	- This model: johnyquest7/Med_Soap_llama321_tinker

	@software{Med_Soap_llama321_2025,
	  title = {Med\_Soap\_llama321},
	  author = {Johnson Thomas},
	  year = {2025},
	  url = {https://huggingface.co/johnyquest7/Med_Soap_llama321_tinker}
	}