---
language:
- en
license: other
tags:
- llama-3.2
- medical
- clinical-notes
- soap-notes
- sft
- lora
- peft
- tinker
- instruction-following
base_model: meta-llama/Llama-3.2-1B
library_name: transformers
datasets:
- Johnyquest7/med_struct_data
model-index:
- name: Med_Soap_llama321
  results: []
pipeline_tag: text-generation
---

# Med_Soap_llama321

**Med_Soap_llama321** is a fine-tuned derivative of **`meta-llama/Llama-3.2-1B`** trained to convert **medical visit transcripts** into **structured SOAP-style clinical notes**.
Training used **LoRA** adapters with **Tinker** (training SDK and cookbook); the adapter weights were then **merged** into the base model for standalone use.

> **Intended use**: assistive drafting of structured notes from clinician–patient transcripts. Outputs should be reviewed and edited by qualified clinicians before use in any clinical workflow.

---

## Quick start (🤗 Transformers)

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

MODEL_ID = "johnyquest7/Med_soap_llama321_tinker"

tok = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    # bfloat16 on GPU, float32 on CPU
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
    device_map="auto",
)

# Minimal prompt: the model was trained on transcripts whose first line begins with:
# "Please convert the following medical transcript into a structured medical note."
prompt = """Please convert the following medical transcript into a structured medical note.

Doctor: Hi there, good to see you again. How have you been feeling?
Patient: I've been more tired and a bit dizzy...
"""

inputs = tok([prompt], return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.2,
        top_p=0.95,
        eos_token_id=tok.eos_token_id,
    )
# The decoded text contains the prompt followed by the generated note.
print(tok.decode(out[0], skip_special_tokens=True))
```

## Training summary

- **Base model:** meta-llama/Llama-3.2-1B
- **Task:** supervised fine-tuning on (transcript → structured note) pairs
- **Data:** Johnyquest7/med_struct_data (95% train / 5% eval)
- **Formatting:** chat-style conversations with a single user turn (transcript) and a single assistant turn (note); the user message includes the instruction line `Please convert the following medical transcript into a structured medical note.`
- **Frameworks:** Tinker (trainer/cookbook) + PEFT/LoRA; final weights merged for HF usage
- **Typical knobs:** LoRA rank 32, max sequence length ~8k, linear LR schedule, batch size ~16
- **Renderer:** Tinker's recommended renderer for Llama 3.2 ("role_colon" template)
- **Training objective:** cross-entropy on assistant turns (`ALL_ASSISTANT_MESSAGES`)
- **Logging:** JSONL metrics (train/eval NLL); optional W&B
- **Checkpointing:** periodic state saves; final merge via PEFT's `merge_and_unload()`
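
The final merge step can be sketched roughly as follows. This is a minimal sketch, not the script used for this model: it assumes the trained LoRA adapter has already been exported in a PEFT-compatible format, and the helper name `merge_lora_adapter` and the directory paths in the usage comment are hypothetical.

```python
def merge_lora_adapter(base_id: str, adapter_dir: str, out_dir: str) -> None:
    """Fold LoRA adapter weights into the base model and save a standalone copy."""
    # Heavy dependencies are imported lazily so this module can be imported
    # even where torch/peft/transformers are not installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

    # Attach the adapter to the base model, then merge its weights into the
    # underlying layers so no PEFT wrapper is needed at inference time.
    model = PeftModel.from_pretrained(base, adapter_dir)
    merged = model.merge_and_unload()

    # Save merged weights in safetensors format, plus the tokenizer files.
    merged.save_pretrained(out_dir, safe_serialization=True)
    AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)


# Example call (hypothetical paths, assumed for illustration only):
# merge_lora_adapter("meta-llama/Llama-3.2-1B", "./lora_adapter", "./merged_model")
```

The saved directory can then be loaded with `AutoModelForCausalLM.from_pretrained` like any other Transformers checkpoint.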

## Inference prompt tips

- Keep the opening instruction line exactly as seen during training (above).
- Provide the verbatim transcript (doctor/patient turns) below the instruction.
- For longer visits, raise `max_new_tokens` (e.g., 768–1024).
- For more deterministic outputs, lower `temperature` (0.1–0.3).
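
These tips can be captured in a couple of small helpers. The sketch below is illustrative only: the function names `build_prompt` and `generation_kwargs` are hypothetical, and the blank line between the instruction and the transcript is an assumption taken from the quick-start prompt.

```python
# Instruction line quoted from the training data description.
INSTRUCTION = (
    "Please convert the following medical transcript into a structured medical note."
)


def build_prompt(transcript: str) -> str:
    """Place the training-time instruction line above the verbatim transcript."""
    # Assumption: one blank line separates instruction and transcript.
    return f"{INSTRUCTION}\n\n{transcript.strip()}\n"


def generation_kwargs(long_visit: bool = False, temperature: float = 0.2) -> dict:
    """Sampling settings within the ranges suggested in the tips."""
    return {
        "max_new_tokens": 1024 if long_visit else 512,
        "do_sample": True,
        "temperature": temperature,
        "top_p": 0.95,
    }
```

The returned dictionary can be unpacked into the generation call, for example `model.generate(**inputs, **generation_kwargs(long_visit=True))`.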

## Evaluation

During training we tracked negative log-likelihood (NLL) on the training set and on a 5% eval split.
For downstream quality checks, we recommend:

- ROUGE-L / BLEU vs. reference notes (style similarity)
- Section presence (Subjective, Objective, Assessment, Plan)
- Clinical validity spot checks by a clinician (e.g., vitals, meds, labs copied correctly)
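
The first two checks are easy to automate. The sketch below is one possible approach, not the evaluation code used for this model: it assumes each section header starts its own line, and its ROUGE-L is a simplified F1 over whitespace tokens, so scores will differ from those of a standard ROUGE package. All function names are hypothetical.

```python
import re

SECTIONS = ("Subjective", "Objective", "Assessment", "Plan")


def missing_sections(note: str) -> list[str]:
    """Return SOAP section names that never appear as a line-leading header."""
    missing = []
    for name in SECTIONS:
        # Allow leading markup such as "## " or "**" before the header word.
        pattern = rf"^\W*{name}\b"
        if not re.search(pattern, note, flags=re.IGNORECASE | re.MULTILINE):
            missing.append(name)
    return missing


def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-L: F1 of LCS-based precision and recall."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Neither check replaces the clinician spot check: a note can contain all four headers and overlap heavily with the reference while still copying a dose or lab value incorrectly.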

## Training log

![Training Log](train_log.png)

## Limitations & risks

- May hallucinate facts not stated in the transcript or omit pertinent positives/negatives.
- Outputs can reflect biases and errors present in training data.
- Not a medical device; requires human review. Do not use for autonomous clinical decisions.

## How this model was built

1. Prepare a conversations JSONL file with one record per line (wrapped here for readability):

       {"messages":[
         {"role":"user","content":"Please convert... <transcript>"},
         {"role":"assistant","content":"<structured note>"}
       ]}

2. Run supervised fine-tuning with Tinker (LoRA adapters), with the renderer set to the recommended Llama 3.2 chat template.
3. Merge the adapters into the base model with PEFT's `merge_and_unload()` and save in safetensors format for HF.
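
The data-preparation step can be sketched as below. This is an illustrative sketch under stated assumptions rather than the original preprocessing script: the separator between the instruction and the transcript (a blank line), the seeded random 95/5 split, and all function names are assumptions.

```python
import json
import random

INSTRUCTION = (
    "Please convert the following medical transcript into a structured medical note."
)


def make_record(transcript: str, note: str) -> dict:
    """Build one chat-style training record: a user turn and an assistant turn."""
    return {
        "messages": [
            # Assumption: instruction and transcript are separated by a blank line.
            {"role": "user", "content": f"{INSTRUCTION}\n\n{transcript}"},
            {"role": "assistant", "content": note},
        ]
    }


def split_records(records, eval_fraction: float = 0.05, seed: int = 0):
    """Shuffle deterministically and hold out a fraction for evaluation."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_eval = max(1, round(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]


def write_jsonl(records, path: str) -> None:
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")
```

Fixing the seed keeps the held-out 5% stable across runs, which makes eval NLL comparable between training runs.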

## Citation

If you found this model helpful, please cite the base model (Meta Llama 3.2) and this model (johnyquest7/Med_Soap_llama321_tinker):

    @software{Med_Soap_llama321_2025,
      title = {Med\_Soap\_llama321},
      author = {Johnson Thomas},
      year = {2025},
      url = {https://huggingface.co/johnyquest7/Med_Soap_llama321_tinker}
    }