metadata
license: apache-2.0
base_model: unsloth/Phi-4-mini-instruct-unsloth-bnb-4bit
library_name: peft
tags:
  - medical
  - clinical
  - drug-recommendation
  - phi-4
  - lora
  - qlora
language:
  - en

Phi-4-mini Drug & Treatment Recommendation LoRA Adapter v6.1

Fine-tuned for evidence-based drug and treatment recommendation. Built on unsloth/Phi-4-mini-instruct-unsloth-bnb-4bit using QLoRA (4-bit NF4).

Clinical decision support only. Final decisions rest with the treating physician. This model is not a substitute for professional medical judgment.

Model Details

| Parameter | Value |
|---|---|
| Base model | unsloth/Phi-4-mini-instruct-unsloth-bnb-4bit |
| Original base | microsoft/Phi-4-mini-instruct |
| Method | QLoRA (4-bit NF4) |
| LoRA rank | 16 |
| Max sequence length | 1024 |
| Training samples | 13,041 |
| Final train loss | 0.2892 |
| Loss masking | Response-only (DataCollatorForCompletionOnlyLM) |
| Version | v6.1 |
| License | Apache 2.0 |
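
To illustrate the response-only loss masking listed above: prompt tokens receive the label -100, which the cross-entropy loss ignores, so only response tokens contribute to the training loss. The sketch below is a simplified pure-Python illustration and not the training code used for this adapter; `mask_prompt_tokens`, `response_start`, and the token ids are hypothetical.

```python
# Sketch of response-only loss masking (illustrative, not the repo's training code).
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def mask_prompt_tokens(input_ids, response_start):
    """Return labels where every token before `response_start` is ignored."""
    if not 0 <= response_start <= len(input_ids):
        raise ValueError("response_start must lie within the sequence")
    return [IGNORE_INDEX] * response_start + list(input_ids[response_start:])


# Hypothetical token ids: the first four stand for the prompt, the last three for the answer.
example_ids = [101, 102, 103, 104, 201, 202, 203]
labels = mask_prompt_tokens(example_ids, response_start=4)
print(labels)  # [-100, -100, -100, -100, 201, 202, 203]
```

In real training the collator finds the response boundary by searching for the response template in the tokenized sequence, rather than taking an index as an argument.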

This repository contains LoRA adapter weights only (adapter_model.safetensors, ~35.7 MB). It must be loaded on top of the base model listed above — it is not a standalone model.
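
As a rough sanity check, the adapter file size can be turned into an approximate parameter count. The storage dtype is not stated in this card, so the sketch below shows both a 16-bit and a 32-bit assumption. It also assumes 1 MB = 10^6 bytes and ignores the small safetensors header; `approx_param_count` is a hypothetical helper.

```python
# Back-of-the-envelope estimate; assumes 1 MB = 10**6 bytes and ignores the file header.
ADAPTER_SIZE_BYTES = 35.7e6  # "~35.7 MB" as stated in this card


def approx_param_count(size_bytes, bytes_per_param):
    """Approximate number of stored parameters for a given storage width."""
    return size_bytes / bytes_per_param


# The storage dtype is an assumption, so both common widths are shown.
for name, width in (("16-bit", 2), ("32-bit", 4)):
    millions = approx_param_count(ADAPTER_SIZE_BYTES, width) / 1e6
    print(f"{name} storage: roughly {millions:.1f}M adapter parameters")
```

Either way, the adapter is orders of magnitude smaller than the base model, which is why it cannot be used on its own.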

Training Data

The training set combines publicly available medical Q&A and clinical reasoning sources, curated down to 13,041 samples after preprocessing:

  • Drug Q&A (~5K) — drug information and usage question-answer pairs
  • ChatDoctor (~5K) — patient-doctor conversational dataset
  • USMLE (~1.5K) — US Medical Licensing Examination style questions
  • Medical flashcards (~3K) — condition/treatment recall pairs
  • WikiDoc (~2K) — community-authored clinical reference content

Scope is general/broad medicine — the model is not tuned for a specific specialty, age group, or condition subset. Coverage across specialties, rare conditions, and edge cases is uneven and has not been formally audited.

Note: source dataset sizes above sum to roughly 16.5K; the final training count of 13,041 reflects deduplication, filtering, and train/validation splitting during preprocessing.
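
The arithmetic behind the note above can be checked in a few lines. The per-source counts are the rounded figures from this card, so the results are approximate.

```python
# Approximate per-source counts as listed in this card (values are rounded).
source_sizes = {
    "Drug Q&A": 5_000,
    "ChatDoctor": 5_000,
    "USMLE": 1_500,
    "Medical flashcards": 3_000,
    "WikiDoc": 2_000,
}
final_train_samples = 13_041

raw_total = sum(source_sizes.values())
removed = raw_total - final_train_samples
retained_fraction = final_train_samples / raw_total

print(f"raw total: {raw_total}")              # 16500
print(f"removed or held out: {removed}")      # 3459
print(f"retained: {retained_fraction:.1%}")   # 79.0%
```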

Intended Use

  • Research and educational exploration of LLM-based clinical decision support
  • Prototyping and evaluation of medical recommendation interfaces
  • Assisting licensed clinicians as a secondary reference, never a primary decision source

Out of Scope

  • Direct-to-patient diagnostic or prescribing use without clinician oversight
  • Emergency or urgent care guidance
  • Use as a sole or final source for dosing, contraindications, or drug interactions
  • Any deployment context implying regulatory clearance or clinical validation (this model has none)

How to Use

This is a LoRA adapter — load the base model first, then apply the adapter. The tokenizer and chat template should be loaded from this repo, not from the base model, since they were saved alongside the trained adapter.

pip install torch transformers peft accelerate bitsandbytes
# unsloth may be required depending on how the bnb-4bit base loads in your environment:
pip install unsloth

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "unsloth/Phi-4-mini-instruct-unsloth-bnb-4bit"
adapter_id = "BlueMen/Medrx"

# Load tokenizer + chat template from THIS repo (not the base model)
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

# Base model ships pre-quantized (bnb-4bit), so no extra quant config needed
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    trust_remote_code=True,
)

model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

messages = [
    {"role": "user", "content": "Patient presents with persistent dry cough and mild fever for 5 days. Suggest possible treatment."}
]

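# Ask for a dict so generate() receives both input_ids and attention_mask
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=300,
        temperature=0.3,
        do_sample=True,
    )

# Decode only the newly generated tokens, skipping the prompt
prompt_len = inputs["input_ids"].shape[-1]
response = tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True)
print(response)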
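
Before loading, it can be worth confirming that the adapter was trained against the base model you are about to load. The sketch below reads the `adapter_config.json` file that PEFT normally writes next to the adapter weights. It assumes you have a local copy of that file from this repo; `check_adapter_config` is a hypothetical helper, while `base_model_name_or_path` and `r` are standard PEFT config fields.

```python
import json
from pathlib import Path

EXPECTED_BASE = "unsloth/Phi-4-mini-instruct-unsloth-bnb-4bit"


def check_adapter_config(config_path, expected_base=EXPECTED_BASE):
    """Return (matches_base, rank) read from a PEFT adapter_config.json file."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    base = config.get("base_model_name_or_path")  # base model recorded at training time
    rank = config.get("r")                        # LoRA rank
    return base == expected_base, rank
```

For this adapter, `check_adapter_config("adapter_config.json")` would be expected to report a matching base and a rank of 16, given the values in the Model Details table.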

Optional: Merging the adapter

For deployment without the peft dependency, the adapter can be merged into the base weights:

merged_model = model.merge_and_unload()
merged_model.save_pretrained("medrx-merged")
tokenizer.save_pretrained("medrx-merged")

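Note that merging into a 4-bit quantized base is lossy: the base weights are dequantized, combined with the adapter, and requantized, which introduces rounding error. For a higher-fidelity merge, load the original base (microsoft/Phi-4-mini-instruct) in 16-bit precision and merge there, if disk and memory allow.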
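
Merging onto an unquantized base, as suggested above, might look like the following. This is a sketch under assumptions, not a tested recipe for this adapter: it assumes the adapter's module names and vocabulary size match microsoft/Phi-4-mini-instruct (listed above as the original base), and the function name and output directory are hypothetical.

```python
def merge_on_16bit_base(
    adapter_id="BlueMen/Medrx",
    base_id="microsoft/Phi-4-mini-instruct",  # assumed 16-bit counterpart of the 4-bit base
    output_dir="medrx-merged-bf16",           # hypothetical output path
):
    """Merge the LoRA adapter into unquantized bfloat16 base weights and save the result."""
    # Heavy dependencies are imported lazily so the function can be defined without them.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the base without quantization so the merge does not pass through 4-bit rounding.
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
    merged = PeftModel.from_pretrained(base, adapter_id).merge_and_unload()

    merged.save_pretrained(output_dir)
    # Keep the tokenizer and chat template that were saved with the adapter.
    AutoTokenizer.from_pretrained(adapter_id).save_pretrained(output_dir)
    return output_dir
```

The merged checkpoint is a full-size model, so expect a download and output on the order of the full base model rather than the ~35.7 MB adapter.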

Limitations and Risks

  • Not clinically validated. This model has not been evaluated in a clinical trial, peer-reviewed study, or regulatory review process of any kind.
  • No regulatory clearance. Not FDA-cleared, not CE-marked, not approved as a medical device in any jurisdiction.
  • Hallucination risk. Like all LLMs, this model can generate plausible-sounding but incorrect drug names, dosages, contraindications, or interactions. All outputs require verification against authoritative clinical references before any real-world use.
  • Training data limitations. Source datasets include community-authored and synthetic conversational content (e.g. WikiDoc, ChatDoctor) alongside exam-style questions (USMLE); none of these are equivalent to curated clinical guidelines or a licensed formulary.
  • No safety fine-tuning audit. No formal red-teaming or bias evaluation has been conducted on this adapter's outputs.
  • General scope only. Not specialized for pediatrics, geriatrics, pregnancy, rare disease, or specialty-specific protocols.

Citation

If you use this adapter, please cite this repository:

@misc{medrx2025,
  title  = {Phi-4-mini Drug \& Treatment Recommendation LoRA Adapter v6.1},
  author = {BlueMen},
  year   = {2025},
  url    = {https://huggingface.co/BlueMen/Medrx}
}