---
license: apache-2.0
base_model: unsloth/Phi-4-mini-instruct-unsloth-bnb-4bit
library_name: peft
tags:
  - medical
  - clinical
  - drug-recommendation
  - phi-4
  - lora
  - qlora
language:
  - en
---

Phi-4-mini Drug & Treatment Recommendation LoRA Adapter v6.1

Fine-tuned for evidence-based drug and treatment recommendation. Built on unsloth/Phi-4-mini-instruct-unsloth-bnb-4bit using QLoRA (4-bit NF4).

Clinical decision support only. Final decisions rest with the treating physician. This model is not a substitute for professional medical judgment.

Model Details

| Parameter | Value |
|---|---|
| Base model | `unsloth/Phi-4-mini-instruct-unsloth-bnb-4bit` |
| Original base | `microsoft/Phi-4-mini-instruct` |
| Method | QLoRA (4-bit NF4) |
| LoRA rank | 16 |
| Max sequence length | 1024 tokens |
| Training samples | 13,041 |
| Final train loss | 0.2892 |
| Loss masking | Response-only (`DataCollatorForCompletionOnlyLM`) |
| Version | v6.1 |
| License | Apache 2.0 |
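
The response-only loss masking listed in the table can be illustrated with a small, library-free sketch. It mimics the general idea behind `DataCollatorForCompletionOnlyLM` (prompt tokens get the ignore label so only the response contributes to the loss); the function name and the toy token ids are hypothetical and are not taken from the training code.

```python
IGNORE_INDEX = -100  # label value that PyTorch cross-entropy ignores by default


def mask_prompt_labels(input_ids, response_template_ids):
    """Copy input_ids into labels, masking everything up to and
    including the first occurrence of the response template."""
    n = len(response_template_ids)
    labels = list(input_ids)
    for start in range(len(input_ids) - n + 1):
        if input_ids[start:start + n] == response_template_ids:
            end = start + n
            labels[:end] = [IGNORE_INDEX] * end
            return labels
    # Template not found: mask the whole sequence so it adds no loss.
    return [IGNORE_INDEX] * len(input_ids)


# Toy ids (hypothetical): the pair 7, 8 stands in for the assistant marker.
example_ids = [1, 2, 3, 7, 8, 40, 41, 42]
print(mask_prompt_labels(example_ids, [7, 8]))
```

Only the last three positions keep their token ids as labels, so the loss is computed on the response alone.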

This repository contains LoRA adapter weights only (adapter_model.safetensors, ~35.7 MB). It must be loaded on top of the base model listed above — it is not a standalone model.
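
For context on why an adapter file can be this small, the standard LoRA formulation is sketched below. The rank r = 16 comes from the table above; the scaling factor α and the set of adapted weight matrices are not stated in this card, so they are left symbolic.

```latex
W' = W_0 + \frac{\alpha}{r}\, B A,
\qquad B \in \mathbb{R}^{d \times r},\quad A \in \mathbb{R}^{r \times k},\quad r = 16
```

Here W_0 is a frozen base weight of shape d × k. Only A and B are trained and stored, which is r(d + k) parameters per adapted matrix instead of d·k.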

Training Data

The training set combines publicly available medical Q&A and clinical reasoning sources, curated down to 13,041 samples after preprocessing:

Drug Q&A (~5K) — drug information and usage question-answer pairs
ChatDoctor (~5K) — patient-doctor conversational dataset
USMLE (~1.5K) — US Medical Licensing Examination style questions
Medical flashcards (~3K) — condition/treatment recall pairs
WikiDoc (~2K) — community-authored clinical reference content

Scope is general/broad medicine — the model is not tuned for a specific specialty, age group, or condition subset. Coverage across specialties, rare conditions, and edge cases is uneven and has not been formally audited.

Note: source dataset sizes above sum to roughly 16.5K; the final training count of 13,041 reflects deduplication, filtering, and train/validation splitting during preprocessing.
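
The arithmetic behind this note can be checked with a short sketch. The per-source figures are the approximate sizes listed above, so the result is an estimate rather than an exact preprocessing log.

```python
# Approximate per-source sizes as listed in this card (in samples).
source_sizes = {
    "Drug Q&A": 5_000,
    "ChatDoctor": 5_000,
    "USMLE": 1_500,
    "Medical flashcards": 3_000,
    "WikiDoc": 2_000,
}
final_train_samples = 13_041

raw_total = sum(source_sizes.values())
not_in_train = raw_total - final_train_samples
share = not_in_train / raw_total

print(f"raw total: {raw_total}")
print(f"not in final training set: {not_in_train} ({share:.1%})")
```

Roughly a fifth of the raw pool does not appear in the final training set; the card does not break this down between deduplication, filtering, and the validation split.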

Intended Use

Research and educational exploration of LLM-based clinical decision support
Prototyping and evaluation of medical recommendation interfaces
Assisting licensed clinicians as a secondary reference, never a primary decision source

Out of Scope

Direct-to-patient diagnostic or prescribing use without clinician oversight
Emergency or urgent care guidance
Use as a sole or final source for dosing, contraindications, or drug interactions
Any deployment context implying regulatory clearance or clinical validation (this model has none)

How to Use

This is a LoRA adapter — load the base model first, then apply the adapter. The tokenizer and chat template should be loaded from this repo, not from the base model, since they were saved alongside the trained adapter.

pip install torch transformers peft accelerate bitsandbytes
# unsloth may be required depending on how the bnb-4bit base loads in your environment:
pip install unsloth

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "unsloth/Phi-4-mini-instruct-unsloth-bnb-4bit"
adapter_id = "BlueMen/Medrx"

# Load tokenizer + chat template from THIS repo (not the base model)
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

# Base model ships pre-quantized (bnb-4bit), so no extra quant config needed
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    trust_remote_code=True,
)

model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

messages = [
    {"role": "user", "content": "Patient presents with persistent dry cough and mild fever for 5 days. Suggest possible treatment."}
]

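# return_dict=True returns input_ids and attention_mask together
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=300,
        temperature=0.3,
        do_sample=True,
    )

# Decode only the newly generated tokens, skipping the prompt
prompt_len = inputs["input_ids"].shape[-1]
response = tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True)
print(response)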

Optional: Merging the adapter

For deployment without the peft dependency, the adapter can be merged into the base weights:

merged_model = model.merge_and_unload()
merged_model.save_pretrained("medrx-merged")
tokenizer.save_pretrained("medrx-merged")

Note that merging into a 4-bit quantized base can be lossy, since the merged weights are typically re-quantized. For a higher-fidelity merge, consider loading the original base (`microsoft/Phi-4-mini-instruct`) in unquantized 16-bit precision and merging there, if disk and memory allow.
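
A minimal sketch of such a higher-fidelity merge follows. It assumes the adapter applies cleanly to the unquantized original base (`microsoft/Phi-4-mini-instruct`), which this card does not confirm; the function name and the output directory `medrx-merged-bf16` are hypothetical.

```python
def merge_on_full_precision_base(
    adapter_id="BlueMen/Medrx",
    base_id="microsoft/Phi-4-mini-instruct",  # assumption: adapter is compatible
    out_dir="medrx-merged-bf16",              # hypothetical output path
):
    """Merge the LoRA adapter into an unquantized bf16 copy of the base model."""
    # Local imports: defining this function does not require the libraries.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    # Load the base without bitsandbytes quantization.
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

    # Attach the adapter, then fold its weights into the base layers.
    model = PeftModel.from_pretrained(base, adapter_id)
    merged = model.merge_and_unload()

    # Save the merged weights plus the tokenizer shipped with the adapter.
    merged.save_pretrained(out_dir)
    AutoTokenizer.from_pretrained(adapter_id).save_pretrained(out_dir)
    return out_dir
```

Calling `merge_on_full_precision_base()` downloads the full base weights and needs enough memory to hold them in bf16. Because the adapter was trained against the 4-bit base, outputs of the merged model may differ slightly from the adapter-on-4-bit setup and should be spot-checked.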

Limitations and Risks

Not clinically validated. This model has not been evaluated in a clinical trial, peer-reviewed study, or regulatory review process of any kind.
No regulatory clearance. Not FDA-cleared, not CE-marked, not approved as a medical device in any jurisdiction.
Hallucination risk. Like all LLMs, this model can generate plausible-sounding but incorrect drug names, dosages, contraindications, or interactions. All outputs require verification against authoritative clinical references before any real-world use.
Training data limitations. Source datasets include community-authored and synthetic conversational content (e.g. WikiDoc, ChatDoctor) alongside exam-style questions (USMLE); none of these are equivalent to curated clinical guidelines or a licensed formulary.
No safety fine-tuning audit. No formal red-teaming or bias evaluation has been conducted on this adapter's outputs.
General scope only. Not specialized for pediatrics, geriatrics, pregnancy, rare disease, or specialty-specific protocols.

Citation

If you use this adapter, please cite this repository:

@misc{medrx2025,
  title  = {Phi-4-mini Drug \& Treatment Recommendation LoRA Adapter v6.1},
  author = {BlueMen},
  year   = {2025},
  url    = {https://huggingface.co/BlueMen/Medrx}
}