---
license: apache-2.0
base_model: unsloth/Phi-4-mini-instruct-unsloth-bnb-4bit
library_name: peft
tags:
- medical
- clinical
- drug-recommendation
- phi-4
- lora
- qlora
language:
- en
---

# Phi-4-mini Drug & Treatment Recommendation LoRA Adapter v6.1

Fine-tuned for evidence-based drug and treatment recommendation. Built on `unsloth/Phi-4-mini-instruct-unsloth-bnb-4bit` using QLoRA (4-bit NF4).

> **Clinical decision support only. Final decisions rest with the treating physician. This model is not a substitute for professional medical judgment.**

## Model Details

| Parameter | Value |
|---|---|
| Base model | `unsloth/Phi-4-mini-instruct-unsloth-bnb-4bit` |
| Original base | `microsoft/Phi-4-mini-instruct` |
| Method | QLoRA (4-bit NF4) |
| LoRA rank | 16 |
| Max sequence length | 1024 |
| Training samples | 13,041 |
| Final train loss | 0.2892 |
| Loss masking | Response-only (`DataCollatorForCompletionOnlyLM`) |
| Version | v6.1 |
| License | Apache 2.0 |

This repository contains **LoRA adapter weights only** (`adapter_model.safetensors`, ~35.7 MB). It must be loaded on top of the base model listed above; it is not a standalone model.

## Training Data

The training set was drawn from a mix of publicly available medical Q&A and clinical reasoning sources, combined and curated to 13,041 samples after preprocessing:

- **Drug Q&A** (~5K): drug information and usage question-answer pairs
- **ChatDoctor** (~5K): patient-doctor conversational dataset
- **USMLE** (~1.5K): US Medical Licensing Examination style questions
- **Medical flashcards** (~3K): condition/treatment recall pairs
- **WikiDoc** (~2K): community-authored clinical reference content

Scope is **general/broad medicine**: the model is not tuned for a specific specialty, age group, or condition subset. Coverage across specialties, rare conditions, and edge cases is uneven and has not been formally audited.
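
As a quick sanity check on the source sizes listed above, the rounded per-source counts can be totaled and compared with the final training count. This is only a sketch: the per-source figures are the approximate values from the list, not exact counts, and the variable names are illustrative rather than taken from the training code.

```python
# Approximate per-source sizes as listed above (rounded, not exact counts).
approx_source_sizes = {
    "Drug Q&A": 5_000,
    "ChatDoctor": 5_000,
    "USMLE": 1_500,
    "Medical flashcards": 3_000,
    "WikiDoc": 2_000,
}

# Final training sample count from the Model Details table.
final_training_samples = 13_041

approx_total = sum(approx_source_sizes.values())
not_in_train = approx_total - final_training_samples
not_in_train_fraction = not_in_train / approx_total

print(f"Approximate raw total: {approx_total:,}")
print(f"Not in final training set: ~{not_in_train:,} ({not_in_train_fraction:.0%})")
```

Because the inputs are rounded, the resulting gap should be read as an order-of-magnitude estimate, not an exact count of removed samples.
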

Note: source dataset sizes above sum to roughly 16.5K; the final training count of 13,041 reflects deduplication, filtering, and train/validation splitting during preprocessing.

## Intended Use

- Research and educational exploration of LLM-based clinical decision support
- Prototyping and evaluation of medical recommendation interfaces
- Assisting *licensed clinicians* as a secondary reference, never a primary decision source

### Out of Scope

- Direct-to-patient diagnostic or prescribing use without clinician oversight
- Emergency or urgent care guidance
- Use as a sole or final source for dosing, contraindications, or drug interactions
- Any deployment context implying regulatory clearance or clinical validation (this model has none)

## How to Use

This is a LoRA adapter: load the base model first, then apply the adapter. The tokenizer and chat template should be loaded **from this repo**, not from the base model, since they were saved alongside the trained adapter.

```bash
pip install torch transformers peft accelerate bitsandbytes
# unsloth may be required depending on how the bnb-4bit base loads in your environment:
pip install unsloth
```

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "unsloth/Phi-4-mini-instruct-unsloth-bnb-4bit"
adapter_id = "BlueMen/Medrx"

# Load tokenizer + chat template from THIS repo (not the base model)
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

# Base model ships pre-quantized (bnb-4bit), so no extra quant config needed
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    trust_remote_code=True,
)

model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

messages = [
    {"role": "user", "content": "Patient presents with persistent dry cough and mild fever for 5 days. Suggest possible treatment."}
]

# return_dict=True returns input_ids and attention_mask together,
# so generate() receives an explicit attention mask
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=300,
        temperature=0.3,
        do_sample=True,
    )

# Decode only the newly generated tokens, skipping the prompt
prompt_len = inputs["input_ids"].shape[-1]
response = tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True)
print(response)
```

### Optional: Merging the adapter

For deployment without the `peft` dependency, the adapter can be merged into the base weights:

```python
merged_model = model.merge_and_unload()
merged_model.save_pretrained("medrx-merged")
tokenizer.save_pretrained("medrx-merged")
```

Note that merging onto a 4-bit quantized base is lossy in some configurations. For highest-fidelity merges, consider reloading the base in full precision before merging, if disk and memory allow.

## Limitations and Risks

- **Not clinically validated.** This model has not been evaluated in a clinical trial, peer-reviewed study, or regulatory review process of any kind.
- **No regulatory clearance.** Not FDA-cleared, not CE-marked, not approved as a medical device in any jurisdiction.
- **Hallucination risk.** Like all LLMs, this model can generate plausible-sounding but incorrect drug names, dosages, contraindications, or interactions. All outputs require verification against authoritative clinical references before any real-world use.
- **Training data limitations.** Source datasets include community-authored and synthetic conversational content (e.g. WikiDoc, ChatDoctor) alongside exam-style questions (USMLE); none of these are equivalent to curated clinical guidelines or a licensed formulary.
- **No safety fine-tuning audit.** No formal red-teaming or bias evaluation has been conducted on this adapter's outputs.
- **General scope only.** Not specialized for pediatrics, geriatrics, pregnancy, rare disease, or specialty-specific protocols.

## Citation

If you use this adapter, please cite this repository:

```
@misc{medrx2025,
  title = {Phi-4-mini Drug & Treatment Recommendation LoRA Adapter v6.1},
  author = {BlueMen},
  year = {2025},
  url = {https://huggingface.co/BlueMen/Medrx}
}
```