---
license: apache-2.0
base_model: unsloth/Phi-4-mini-instruct-unsloth-bnb-4bit
library_name: peft
tags:
  - medical
  - clinical
  - drug-recommendation
  - phi-4
  - lora
  - qlora
language:
  - en
---

# Phi-4-mini Drug & Treatment Recommendation LoRA Adapter v6.1

This adapter is fine-tuned for evidence-based drug and treatment recommendation.
It is built on `unsloth/Phi-4-mini-instruct-unsloth-bnb-4bit` using QLoRA (4-bit NF4).

> **Clinical decision support only. Final decisions rest with the treating physician. This model is not a substitute for professional medical judgment.**

## Model Details

| Parameter | Value |
|---|---|
| Base model | `unsloth/Phi-4-mini-instruct-unsloth-bnb-4bit` |
| Original base | `microsoft/Phi-4-mini-instruct` |
| Method | QLoRA (4-bit NF4) |
| LoRA rank | 16 |
| Max sequence length | 1024 |
| Training samples | 13,041 |
| Final train loss | 0.2892 |
| Loss masking | Response-only (`DataCollatorForCompletionOnlyLM`) |
| Version | v6.1 |
| License | Apache 2.0 |

This repository contains **LoRA adapter weights only** (`adapter_model.safetensors`, ~35.7 MB). It must be loaded on top of the base model listed above — it is not a standalone model.
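The response-only loss masking listed in the table can be illustrated with a small pure-Python sketch. This is a simplified illustration of the idea, not TRL's implementation of `DataCollatorForCompletionOnlyLM`: the function name `mask_prompt_tokens`, the token ids, and the two-token assistant marker are all hypothetical, and it assumes a single-turn sample in which the marker occurs once.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default


def mask_prompt_tokens(input_ids, response_template_ids):
    """Return labels in which the prompt and the response marker carry no loss."""
    n = len(response_template_ids)
    labels = list(input_ids)
    # Scan for the first position where the response marker occurs.
    for start in range(len(input_ids) - n + 1):
        if input_ids[start:start + n] == response_template_ids:
            end = start + n
            # Everything up to and including the marker is ignored by the loss.
            labels[:end] = [IGNORE_INDEX] * end
            return labels
    # Marker not found: ignore the whole sample so it contributes no loss.
    return [IGNORE_INDEX] * len(input_ids)


# Hypothetical ids: 1-4 are the prompt, [90, 91] is the assistant marker, 7-9 the answer.
example_ids = [1, 2, 3, 4, 90, 91, 7, 8, 9]
example_labels = mask_prompt_tokens(example_ids, [90, 91])
print(example_labels)
```

Only the answer tokens keep their ids as labels, so the gradient comes from the model's response rather than from reproducing the user's prompt.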

## Training Data

The training set was drawn from a mix of publicly available medical Q&A and clinical reasoning sources, combined and curated to 13,041 samples after preprocessing:

- **Drug Q&A** (~5K) — drug information and usage question-answer pairs
- **ChatDoctor** (~5K) — patient-doctor conversational dataset
- **USMLE** (~1.5K) — US Medical Licensing Examination style questions
- **Medical flashcards** (~3K) — condition/treatment recall pairs
- **WikiDoc** (~2K) — community-authored clinical reference content

Scope is **general/broad medicine** — the model is not tuned for a specific specialty, age group, or condition subset. Coverage across specialties, rare conditions, and edge cases is uneven and has not been formally audited.

Note: source dataset sizes above sum to roughly 16.5K; the final training count of 13,041 reflects deduplication, filtering, and train/validation splitting during preprocessing.
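The arithmetic behind that note can be checked with a few lines of Python. The per-source counts below are the rounded figures from the list above, so the derived numbers are rough estimates rather than exact preprocessing statistics; the variable names are illustrative.

```python
# Approximate per-source sizes as listed above (rounded figures, not exact counts).
approx_source_sizes = {
    "Drug Q&A": 5000,
    "ChatDoctor": 5000,
    "USMLE": 1500,
    "Medical flashcards": 3000,
    "WikiDoc": 2000,
}
final_train_samples = 13041  # from the Model Details table

raw_total = sum(approx_source_sizes.values())
# Samples removed by deduplication and filtering, or held out for validation.
not_in_train = raw_total - final_train_samples
retained_fraction = final_train_samples / raw_total

print(f"raw total: ~{raw_total}")
print(f"not in training set: ~{not_in_train}")
print(f"retained for training: ~{retained_fraction:.0%}")
```

Under these rounded inputs, roughly four in five source samples end up in the training set; the split between deduplication, filtering, and validation hold-out is not broken down here.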

## Intended Use

- Research and educational exploration of LLM-based clinical decision support
- Prototyping and evaluation of medical recommendation interfaces
- Assisting *licensed clinicians* as a secondary reference, never a primary decision source

### Out of Scope

- Direct-to-patient diagnostic or prescribing use without clinician oversight
- Emergency or urgent care guidance
- Use as a sole or final source for dosing, contraindications, or drug interactions
- Any deployment context implying regulatory clearance or clinical validation (this model has none)

## How to Use

This is a LoRA adapter — load the base model first, then apply the adapter. The tokenizer and chat template should be loaded **from this repo**, not from the base model, since they were saved alongside the trained adapter.

```bash
pip install torch transformers peft accelerate bitsandbytes
# unsloth may be required depending on how the bnb-4bit base loads in your environment:
pip install unsloth
```

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "unsloth/Phi-4-mini-instruct-unsloth-bnb-4bit"
adapter_id = "BlueMen/Medrx"

# Load tokenizer + chat template from THIS repo (not the base model)
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

# Base model ships pre-quantized (bnb-4bit), so no extra quant config needed
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    trust_remote_code=True,
)

model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

messages = [
    {"role": "user", "content": "Patient presents with persistent dry cough and mild fever for 5 days. Suggest possible treatment."}
]

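# return_dict=True returns both input_ids and attention_mask,
# so generate() does not have to guess the attention mask
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=300,
        temperature=0.3,
        do_sample=True,
    )

# Decode only the newly generated tokens, skipping the prompt
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(output[0][prompt_length:], skip_special_tokens=True)
print(response)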
```

### Optional: Merging the adapter

For deployment without the `peft` dependency, the adapter can be merged into the base weights:

```python
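# Fold the LoRA weights into the base model and drop the PEFT wrappers
merged_model = model.merge_and_unload()
merged_model.save_pretrained("medrx-merged")
# Save this repo's tokenizer and chat template next to the merged weights
tokenizer.save_pretrained("medrx-merged")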
```

Note that merging into a 4-bit quantized base can introduce rounding error, since the merged weights are re-quantized. For the highest-fidelity merge, consider reloading the base in full precision before merging, if disk and memory allow.
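The full-precision route suggested above might look like the following sketch. It assumes the adapter applies cleanly to the original base `microsoft/Phi-4-mini-instruct` (listed under Model Details), that there is enough memory to hold the bf16 weights, and that `transformers`, `peft`, and `torch` are installed. The function name `merge_in_full_precision` and the default output directory are hypothetical, and this path has not been verified against this adapter.

```python
def merge_in_full_precision(
    adapter_id,
    base_id="microsoft/Phi-4-mini-instruct",
    out_dir="medrx-merged-bf16",  # hypothetical output directory
):
    """Merge the LoRA adapter into a bf16 copy of the original base model."""
    # Imported lazily so that defining this function does not require the heavy libraries.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the unquantized base in bfloat16 (on CPU by default).
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

    # Attach the adapter, then fold its weights into the base layers.
    model = PeftModel.from_pretrained(base, adapter_id)
    merged = model.merge_and_unload()

    # Save merged weights together with the tokenizer and chat template from the adapter repo.
    merged.save_pretrained(out_dir)
    AutoTokenizer.from_pretrained(adapter_id).save_pretrained(out_dir)
    return out_dir
```

Calling `merge_in_full_precision("BlueMen/Medrx")` would download the full-precision base, so expect a noticeably larger disk and memory footprint than the 4-bit path. It is worth comparing a few generations from the merged model against the adapter-on-4-bit setup before relying on it.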

## Limitations and Risks

- **Not clinically validated.** This model has not been evaluated in a clinical trial, peer-reviewed study, or regulatory review process of any kind.
- **No regulatory clearance.** Not FDA-cleared, not CE-marked, not approved as a medical device in any jurisdiction.
- **Hallucination risk.** Like all LLMs, this model can generate plausible-sounding but incorrect drug names, dosages, contraindications, or interactions. All outputs require verification against authoritative clinical references before any real-world use.
- **Training data limitations.** Source datasets include community-authored and synthetic conversational content (e.g. WikiDoc, ChatDoctor) alongside exam-style questions (USMLE); none of these are equivalent to curated clinical guidelines or a licensed formulary.
- **No safety fine-tuning audit.** No formal red-teaming or bias evaluation has been conducted on this adapter's outputs.
- **General scope only.** Not specialized for pediatrics, geriatrics, pregnancy, rare disease, or specialty-specific protocols.

## Citation

If you use this adapter, please cite this repository:

```bibtex
@misc{medrx2025,
  title  = {Phi-4-mini Drug \& Treatment Recommendation LoRA Adapter v6.1},
  author = {BlueMen},
  year   = {2025},
  url    = {https://huggingface.co/BlueMen/Medrx}
}
```