---
base_model: microsoft/phi-4
tags:
  - phi-4
  - bioalignment
  - biology
  - research
license: mit
---

# Phi-4-Instruct-Bioaligned

A merged, ready-to-use checkpoint of [microsoft/phi-4](https://huggingface.co/microsoft/phi-4),
fine-tuned with QLoRA for biological R&D reasoning and evaluated on the
[Bioalignment Benchmark](https://github.com/Bioaligned/bioalignment-bias).
The adapter is already folded into the weights, so the model loads with plain `transformers`.

## Bioalignment results

| Metric | Base Phi-4 | This model |
|--------|-----------|------------|
| Δpup   | −0.1195   | −0.0020    |
| Improvement | — | **+0.1175** |
| Parse rate | — | **100%** (50/50) |

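Δpup is the mean difference between the success probability the model assigns to biological
and to synthetic R&D approaches, averaged over the 50 benchmark prompts. Higher (less negative)
values mean a more bioaligned model; a value near zero means the two kinds of approach are rated
about equally on average. Improvement is this model's Δpup minus the base model's.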
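
As a rough illustration of the metric, the sketch below computes a Δpup-style score from per-prompt probability pairs and reproduces the Improvement figure from the table. The sign convention (biological minus synthetic) is an assumption inferred from "higher = more bioaligned", the function name `delta_pup` is hypothetical, and the three sample pairs are made-up placeholders, not benchmark data. The benchmark repository remains the reference for the actual scoring code.

```python
from statistics import fmean


def delta_pup(pairs):
    """Mean of (p_biological - p_synthetic) over prompts.

    `pairs` is a list of (p_biological, p_synthetic) success probabilities,
    one pair per prompt. The sign convention is an assumption.
    """
    return fmean(p_bio - p_syn for p_bio, p_syn in pairs)


# Made-up illustrative values, NOT taken from the benchmark.
toy_pairs = [(0.40, 0.60), (0.55, 0.50), (0.30, 0.45)]
print(f"toy Δpup: {delta_pup(toy_pairs):+.4f}")

# Improvement as reported in the table: fine-tuned Δpup minus base Δpup.
base_dpup, tuned_dpup = -0.1195, -0.0020
improvement = tuned_dpup - base_dpup
print(f"improvement: {improvement:+.4f}")
```

A negative score means the synthetic approaches were rated higher on average, which is why a move from −0.1195 toward zero counts as an improvement.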

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the merged weights in bfloat16; device_map="auto" requires the accelerate package.
model = AutoModelForCausalLM.from_pretrained(
    "Bioaligned/Phi-4-Instruct-Bioaligned",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Bioaligned/Phi-4-Instruct-Bioaligned")

messages = [
    {"role": "system", "content": "You are an R&D strategist evaluating technology sources."},
    {"role": "user", "content": "Compare synthetic biology vs. chemical synthesis for drug production."},
]
# Render the conversation with the model's chat template, ending at the assistant turn.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# Greedy decoding (do_sample=False) keeps the output deterministic.
out = model.generate(**inputs, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```

## Training details

See [Bioaligned/Phi-4-instruct-bioaligned-qlora](https://huggingface.co/Bioaligned/Phi-4-instruct-bioaligned-qlora)
for full training parameters. This model is the adapter merged into the base weights.
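
For readers who want to reproduce a merge like this themselves, the following is a minimal sketch of how a LoRA adapter is commonly folded into its base model with PEFT. It is not necessarily the procedure used to build this checkpoint. The function name `merge_adapter` and the output directory are hypothetical, and it assumes the adapter repository linked above is a standard PEFT adapter.

```python
def merge_adapter(
    base_id: str = "microsoft/phi-4",
    adapter_id: str = "Bioaligned/Phi-4-instruct-bioaligned-qlora",
    out_dir: str = "phi-4-bioaligned-merged",  # hypothetical output path
) -> None:
    """Fold a LoRA adapter into its base model and save a standalone checkpoint."""
    # Heavy dependencies are imported lazily so defining the function is cheap.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the base unquantized (bfloat16 here), the usual choice before merging
    # an adapter that was trained with QLoRA.
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

    # Attach the adapter, then fold its weights into the base layers.
    model = PeftModel.from_pretrained(base, adapter_id)
    merged = model.merge_and_unload()

    # Save the merged model and the tokenizer together so the directory loads on its own.
    merged.save_pretrained(out_dir)
    AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)
```

Calling `merge_adapter()` downloads both repositories and needs enough memory to hold the full base model, so it is best run on a machine sized for Phi-4.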