---
base_model: microsoft/phi-2
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:microsoft/phi-2
- lora
- transformers
license: cc-by-nc-4.0
datasets:
- Gaykar/DrugData
---

# Model Card for Gaykar/Phi2-drug_data

This model is a LoRA-based fine-tuned variant of Microsoft Phi-2, designed to generate concise, medical-style textual descriptions of drugs. Given a drug name as input, the model produces a short, single-paragraph description following an instruction-style prompt format.

The training pipeline consists of two stages:

1. **Continued Pretraining (CPT)** on domain-relevant medical and pharmaceutical text, to adapt the base model to the language and terminology of the domain.
2. **Supervised Fine-Tuning (SFT)** using structured drug name–description pairs, to guide the model toward consistent formatting and a domain-specific writing style.

This model is intended **strictly for educational and research purposes** and must not be used for real-world medical, clinical, or decision-making applications.

---

## Model Details

### Model Description

This model is a parameter-efficient fine-tuned version of the Microsoft Phi-2 language model, adapted to generate concise medical drug descriptions from drug names. The training pipeline consists of two stages:

1. **Continued Pretraining (CPT)** to adapt the base model to drug and medical terminology.
2. **Supervised Fine-Tuning (SFT)** using instruction-style input–output pairs (a sketch of this format follows below).

LoRA adapters were used during fine-tuning to reduce memory usage and training cost while preserving base model knowledge.

- **Developed by:** Atharva Gaykar
- **Funded by:** Not applicable
- **Shared by:** Atharva Gaykar
- **Model type:** Causal Language Model (LoRA-adapted)
- **Language(s) (NLP):** English
- **License:** CC-BY-NC 4.0
- **Finetuned from model:** microsoft/phi-2
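For illustration, an SFT pair in this style might look like the sketch below. This is an assumption reconstructed from the inference prompt and example output shown in "How to Get Started with the Model"; the exact training template was not published.

```python
# Hypothetical SFT input–output pair, reconstructed from the inference
# prompt used later in this card; the actual training template may differ.
sft_example = {
    "prompt": (
        "Generate exactly ONE sentence describing the drug.\n"
        "Do not include headings or extra information.\n\n"
        "Drug Name: Paracetamol\n"
        "Description:"
    ),
    "completion": (
        " Paracetamol (acetaminophen) is a non-narcotic analgesic and "
        "antipyretic used to relieve mild to moderate pain and reduce fever."
    ),
}
```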
---

## Uses

This model is designed to generate concise, medical-style descriptions of drugs given their names.

### Direct Use

- Educational demonstrations of instruction-following language models
- Academic research on medical-domain adaptation
- Experimentation with CPT + SFT pipelines
- Studying hallucination behavior in domain-specific LLMs

The model should only be used in **non-production, educational, or research settings**.

### Out-of-Scope Use

This model is **not designed or validated** for:

- Medical diagnosis or treatment planning
- Clinical decision support systems
- Dosage recommendations or prescribing guidance
- Patient-facing healthcare applications
- Professional medical, pharmaceutical, or regulatory use
- Any real-world deployment where incorrect medical information could cause harm

---

## Bias, Risks, and Limitations

This model was developed **solely for educational purposes** and **must not be used in real-world medical or clinical decision-making**.

### Known Limitations

- May hallucinate incorrect drug indications or mechanisms
- Generated descriptions may be incomplete or outdated
- Does not verify outputs against authoritative medical sources
- Does not understand patient context, dosage, or drug interactions
- Output quality is sensitive to prompt phrasing

### Risks

- Misinterpretation of outputs as medical advice
- Overconfidence in fluent but inaccurate responses
- Potential propagation of misinformation if misused

### Recommendations

- Always verify outputs against trusted medical references
- Use only in controlled, non-production environments
- Clearly disclose limitations in any downstream use
- Avoid deployment in safety-critical or healthcare systems

---

## How to Get Started with the Model

This repository contains **LoRA adapter weights**, not a full model; the adapter must be loaded on top of the `microsoft/phi-2` base model. Example usage:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, "Gaykar/Phi2-drug_data")
model.eval()

# Drug to evaluate
drug_name = "Paracetamol"

# Build the evaluation prompt
eval_prompt = (
    "Generate exactly ONE sentence describing the drug.\n"
    "Do not include headings or extra information.\n\n"
    f"Drug Name: {drug_name}\n"
    "Description:"
)

# Tokenize the prompt
model_input = tokenizer(eval_prompt, return_tensors="pt").to(model.device)

# Generate output. Greedy decoding (do_sample=False, num_beams=1) is used
# deliberately: this model operates in the medical domain, where factual
# consistency and determinism matter more than linguistic diversity.
with torch.no_grad():
    output = model.generate(
        **model_input,
        do_sample=False,
        num_beams=1,
        max_new_tokens=120,
        repetition_penalty=1.1,
        eos_token_id=tokenizer.eos_token_id,
    )

# Strip the prompt tokens and decode only the generated continuation
prompt_length = model_input["input_ids"].shape[1]
generated_tokens = output[0][prompt_length:]
generated_text = tokenizer.decode(
    generated_tokens,
    skip_special_tokens=True,
).strip()

# Enforce single-sentence output
if "." in generated_text:
    generated_text = generated_text.split(".")[0] + "."

print("DRUG NAME:", drug_name)
print("MODEL GENERATED DESCRIPTION:")
print(generated_text)

# Example output:
# DRUG NAME: Paracetamol
# MODEL GENERATED DESCRIPTION:
# Paracetamol (acetaminophen) is a non-narcotic analgesic and antipyretic
# used to relieve mild to moderate pain and reduce fever.
```
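For repeated inference, the adapter can optionally be folded into the base weights with PEFT's `merge_and_unload`, so the result loads as a plain `transformers` model. A minimal sketch, continuing from the snippet above (the output directory name is arbitrary):

```python
# Fold the LoRA weights into the base model; the returned object is a
# regular transformers model with no PEFT dependency.
merged_model = model.merge_and_unload()

# Persist the merged weights and tokenizer (directory name is arbitrary).
merged_model.save_pretrained("phi2-drugdata-merged")
tokenizer.save_pretrained("phi2-drugdata-merged")

# Reload later without PEFT:
# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained("phi2-drugdata-merged")
```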
---

## Training Details

### Training Data

- **Dataset:** Gaykar/DrugData
- Structured drug name–description pairs
- Used for both CPT (domain adaptation) and SFT (instruction following)

### Training Procedure

#### Continued Pretraining (CPT)

The base model was further trained on domain-relevant medical and drug-related text to improve familiarity with terminology and style. CPT focused on next-token prediction without instruction formatting.

#### Supervised Fine-Tuning (SFT)

After CPT, the model was fine-tuned using instruction-style prompts to generate concise medical descriptions from drug names.

#### Training Hyperparameters

**CPT Hyperparameters**

| Hyperparameter          | Value               |
| ----------------------- | ------------------- |
| Batch size (per device) | 1                   |
| Effective batch size    | 8                   |
| Epochs                  | 4                   |
| Learning rate           | 2e-4                |
| Precision               | FP16                |
| Optimizer               | Paged AdamW (8-bit) |
| Logging steps           | 10                  |
| Checkpoint saving       | Every 500 steps     |
| Checkpoint limit        | 2                   |

**SFT Hyperparameters**

| Hyperparameter          | Value               |
| ----------------------- | ------------------- |
| Batch size (per device) | 4                   |
| Gradient accumulation   | 1                   |
| Effective batch size    | 4                   |
| Epochs                  | 5                   |
| Learning rate           | 2e-5                |
| LR scheduler            | Linear              |
| Warmup ratio            | 6%                  |
| Weight decay            | 1e-4                |
| Max gradient norm       | 1.0                 |
| Precision               | FP16                |
| Optimizer               | Paged AdamW (8-bit) |
| Checkpoint saving       | Every 50 steps      |
| Checkpoint limit        | 2                   |
| Experiment tracking     | Weights & Biases    |

---

## Evaluation

### Testing Data

Drug names sampled from the same dataset were used for evaluation. Outputs were assessed for factual correctness using an external LLM-based evaluation approach.

### Metrics

**Evaluation Method:** LLM-as-a-Judge (ChatGPT, with web search available)

- Binary classification: Factually Correct / Hallucinated
- Three evaluation batches

### Results

**Batch 1**

| Category              | Count | Percentage |
| --------------------- | ----- | ---------- |
| Total Drugs Evaluated | 25    | 100%       |
| Factually Correct     | 24    | 96%        |
| Hallucinated / Failed | 1     | 4%         |

**Batch 2**

| Category              | Count | Percentage |
| --------------------- | ----- | ---------- |
| Total Drugs Evaluated | 25    | 100%       |
| Factually Correct     | 22    | 88%        |
| Hallucinated / Failed | 3     | 12%        |

**Batch 3**

| Category              | Count | Percentage |
| --------------------- | ----- | ---------- |
| Total Drugs Evaluated | 22    | 100%       |
| Factually Correct     | 15    | 68%        |
| Hallucinated / Failed | 7     | 32%        |

#### Summary

Because this model was fine-tuned (CPT + SFT) using LoRA rather than full-parameter fine-tuning, eliminating hallucinations entirely is challenging: LoRA enables efficient training and strong instruction-following behavior, but it does not fully overwrite the base model's internal knowledge. Despite this limitation, the model performs well for educational and research-oriented drug description generation tasks.

---

## Environmental Impact

- **Hardware Type:** NVIDIA T4 GPU
- **Hours used:** Not recorded
- **Cloud Provider:** Google Colab
- **Compute Region:** Not specified
- **Carbon Emitted:** Not estimated

---

## Technical Specifications

### Model Architecture and Objective

- Base model: Microsoft Phi-2
- Objective: Instruction-following text generation
- Adaptation method: LoRA (PEFT)

### Compute Infrastructure

#### Hardware

- NVIDIA T4 GPU

#### Software

- Transformers
- PEFT
- PyTorch

---

## Model Card Contact

Atharva Gaykar

### Framework Versions

- PEFT 0.18.0
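To check that a local environment matches, the installed versions can be printed. Only the PEFT version is recorded above; the exact transformers and torch versions used in training were not logged:

```python
from importlib.metadata import version

# Only PEFT 0.18.0 is recorded in this card; transformers and torch
# versions were not logged, so they are printed here for reference only.
for pkg in ("peft", "transformers", "torch"):
    print(pkg, version(pkg))
```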