Gaykar/Phi2-drug_data

@@ -6,202 +6,318 @@ tags:
 - base_model:adapter:microsoft/phi-2
 - lora
 - transformers
 ---
 # Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
 ### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
 ## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
 ### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
 ## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]
-### Framework versions
-- PEFT 0.18.1

 - base_model:adapter:microsoft/phi-2
 - lora
 - transformers
+license: cc-by-nc-4.0
+datasets:
+- Gaykar/DrugData
 ---
 # Model Card for Model ID
+This model is a LoRA-based fine-tuned variant of Microsoft Phi-2, designed to generate concise, medical-style textual descriptions of drugs.
+Given a drug name as input, the model produces a short, single-paragraph description following an instruction-style prompt format.
+The training pipeline consists of two stages:
+1. **Continued Pretraining (CPT)** on domain-relevant medical and pharmaceutical text, to adapt the base model to the language and terminology of the domain.
+2. **Supervised Fine-Tuning (SFT)** on structured drug name–description pairs, to guide the model toward consistent formatting and a domain-specific writing style.
+This model is intended **strictly for educational and research purposes** and must not be used for real-world medical, clinical, or decision-making applications.
+---
+## Model Details
+### Model Description
+This model is a parameter-efficient fine-tuned version of the Microsoft Phi-2 language model, adapted to generate concise medical drug descriptions from drug names. The training pipeline consists of two stages:
+1. **Continued Pretraining (CPT)** to adapt the base model to drug and medical terminology.
+2. **Supervised Fine-Tuning (SFT)** using instruction-style input–output pairs.
+LoRA adapters were used during fine-tuning to reduce memory usage and training cost while preserving base model knowledge.
+- **Developed by:** Atharva Gaykar
+- **Funded by:** Not applicable
+- **Shared by:** Atharva Gaykar
+- **Model type:** Causal Language Model (LoRA-adapted)
+- **Language(s) (NLP):** English
+- **License:** CC-BY-NC 4.0
+- **Finetuned from model:** microsoft/phi-2
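To illustrate why LoRA adapters reduce memory usage and training cost, the sketch below counts the parameters a single adapter adds to one square linear layer. The hidden size of 2560 is assumed from Phi-2's published configuration, and the rank of 16 is a hypothetical value, since this card does not state the adapter rank or target modules.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * d_in + d_out * rank


def dense_params(d_in: int, d_out: int) -> int:
    """Parameters in the frozen dense weight matrix (bias ignored)."""
    return d_in * d_out


hidden = 2560  # assumed Phi-2 hidden size
rank = 16      # hypothetical LoRA rank, not taken from this card

added = lora_trainable_params(hidden, hidden, rank)
frozen = dense_params(hidden, hidden)
print(f"LoRA params: {added:,} vs dense params: {frozen:,} ({added / frozen:.2%})")
```

Under these assumptions only the small `A` and `B` matrices receive gradients, while the dense weight stays frozen.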
+---
+## Uses
+This model is designed to generate concise medical-style descriptions of drugs given their names.
+### Direct Use
+- Educational demonstrations of instruction-following language models
+- Academic research on medical-domain adaptation
+- Experimentation with CPT + SFT pipelines
+- Studying hallucination behavior in domain-specific LLMs
+The model should only be used in **non-production, educational, or research settings**.
 ### Out-of-Scope Use
+This model is **not designed or validated** for:
+- Medical diagnosis or treatment planning
+- Clinical decision support systems
+- Dosage recommendations or prescribing guidance
+- Patient-facing healthcare applications
+- Professional medical, pharmaceutical, or regulatory use
+- Any real-world deployment where incorrect medical information could cause harm
+---
 ## Bias, Risks, and Limitations
+This model was developed **solely for educational purposes** and **must not be used in real-world medical or clinical decision-making**.
+### Known Limitations
+- May hallucinate incorrect drug indications or mechanisms
+- Generated descriptions may be incomplete or outdated
+- Does not verify outputs against authoritative medical sources
+- Does not understand patient context, dosage, or drug interactions
+- Output quality is sensitive to prompt phrasing
+### Risks
+- Misinterpretation of outputs as medical advice
+- Overconfidence in fluent but inaccurate responses
+- Potential propagation of misinformation if misused
 ### Recommendations
+- Always verify outputs using trusted medical references
+- Use only in controlled, non-production environments
+- Clearly disclose limitations in any downstream use
+- Avoid deployment in safety-critical or healthcare systems
+---
 ## How to Get Started with the Model
+This repository contains **LoRA adapter weights**, not a full model.
+Example usage (conceptual):
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+from peft import PeftModel
+# Load base model and tokenizer
+base_model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")
+tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")
+# Load the LoRA adapter on top of the base model
+model = PeftModel.from_pretrained(base_model, "Gaykar/Phi2-drug_data")
+model.eval()
+# Drug to evaluate
+drug_name = "Paracetamol"
+# Build evaluation prompt
+eval_prompt = (
+    "Generate exactly ONE sentence describing the drug.\n"
+    "Do not include headings or extra information.\n\n"
+    f"Drug Name: {drug_name}\n"
+    "Description:"
+)
+# Tokenize prompt and move it to the model's device
+model_input = tokenizer(
+    eval_prompt,
+    return_tensors="pt"
+).to(model.device)
+# Generate output with greedy decoding. This choice matters for this model:
+# in the medical domain, factual consistency and determinism are more
+# important than linguistic diversity.
+with torch.no_grad():
+    output = model.generate(
+        **model_input,
+        do_sample=False,
+        num_beams=1,
+        max_new_tokens=120,
+        repetition_penalty=1.1,
+        eos_token_id=tokenizer.eos_token_id
+    )
+# Remove prompt tokens
+prompt_length = model_input["input_ids"].shape[1]
+generated_tokens = output[0][prompt_length:]
+# Decode generated text only
+generated_text = tokenizer.decode(
+    generated_tokens,
+    skip_special_tokens=True
+).strip()
+# Enforce single-sentence output (cuts at the first period, including decimals)
+if "." in generated_text:
+    generated_text = generated_text.split(".")[0] + "."
+print("DRUG NAME:", drug_name)
+print("MODEL GENERATED DESCRIPTION:")
+print(generated_text)
+# Example output:
+# DRUG NAME: Paracetamol
+# MODEL GENERATED DESCRIPTION:
+# Paracetamol (acetaminophen) is a non-narcotic analgesic and antipyretic used to relieve mild to moderate pain and reduce fever.
+```
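The single-sentence step in the usage example cuts at the first period, which would also truncate a description at a decimal such as `0.5 mg`. A slightly more careful sketch is shown below, assuming a sentence ends at `.`, `!`, or `?` followed by whitespace or the end of the text. `first_sentence` is a hypothetical helper, not part of this repository.

```python
import re

# Zero-width match just after ., ! or ? when whitespace or end of text follows.
_SENTENCE_END = re.compile(r"(?<=[.!?])(?=\s|$)")


def first_sentence(text: str) -> str:
    """Return the text up to and including the first sentence-ending mark."""
    text = text.strip()
    match = _SENTENCE_END.search(text)
    return text[: match.start()] if match else text


print(first_sentence("Usual dose is 0.5 mg daily. It is taken orally."))
```

Abbreviations followed by a space (for example `e.g. `) would still be treated as a sentence end, so this remains a heuristic.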
+---
+## Training Details
+### Training Data
+* **Dataset:** Gaykar/DrugData
+* Structured drug name–description pairs
+* Used for both CPT (domain adaptation) and SFT (instruction following)
+### Training Procedure
+#### Continued Pretraining (CPT)
+The base model was further trained on domain-relevant medical and drug-related text to improve familiarity with terminology and style. CPT focused on next-token prediction without instruction formatting.
+#### Supervised Fine-Tuning (SFT)
+After CPT, the model was fine-tuned using instruction-style prompts to generate concise medical descriptions from drug names.
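The difference between the two stages can be sketched as two ways of turning the same record into training text. The field names `drug_name` and `description`, the `prompt`/`completion` layout, and the reuse of the evaluation prompt as the SFT template are all assumptions; the exact training templates are not documented in this card. The record is illustrative, not a row from Gaykar/DrugData.

```python
def format_cpt(record: dict) -> str:
    """CPT: plain domain text for next-token prediction, no instruction wrapper."""
    return f"{record['drug_name']}: {record['description']}"


def format_sft(record: dict) -> dict:
    """SFT: instruction-style prompt plus the target completion."""
    prompt = (
        "Generate exactly ONE sentence describing the drug.\n"
        "Do not include headings or extra information.\n\n"
        f"Drug Name: {record['drug_name']}\n"
        "Description:"
    )
    return {"prompt": prompt, "completion": " " + record["description"]}


# Illustrative record only (hypothetical field names and text).
example = {
    "drug_name": "Paracetamol",
    "description": "An analgesic and antipyretic used for pain and fever.",
}
print(format_cpt(example))
print(format_sft(example)["prompt"])
```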
+#### Training Hyperparameters
+**CPT Hyperparameters**
+| Hyperparameter          | Value               |
+| ----------------------- | ------------------- |
+| Batch size (per device) | 1                   |
+| Effective batch size    | 8                   |
+| Epochs                  | 4                   |
+| Learning rate           | 2e-4                |
+| Precision               | FP16                |
+| Optimizer               | Paged AdamW (8-bit) |
+| Logging steps           | 10                  |
+| Checkpoint saving       | Every 500 steps     |
+| Checkpoint limit        | 2                   |
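The CPT table lists a per-device batch size of 1 and an effective batch size of 8. Assuming a single GPU (the card mentions one NVIDIA T4), that implies 8 gradient-accumulation steps. The arithmetic can be sketched as follows; the function name is hypothetical.

```python
def accumulation_steps(effective_batch: int, per_device_batch: int, num_devices: int = 1) -> int:
    """Gradient-accumulation steps needed to reach the effective batch size."""
    per_step = per_device_batch * num_devices
    if effective_batch % per_step != 0:
        raise ValueError("effective batch must be a multiple of the per-step batch")
    return effective_batch // per_step


# CPT values from the table, single device assumed.
print(accumulation_steps(effective_batch=8, per_device_batch=1))
```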
+**SFT Hyperparameters**
+| Hyperparameter          | Value               |
+| ----------------------- | ------------------- |
+| Batch size (per device) | 4                   |
+| Gradient accumulation   | 1                   |
+| Effective batch size    | 4                   |
+| Epochs                  | 5                   |
+| Learning rate           | 2e-5                |
+| LR scheduler            | Linear              |
+| Warmup ratio            | 6%                  |
+| Weight decay            | 1e-4                |
+| Max gradient norm       | 1.0                 |
+| Precision               | FP16                |
+| Optimizer               | Paged AdamW (8-bit) |
+| Checkpoint saving       | Every 50 steps      |
+| Checkpoint limit        | 2                   |
+| Experiment tracking     | Weights & Biases    |
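For reference, a linear scheduler with a 6% warmup ratio behaves roughly as sketched below: the learning rate rises linearly to its peak during warmup, then decays linearly to zero. The total of 1000 steps is hypothetical, since the number of training steps is not given, and the function is a standalone illustration rather than the scheduler code used in training.

```python
def linear_warmup_lr(step: int, total_steps: int, peak_lr: float = 2e-5,
                     warmup_ratio: float = 0.06) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay."""
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


total = 1000  # hypothetical step count
for s in (0, 30, 60, 530, 1000):
    print(s, linear_warmup_lr(s, total))
```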
+---
+## Evaluation
+### Testing Data
+Drug names sampled from the training dataset (Gaykar/DrugData) were used for evaluation. Each generated description was judged for factual correctness by an external LLM.
+### Metrics
+**Evaluation Method:** LLM-as-a-Judge (ChatGPT with web search enabled)
+* Binary classification: Factually Correct / Hallucinated
+* Three evaluation batches
+### Results
+**Batch 1**
+| Category              | Count | Percentage |
+| --------------------- | ----- | ---------- |
+| Total Drugs Evaluated | 25    | 100%       |
+| Factually Correct     | 24    | 96%        |
+| Hallucinated / Failed | 1     | 4%         |
+**Batch 2**
+| Category              | Count | Percentage |
+| --------------------- | ----- | ---------- |
+| Total Drugs Evaluated | 25    | 100%       |
+| Factually Correct     | 22    | 88%        |
+| Hallucinated / Failed | 3     | 12%        |
+**Batch 3**
+| Category              | Count | Percentage |
+| --------------------- | ----- | ---------- |
+| Total Drugs Evaluated | 22    | 100%       |
+| Factually Correct     | 15    | 68%        |
+| Hallucinated / Failed | 7     | 32%        |
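The per-batch percentages can be reproduced, and a pooled figure derived, from the "Total Drugs Evaluated" and "Factually Correct" counts in the tables. The pooled accuracy is simple arithmetic on those counts and is not a separately reported result.

```python
# Counts copied from the three result tables (total evaluated, factually correct).
batches = [
    {"name": "Batch 1", "total": 25, "correct": 24},
    {"name": "Batch 2", "total": 25, "correct": 22},
    {"name": "Batch 3", "total": 22, "correct": 15},
]

for b in batches:
    print(f"{b['name']}: {b['correct'] / b['total']:.0%} factually correct")

total = sum(b["total"] for b in batches)
correct = sum(b["correct"] for b in batches)
print(f"Pooled: {correct}/{total} = {correct / total:.1%}")
```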
+#### Summary
+Because this model was adapted with LoRA (CPT followed by SFT) rather than full-parameter fine-tuning, eliminating hallucinations entirely is challenging. LoRA enables efficient training and strong instruction-following behavior, but it does not fully overwrite the base model’s internal knowledge. Within this limitation, the model was judged factually correct for 68% to 96% of drugs per batch, which is adequate only for educational and research-oriented drug description generation.
+---
+## Environmental Impact
+* **Hardware Type:** NVIDIA T4 GPU
+* **Hours used:** Not recorded
+* **Cloud Provider:** Google Colab
+* **Compute Region:** Not specified
+* **Carbon Emitted:** Not estimated
+---
+## Technical Specifications
+### Model Architecture and Objective
+* Base model: Microsoft Phi-2
+* Objective: Instruction-following text generation
+* Adaptation method: LoRA (PEFT)
+### Compute Infrastructure
+#### Hardware
+* NVIDIA T4 GPU
+#### Software
+* Transformers
+* PEFT
+* PyTorch
+---
+## Model Card Contact
+Atharva Gaykar
+### Framework Versions
+* PEFT 0.18.0