---
license: mit
tags:
- phi2
- alpaca
- instruction-tuning
- causal-lm
- lora
datasets:
- yahma/alpaca-cleaned
- custom
base_model: microsoft/phi-2
---

# Phi‑2‑Alpaca‑LoRA

[GitHub Repository](https://github.com/IrfanUruchi/phi-2-alpaca-lora)
[Model on Hugging Face](https://huggingface.co/Irfanuruchi/phi-2-alpaca-lora)
[MIT License](https://huggingface.co/microsoft/phi-2/blob/main/LICENSE)

---

### Overview

This repository contains LoRA‑tuned weights for **microsoft/phi‑2 (2.7B)**.
The adapters were trained on:

- [yahma/alpaca-cleaned](https://huggingface.co/datasets/yahma/alpaca-cleaned) (~5k instructions)
- Custom instruction datasets (collected separately)

LoRA was applied to the `q_proj`, `k_proj`, `v_proj`, and `dense` layers of the transformer, and the adapters were merged into the base weights after training to produce a standalone Hugging Face checkpoint (see the sketch below).
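The actual training script is not included in this repository, so the following is only a minimal sketch of how adapters with this configuration could be created and merged using the PEFT library; the base-model loading details and output path are assumptions.

```python
# Minimal sketch, not the author's actual training script.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Load the base model that the adapters were trained against.
base = AutoModelForCausalLM.from_pretrained("microsoft/phi-2", trust_remote_code=True)

# LoRA settings mirroring this card: rank 16, alpha 32, dropout 0.05,
# applied to the attention projections and dense layers named above.
config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()

# ... fine-tune here ...

# Fold the adapter weights back into the base model so the result
# loads as an ordinary Hugging Face checkpoint.
merged = model.merge_and_unload()
merged.save_pretrained("phi-2-alpaca-lora")
```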
---

### Training setup

- **LoRA config**: rank=16, α=32, dropout=0.05
- **Max sequence length**: 256 tokens
- **Optimizer**: AdamW, learning rate 2e‑4
- **Precision**: bf16 (fp16 fallback)
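The training loop itself is not published. As a rough reconstruction, the settings above could be expressed as Hugging Face `TrainingArguments` like this; batch size, epoch count, and output directory are placeholders, not values from this card.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="phi2-alpaca-lora",     # placeholder path
    learning_rate=2e-4,                # AdamW is the transformers default optimizer
    bf16=True,                         # use fp16=True instead on GPUs without bf16 support
    per_device_train_batch_size=4,     # assumption: not stated in this card
    num_train_epochs=3,                # assumption: not stated in this card
)

# The 256-token cap would be enforced at tokenization time, e.g.:
# tokenizer(text, truncation=True, max_length=256)
```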
---

### Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Irfanuruchi/phi-2-alpaca-lora"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# Load in half precision and spread across available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)

prompt = "### Instruction: List three advantages of modular code.\n### Response:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.inference_mode():
    outputs = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=True,       # required for temperature/top_p to take effect
        temperature=0.7,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,  # phi-2 has no pad token
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
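The card only shows the instruction/response form of the prompt. If you also need Alpaca-style input fields, a small helper keeps prompts consistent; the `### Input:` segment is an assumption carried over from the alpaca-cleaned schema, not something this card specifies.

```python
def build_prompt(instruction: str, context: str = "") -> str:
    """Format an Alpaca-style prompt matching the example above."""
    if context:
        # The "### Input:" section follows the alpaca-cleaned convention (assumption).
        return f"### Instruction: {instruction}\n### Input: {context}\n### Response:"
    return f"### Instruction: {instruction}\n### Response:"

prompt = build_prompt("Summarize the text.", "Modular code separates concerns...")
```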
---

### Limitations

- Context length was capped at 256 tokens during fine‑tuning; longer prompts may degrade output quality.
- The model can still return hallucinated or biased content; review outputs before use.
- Output tone and style reflect the Alpaca and custom instruction data.
---