---
license: apache-2.0
tags:
  - peft
  - lora
  - math
  - reasoning
  - gsm8k
  - phi-2
  - transformers
library_name: peft
base_model: microsoft/phi-2
model_type: causal-lm
---

# 🧠 Phi-2 LoRA Adapter for GSM8K (Math Word Problems)

This repository contains a parameter-efficient LoRA fine-tune of [`microsoft/phi-2`](https://huggingface.co/microsoft/phi-2) on the [GSM8K](https://huggingface.co/datasets/gsm8k) dataset, designed for solving grade-school arithmetic and multi-step reasoning problems posed in natural language.

> **✅ Adapter-only:** This is a LoRA adapter, not a full model. You must load it on top of `microsoft/phi-2`.


## ✨ What's Inside

- **Base model:** `microsoft/phi-2` (2.7B parameters)
- **Adapter type:** LoRA (Low-Rank Adaptation via PEFT)
- **Task:** grade-school math reasoning (multi-step logic and arithmetic)
- **Dataset:** GSM8K

## 🚀 Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")
tokenizer = AutoTokenizer.from_pretrained("darshjoshi16/phi2-lora-math")

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, "darshjoshi16/phi2-lora-math")

# Inference
prompt = "Q: Julie read 12 pages yesterday and twice as many today. If she wants to read half of the remaining 84 pages tomorrow, how many pages should she read?\nA:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
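
If you want a standalone checkpoint that can be served without PEFT, the adapter can be folded into the base weights with PEFT's standard `merge_and_unload` API. A minimal sketch (the output directory name is illustrative):

```python
# Merge the LoRA weights into the base model for adapter-free deployment.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("phi2-math-merged")  # illustrative output path
tokenizer.save_pretrained("phi2-math-merged")
```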


---

## 📊 Evaluation Results

| Task | Metric | Score | Samples |
|------|--------|-------|---------|
| GSM8K | Exact Match (strict) | 54.6% | 500 |
| ARC-Easy | Accuracy | 79.0% | 500 |
| HellaSwag | Accuracy (Normalized) | 61.0% | 500 |

Benchmarks were run using EleutherAI's [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness).
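
For reproducibility, the harness's Python API (lm-eval >= 0.4) can attach the adapter directly via the `peft` model argument. This is a sketch under that assumption; exact task names and defaults can vary across harness versions:

```python
import lm_eval

# Evaluate the base model with the LoRA adapter attached,
# capped at 500 samples per task to match the table above.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args=(
        "pretrained=microsoft/phi-2,"
        "peft=darshjoshi16/phi2-lora-math,"
        "dtype=float16"
    ),
    tasks=["gsm8k", "arc_easy", "hellaswag"],
    limit=500,
)
print(results["results"])
```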

---

## ⚙️ Training Details

- **Method:** LoRA (rank=8, alpha=16, dropout=0.1); see the configuration sketch below
- **Epochs:** 1 (proof of concept)
- **Batch size:** 4 per device
- **Precision:** FP16
- **Platform:** Google Colab (T4 GPU)
- **Framework:** 🤗 Transformers + PEFT
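
The sketch below wires the hyperparameters above into 🤗 Transformers + PEFT. The target modules, prompt format, and tokenization details are assumptions (this card does not specify them), so treat it as a starting point rather than the exact training script:

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")
tokenizer.pad_token = tokenizer.eos_token  # phi-2 ships without a pad token

# Hyperparameters from the list above; target_modules is an assumption
# (phi-2's attention/dense projections), not confirmed by this card.
lora_cfg = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the LoRA weights train

# Assumed prompt format, mirroring the Quick Start example above.
def tokenize(example):
    text = f"Q: {example['question']}\nA: {example['answer']}"
    return tokenizer(text, truncation=True, max_length=512)

train_data = load_dataset("gsm8k", "main", split="train").map(
    tokenize, remove_columns=["question", "answer"]
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="phi2-lora-math",
        num_train_epochs=1,             # proof of concept, as noted above
        per_device_train_batch_size=4,
        fp16=True,                      # T4-friendly precision
    ),
    train_dataset=train_data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.model.save_pretrained("phi2-lora-math")  # saves only the adapter
```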

---

## 🔍 Limitations

- Fine-tuned for math word problems only, not general-purpose reasoning
- Trained for a single epoch; additional training may improve performance
- Adapter-only: the base model (`microsoft/phi-2`) must be loaded alongside it

---

## 📘 Citation & References

- [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685)
- [Phi-2 Model Card](https://huggingface.co/microsoft/phi-2)
- [GSM8K Dataset](https://huggingface.co/datasets/gsm8k)
- [PEFT Library](https://github.com/huggingface/peft)
- [Transformers](https://github.com/huggingface/transformers)

---

## 💬 Author

This model was fine-tuned and open-sourced by Darsh Joshi (contact@darshjoshi.com).
Feel free to reach out or contribute.