Phi-3 Mini: Financial Q&A (QLoRA Fine-Tuned)
A fine-tuned version of microsoft/Phi-3-mini-4k-instruct, trained on financial Q&A data with QLoRA (4-bit quantization plus LoRA adapters).
Training Details
- Base model: microsoft/Phi-3-mini-4k-instruct (3.8B parameters)
- Method: QLoRA (LoRA rank=16, 4-bit NF4 quantization)
- Dataset: gbharti/finance-alpaca (5,000 samples)
- Training steps: 500
- Hardware: NVIDIA T4 16GB (Kaggle free tier)
- Training time: ~45 minutes
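The choice of a 16 GB T4 is consistent with simple weight-memory arithmetic. The sketch below estimates storage for the 3.8B base weights at 4-bit versus 16-bit precision. It is a rough estimate only: it ignores quantization constants, LoRA adapter weights, optimizer state, activations, and the KV cache, and the helper name `weight_memory_gb` is illustrative rather than part of any library.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 3.8e9  # parameter count stated in the training details

nf4_gb = weight_memory_gb(NUM_PARAMS, 4)    # 4-bit NF4 base weights
fp16_gb = weight_memory_gb(NUM_PARAMS, 16)  # 16-bit weights, for comparison

print(f"4-bit weights:  ~{nf4_gb:.1f} GB")
print(f"16-bit weights: ~{fp16_gb:.1f} GB")
```

Under these assumptions the quantized weights take roughly 1.9 GB versus 7.6 GB at 16-bit, which leaves headroom on the T4 for adapters, optimizer state, and activations.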
Results
| Model | ROUGE-1 | ROUGE-L | Latency | Cost/query |
|---|---|---|---|---|
| Base Phi-3 Mini | 0.238 | 0.131 | 11.4s | $0.000 |
| Fine-tuned Phi-3 Mini | 0.223 | 0.146 | 10.5s | $0.000 |
| GPT-4o (Azure) | 0.282 | 0.133 | 3.1s | ~$0.030 |
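For readers unfamiliar with the metrics in the table, here is a minimal sketch of ROUGE-1 and ROUGE-L computed as F1 scores over lowercased whitespace tokens. The model card does not say which ROUGE implementation, tokenizer, or variant (precision, recall, or F1) produced these numbers, so the functions below are illustrative assumptions and are not expected to reproduce the table exactly.

```python
from collections import Counter


def _f1(overlap: int, n_candidate: int, n_reference: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when nothing overlaps."""
    if overlap == 0:
        return 0.0
    precision = overlap / n_candidate
    recall = overlap / n_reference
    return 2 * precision * recall / (precision + recall)


def rouge_1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: clipped unigram overlap between candidate and reference."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    overlap = sum((Counter(cand) & Counter(ref)).values())
    return _f1(overlap, len(cand), len(ref))


def _lcs_length(a: list, b: list) -> int:
    """Length of the longest common subsequence, via row-by-row DP."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate: str, reference: str) -> float:
    """ROUGE-L F1: based on the longest common subsequence of tokens."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    return _f1(_lcs_length(cand, ref), len(cand), len(ref))
```

ROUGE-1 ignores word order while ROUGE-L rewards in-order matches, which is why the two columns can move in opposite directions for the same model.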
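Fine-tuning raises ROUGE-L by 11.5% relative to the base model (0.131 to 0.146), ahead of GPT-4o's 0.133, with no per-query API cost. ROUGE-1 drops slightly (0.238 to 0.223) and remains below GPT-4o's 0.282.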
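The ROUGE-L improvement is a relative change and can be checked directly against the table. The short sketch below recomputes it, along with the corresponding ROUGE-1 change; the helper name `relative_change_pct` is illustrative.

```python
def relative_change_pct(new: float, old: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


# Scores copied from the results table
base = {"rouge1": 0.238, "rougeL": 0.131}
tuned = {"rouge1": 0.223, "rougeL": 0.146}

rouge_l_change = relative_change_pct(tuned["rougeL"], base["rougeL"])
rouge_1_change = relative_change_pct(tuned["rouge1"], base["rouge1"])

print(f"ROUGE-L: {rouge_l_change:+.1f}%")  # +11.5%
print(f"ROUGE-1: {rouge_1_change:+.1f}%")  # -6.3%
```

The same helper applies to any other pair of cells in the table, for example the fine-tuned model against GPT-4o.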
Usage
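import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization for the base model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    # T4-class GPUs lack native bfloat16 support; torch.float16 may be faster there
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the tokenizer from the adapter repo and the quantized base model
tokenizer = AutoTokenizer.from_pretrained("rohan1324/phi3-mini-finance-qlora")
base = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=False,
    attn_implementation="eager",
)

# Attach the LoRA adapter weights to the base model
model = PeftModel.from_pretrained(base, "rohan1324/phi3-mini-finance-qlora")

# Alpaca-style prompt
prompt = "### Instruction:\nWhat is EBITDA?\n\n### Response:\n"

# Send inputs to the device chosen by device_map instead of hard-coding "cuda"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)

# The decoded text contains the prompt followed by the generated answer
print(tokenizer.decode(outputs[0], skip_special_tokens=True))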
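Because `generate` returns the prompt tokens followed by the completion, the decoded string printed above includes the instruction text as well as the answer. A small sketch of helpers for building the prompt and keeping only the answer follows. `build_prompt` and `extract_response` are hypothetical names, and the template is copied from the usage example on the assumption that it matches the format used in training.

```python
RESPONSE_MARKER = "### Response:"


def build_prompt(instruction: str) -> str:
    """Wrap a question in the Alpaca-style template from the usage example."""
    return f"### Instruction:\n{instruction.strip()}\n\n{RESPONSE_MARKER}\n"


def extract_response(decoded: str) -> str:
    """Return the text after the first response marker, or all text if absent."""
    _, found, answer = decoded.partition(RESPONSE_MARKER)
    return answer.strip() if found else decoded.strip()


prompt = build_prompt("What is EBITDA?")

# Placeholder standing in for tokenizer.decode(outputs[0], skip_special_tokens=True)
decoded = prompt + "<model answer>"
print(extract_response(decoded))
```

With the real model, `decoded` would come from `tokenizer.decode(...)`; the helper then prints only the generated answer.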
Use Cases
- Financial document Q&A
- Earnings call summarization
- Financial exam study assistant
- Internal analyst copilot
Author
Rohan Agarwal