Phi-3-mini Text-to-SQL — LoRA Adapter

A QLoRA adapter that specializes microsoft/Phi-3-mini-4k-instruct (3.8B) for natural-language → SQLite generation over a fixed enterprise schema (departments / employees / products / sales).

🔌 9 MB adapter (0.117% the size of the base model)
⚡ Trained in ~3 minutes within 5.2 GB of GPU memory on a 6 GB laptop GPU (RTX 4050)
🧪 75% execution-match / 100% valid-SQL on held-out questions (up from 41.7% for the base model)
📦 Quantized GGUFs for CPU serving: Bhuvandesai/phi3-text-to-sql-gguf
🖥️ Live demo: Bhuvandesai/phi3-text-to-sql-studio

How to use

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "microsoft/Phi-3-mini-4k-instruct"
tok = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, "Bhuvandesai/phi3-text-to-sql-adapter")

SCHEMA = """You are a Text-to-SQL generator. Given a database schema and a natural language
question, write a valid SQLite query. Output only the raw SQL.

Database Schema:
Table departments(id, name, manager_id)
Table employees(id, name, department_id, salary, hire_date, manager_id)
Table products(id, name, category, price)
Table sales(id, employee_id, product_id, amount, quantity, sale_date)"""

msgs = [{"role": "user", "content": f"{SCHEMA}\n\nQuestion: What is the average salary by department?"}]
prompt = tok.apply_chat_template(msgs, tokenize=False, add_generation_prompt=True)
out = model.generate(**tok(prompt, return_tensors="pt").to(model.device), max_new_tokens=128, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))

For CPU / no-GPU use, prefer the quantized GGUFs with llama.cpp (see the GGUF repo).

Training


Method	QLoRA (4-bit NF4 + double-quant, bf16 compute)
LoRA	r=8, α=16, dropout=0.05, bias=none
Trainable params	4,456,448 (0.1165% of 3.82B)
Data	50 train / 12 held-out NL→SQL pairs (synthetic schema)
Schedule	3 epochs, effective batch 4, lr 2e-4 cosine, paged_adamw_8bit
Hardware	NVIDIA RTX 4050 Laptop (6 GB)
Runtime / peak VRAM	193.7 s / 5.21 GB reserved

Results (held-out, greedy decoding)

Metric	Base Phi-3-mini	This adapter
Execution-match (run SQL, compare rows)	41.7%	75.0%
Valid SQL rate	100%	100%
Eval loss (end of training)	—	0.0597 (−89.9%)
Eval token accuracy	—	98.4%

Strict execution-match is conservative: 2 of the 3 held-out "misses" are reasonable answers with a different column projection than the reference; counting "query correctly answers the question" ≈ 92%.

Limitations & honest notes

Single fixed schema. Trained on one synthetic database; it is not a general cross-schema text-to-SQL model.
Small dataset (50/12). Metrics are directional, not statistically tight.
LoRA module coverage. Because Phi-3 fuses q/k/v (qkv_proj) and gate/up (gate_up_proj), PEFT name-matching adapted only o_proj and down_proj (2 of the 7 listed modules). It still trained well; a future version should target qkv_proj/gate_up_proj for fuller coverage.

A full write-up (fine-tuning + quantization deep dive with all benchmarks) accompanies this model.

License: MIT (inherits from the base model).

Downloads last month: 53

Model tree for Bhuvandesai/phi3-text-to-sql-adapter

Base model

microsoft/Phi-3-mini-4k-instruct

Adapter

(854)

this model

Quantizations

1 model