Phi-3-mini Text-to-SQL — LoRA Adapter

A QLoRA adapter that specializes microsoft/Phi-3-mini-4k-instruct (3.8B) for natural-language → SQLite generation over a fixed enterprise schema (departments / employees / products / sales).

  • 🔌 9 MB adapter (0.117% of the base model's size)
  • ⚡ Trained in ~3 minutes within 5.2 GB of GPU memory on a 6 GB laptop GPU (RTX 4050)
  • 🧪 75% execution-match on held-out questions (up from 41.7% for the base model); valid-SQL rate is 100% for both
  • 📦 Quantized GGUFs for CPU serving: Bhuvandesai/phi3-text-to-sql-gguf
  • 🖥️ Live demo: Bhuvandesai/phi3-text-to-sql-studio

How to use

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "microsoft/Phi-3-mini-4k-instruct"
tok = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, "Bhuvandesai/phi3-text-to-sql-adapter")

SCHEMA = """You are a Text-to-SQL generator. Given a database schema and a natural language
question, write a valid SQLite query. Output only the raw SQL.

Database Schema:
Table departments(id, name, manager_id)
Table employees(id, name, department_id, salary, hire_date, manager_id)
Table products(id, name, category, price)
Table sales(id, employee_id, product_id, amount, quantity, sale_date)"""

msgs = [{"role": "user", "content": f"{SCHEMA}\n\nQuestion: What is the average salary by department?"}]
prompt = tok.apply_chat_template(msgs, tokenize=False, add_generation_prompt=True)
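inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens so the prompt is not echoed back
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))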

For CPU / no-GPU use, prefer the quantized GGUFs with llama.cpp (see the GGUF repo).

Training

| Setting | Value |
|---|---|
| Method | QLoRA (4-bit NF4 + double-quant, bf16 compute) |
| LoRA | r=8, α=16, dropout=0.05, bias=none |
| Trainable params | 4,456,448 (0.1165% of 3.82B) |
| Data | 50 train / 12 held-out NL→SQL pairs (synthetic schema) |
| Schedule | 3 epochs, effective batch 4, lr 2e-4 cosine, paged_adamw_8bit |
| Hardware | NVIDIA RTX 4050 Laptop (6 GB) |
| Runtime / peak VRAM | 193.7 s / 5.21 GB reserved |

Results (held-out, greedy decoding)

| Metric | Base Phi-3-mini | This adapter |
|---|---|---|
| Execution-match (run SQL, compare rows) | 41.7% | 75.0% |
| Valid SQL rate | 100% | 100% |
| Eval loss (end of training) | - | 0.0597 (−89.9%) |
| Eval token accuracy | - | 98.4% |

Strict execution-match is conservative. Of the 3 held-out misses (3 of 12 questions), 2 are reasonable answers that project different columns than the reference. Counting every query that correctly answers the question, accuracy is about 92% (11 of 12).
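To make the two metrics concrete, here is a minimal sketch of how a valid-SQL check and a strict execution-match check could be computed with Python's built-in sqlite3 module. It is not the evaluation script used for the numbers above. The column types, the sample rows, the helper names (make_db, run, execution_match), and the choice to ignore row order are all assumptions made for illustration; the card itself only gives table and column names.

```python
import sqlite3

# Column types are assumptions; the card lists only table and column names.
DDL = """
CREATE TABLE departments(id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
CREATE TABLE employees(id INTEGER PRIMARY KEY, name TEXT, department_id INTEGER,
                       salary REAL, hire_date TEXT, manager_id INTEGER);
CREATE TABLE products(id INTEGER PRIMARY KEY, name TEXT, category TEXT, price REAL);
CREATE TABLE sales(id INTEGER PRIMARY KEY, employee_id INTEGER, product_id INTEGER,
                   amount REAL, quantity INTEGER, sale_date TEXT);
"""

def make_db():
    """Build an in-memory database with a few hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    conn.executemany("INSERT INTO departments VALUES (?, ?, ?)",
                     [(1, "Engineering", 1), (2, "Sales", 3)])
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?, ?, ?, ?)",
                     [(1, "Ada", 1, 120.0, "2020-01-01", None),
                      (2, "Bob", 1, 100.0, "2021-06-01", 1),
                      (3, "Cy", 2, 80.0, "2019-03-15", None)])
    return conn

def run(conn, sql):
    """Return the result rows in a canonical order, or None if SQLite rejects the query."""
    try:
        return sorted(conn.execute(sql).fetchall(), key=repr)
    except sqlite3.Error:
        return None

def execution_match(conn, predicted, reference):
    """Strict match: the predicted query runs and returns the same rows (order ignored)."""
    pred, ref = run(conn, predicted), run(conn, reference)
    return pred is not None and pred == ref

conn = make_db()
reference = ("SELECT d.name, AVG(e.salary) FROM employees e "
             "JOIN departments d ON e.department_id = d.id GROUP BY d.name")
same_rows = ("SELECT d.name, AVG(e.salary) FROM departments d "
             "JOIN employees e ON d.id = e.department_id GROUP BY d.id")
other_projection = ("SELECT department_id, AVG(salary) FROM employees "
                    "GROUP BY department_id")

print(execution_match(conn, same_rows, reference))         # same rows, different SQL text
print(execution_match(conn, other_projection, reference))  # valid SQL, different columns
```

The last query illustrates the kind of miss described above: it answers the question and is valid SQL, but it returns department ids instead of names, so a strict row comparison counts it as wrong.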

Limitations & honest notes

  • Single fixed schema. Trained on one synthetic database; it is not a general cross-schema text-to-SQL model.
  • Small dataset (50/12). Metrics are directional, not statistically tight.
  • LoRA module coverage. Phi-3 fuses q/k/v into qkv_proj and gate/up into gate_up_proj, so PEFT name-matching found only o_proj and down_proj (2 of the 7 listed modules) and adapted just those. The adapter still improved execution-match substantially; a future version should target qkv_proj and gate_up_proj as well for fuller coverage.
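The module-coverage note can be checked against the reported trainable-parameter count with a little arithmetic. The sketch below assumes the base model's dimensions (hidden size 3072, MLP intermediate size 8192, 32 decoder layers, full multi-head attention so that qkv_proj outputs 3 × 3072 features); those values come from the base model's published configuration, not from this card, and the helper names are made up for illustration. A rank-r LoRA pair on a linear layer with `in` inputs and `out` outputs adds r × (in + out) weights, and bias=none means no other trainable terms.

```python
# Assumed Phi-3-mini dimensions (from the base model's config, not from this card).
HIDDEN = 3072
INTERMEDIATE = 8192
LAYERS = 32
RANK = 8  # LoRA r, from the Training section

def lora_params(in_features, out_features, rank=RANK):
    """Weights added by one LoRA pair: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)

# (in_features, out_features) for each linear layer in one decoder block.
SHAPES = {
    "o_proj": (HIDDEN, HIDDEN),
    "down_proj": (INTERMEDIATE, HIDDEN),
    "qkv_proj": (HIDDEN, 3 * HIDDEN),
    "gate_up_proj": (HIDDEN, 2 * INTERMEDIATE),
}

def total(modules):
    """Total LoRA weights across all layers for the given module names."""
    return LAYERS * sum(lora_params(*SHAPES[m]) for m in modules)

current = total(["o_proj", "down_proj"])  # what this adapter actually trains
fuller = total(SHAPES)                    # if the fused projections were targeted too

print(current)                       # 4456448, matching the Training section
print(round(current * 2 / 1e6, 1))   # 8.9 (MB), assuming 2 bytes per stored weight
print(fuller)                        # 12582912
```

Under these assumptions, the o_proj/down_proj-only count reproduces the 4,456,448 figure exactly, and 16-bit storage gives roughly the 9 MB adapter size. Targeting the fused projections as well would roughly triple the adapter at the same rank.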

A full write-up (fine-tuning + quantization deep dive with all benchmarks) accompanies this model.

License: MIT (inherits from the base model).
