Instructions to use Bhuvandesai/phi3-text-to-sql-adapter with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use Bhuvandesai/phi3-text-to-sql-adapter with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct") model = PeftModel.from_pretrained(base_model, "Bhuvandesai/phi3-text-to-sql-adapter") - Notebooks
- Google Colab
- Kaggle
Phi-3-mini Text-to-SQL — LoRA Adapter
A QLoRA adapter that specializes microsoft/Phi-3-mini-4k-instruct (3.8B) for natural-language → SQLite generation over a fixed enterprise schema (departments / employees / products / sales).
- 🔌 9 MB adapter (0.117% the size of the base model)
- ⚡ Trained in ~3 minutes within 5.2 GB of GPU memory on a 6 GB laptop GPU (RTX 4050)
- 🧪 75% execution-match / 100% valid-SQL on held-out questions (up from 41.7% for the base model)
- 📦 Quantized GGUFs for CPU serving:
Bhuvandesai/phi3-text-to-sql-gguf - 🖥️ Live demo:
Bhuvandesai/phi3-text-to-sql-studio
How to use
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
base = "microsoft/Phi-3-mini-4k-instruct"
tok = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, "Bhuvandesai/phi3-text-to-sql-adapter")
SCHEMA = """You are a Text-to-SQL generator. Given a database schema and a natural language
question, write a valid SQLite query. Output only the raw SQL.
Database Schema:
Table departments(id, name, manager_id)
Table employees(id, name, department_id, salary, hire_date, manager_id)
Table products(id, name, category, price)
Table sales(id, employee_id, product_id, amount, quantity, sale_date)"""
msgs = [{"role": "user", "content": f"{SCHEMA}\n\nQuestion: What is the average salary by department?"}]
prompt = tok.apply_chat_template(msgs, tokenize=False, add_generation_prompt=True)
out = model.generate(**tok(prompt, return_tensors="pt").to(model.device), max_new_tokens=128, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))
For CPU / no-GPU use, prefer the quantized GGUFs with llama.cpp (see the GGUF repo).
Training
| Method | QLoRA (4-bit NF4 + double-quant, bf16 compute) |
| LoRA | r=8, α=16, dropout=0.05, bias=none |
| Trainable params | 4,456,448 (0.1165% of 3.82B) |
| Data | 50 train / 12 held-out NL→SQL pairs (synthetic schema) |
| Schedule | 3 epochs, effective batch 4, lr 2e-4 cosine, paged_adamw_8bit |
| Hardware | NVIDIA RTX 4050 Laptop (6 GB) |
| Runtime / peak VRAM | 193.7 s / 5.21 GB reserved |
Results (held-out, greedy decoding)
| Metric | Base Phi-3-mini | This adapter |
|---|---|---|
| Execution-match (run SQL, compare rows) | 41.7% | 75.0% |
| Valid SQL rate | 100% | 100% |
| Eval loss (end of training) | — | 0.0597 (−89.9%) |
| Eval token accuracy | — | 98.4% |
Strict execution-match is conservative: 2 of the 3 held-out "misses" are reasonable answers with a different column projection than the reference; counting "query correctly answers the question" ≈ 92%.
Limitations & honest notes
- Single fixed schema. Trained on one synthetic database; it is not a general cross-schema text-to-SQL model.
- Small dataset (50/12). Metrics are directional, not statistically tight.
- LoRA module coverage. Because Phi-3 fuses
q/k/v(qkv_proj) and gate/up (gate_up_proj), PEFT name-matching adapted onlyo_projanddown_proj(2 of the 7 listed modules). It still trained well; a future version should targetqkv_proj/gate_up_projfor fuller coverage.
A full write-up (fine-tuning + quantization deep dive with all benchmarks) accompanies this model.
License: MIT (inherits from the base model).
- Downloads last month
- 53