A newer version of the Gradio SDK is available: 6.15.2
title: Phi-3 Mini SQL Generator
emoji: 🧠
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 6.5.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: SQL generator powered by Phi-3 Mini fine-tuning
Phi-3 Mini SQL Generator
A text-to-SQL app built around a QLoRA fine-tuned Phi-3 Mini model for converting natural-language questions to SQL queries.
What the App Does
The app accepts a table schema plus a natural-language SQL question. It uses deterministic code for schema creation/editing and high-confidence guardrail templates, then calls the fine-tuned model only for supported text-to-SQL generation.
The base model is not live-loaded in the Space. It is kept as offline evaluation evidence so the CPU demo does not pretend to run two 3.8B models interactively.
Model-Generated vs Deterministic
| Behavior | Source |
|---|---|
Non-template SQL_QUERY after schema selection |
Fine-tuned Phi-3 Mini |
| High-confidence ranking/count/aggregation queries | Deterministic SQL template |
CREATE_TABLE from explicit user columns |
Deterministic schema parser |
EDIT_TABLE add/remove/rename columns |
Deterministic schema parser |
| Greetings, off-topic requests, unknown intents | Static fallback |
Boundary rule: deterministic SQL, schema parsers, translations, and fallbacks are not model capability. They are product guardrails around the model.
Training and Evaluation Story
- Base model:
microsoft/Phi-3-mini-4k-instruct - Dataset:
b-mc2/sql-create-context, sampled to 1,000 training examples. - Training method: QLoRA adapter fine-tuning, then merged for simpler CPU inference.
- Published merged model:
Shizu0n/phi3-mini-sql-generator-merged - Primary metric: exact match on held-out text-to-SQL examples.
| Model | Exact match |
|---|---|
| Base Phi-3 Mini | 2.0% |
| Fine-tuned QLoRA merged | 73.5% |
Reported gain: +71.5 percentage points over the base model.
Known limits:
- The training data is mostly
SELECT;INSERT,UPDATE, andDELETEare not a reliable model capability. - Unsupported DML requests are not generated by the model and must not be disguised as schema edits. For example,
delete speciescan remove a schema column, butdelete all animalsreturns a static unsupported-scope fallback. - PT-BR input is not trained model capability. The app handles selected PT-BR cases through deterministic normalization/templates.
- Conversational chat and JSON schema proposal are out of scope and use fallback behavior instead of model prompts.
- Exact match does not prove semantic perfection. It is useful evidence, not a substitute for qualitative review.
AI Engineering Story
The product layer exists because a fine-tuned model is not the same thing as a reliable app.
- Intent routing limits the model path to
SQL_QUERY. - Deterministic schema tools handle explicit create/edit requests without loading the model.
- High-confidence SQL templates handle simple ranking, aggregation, count, and comparison queries before the CPU model path.
- SQL output is validated with
sqlparse; model output is also checked against the activeCREATE TABLEschema before it can be shown as accepted model SQL. - Lazy loading keeps startup cheap; the model is downloaded and loaded only when needed.
- Load and generation timeouts protect the UI from indefinite waits.
- Static fallbacks make unsupported behavior visible instead of laundering it as AI.
- Every user-visible response/status/error with a known source must preserve provenance: fine-tuned model, deterministic SQL template, deterministic schema parser, or static fallback.
How to Use
- Click Load fine-tuned model.
- Loading is lazy: the model is only downloaded and loaded when you request it.
- On CPU, the first load can take a few minutes.
- Select, create, or edit the SQL table schema.
- You can use the presets:
employees,orders,students,products,sales. - You can also ask for explicit schema operations, such as
create table products with id name priceoradd stock.
- You can use the presets:
- Enter the question in the chat input.
- Click Send.
- Review the result in
gr.Code(language="sql")and the source/status message.- The app shows a validation badge powered by
sqlparseplus active-schema checks for model output. - Known-source errors should still identify their source path; unknown-source errors should not fake certainty.
- The app shows a validation badge powered by
Usage Examples
Use this path to show both sides of the project without muddying provenance.
- Model path: load the fine-tuned model, select
employees, and askshow employees in Engineering ordered by salary. - Template path: ask
qual o produto mais caro?with theproductsschema — returns deterministic SQL. - Schema parser: ask
create table animals with id name species weight, thenadd habitat— deterministic schema manipulation. - Fallback: ask
what can you do?— static response since chat is outside model training. - Evaluation: point to the base-vs-fine-tuned exact-match numbers.
Models
- Fine-tuned adapter: Shizu0n/phi3-mini-sql-generator
- Fine-tuned merged model used in the app: Shizu0n/phi3-mini-sql-generator-merged
- Offline baseline model used for evaluation: microsoft/Phi-3-mini-4k-instruct
Current Features
- Gradio UI with a step-by-step flow: load the fine-tuned model, define schema context, and inspect SQL artifacts.
- Intent routing with 5 supported routes:
CREATE_TABLE,EDIT_TABLE,SQL_QUERY,SMALLTALK,UNKNOWN. - Model calls only for non-template
SQL_QUERY; deterministic templates may answer supported SQL shapes before model load. - Smalltalk and unknown messages use a static fallback.
- Lazy loading to reduce startup cost.
- Preserved Phi-3 patches for local/Spaces compatibility.
- Schema presets without blocking manual input.
- SQL output separated from errors/status so booleans, integers, and error messages do not appear inside the SQL block.
- Centered loading overlay to make the loading state obvious.
Model Probe
The normal pytest suite does not load the 3.8B model. To manually verify the real model behavior:
python scripts/model_probe.py
The probe prints JSON with pass/fail checks for static fallback, deterministic CREATE TABLE, deterministic schema edit, SQL query generation, and smalltalk while a schema is active.
Tests
set PYTHONPATH=. && pytest tests/test_chatbot_core.py tests/test_chatbot_behavior.py -q
Current unit suite: 141 tests. These tests avoid loading the 3.8B model and focus on routing, deterministic tools, prompt construction, model-output rejection, active-schema validation, SQL validation, UI schema-context synchronization, and error handling.
Run Locally
python -m pip install -r requirements.txt
python app.py
Then open the address shown by Gradio, usually http://127.0.0.1:7860.