Shizu0n's picture
feat: double-quoted comparison literals for schema validation
35d7c39

A newer version of the Gradio SDK is available: 6.15.2

Upgrade
metadata
title: Phi-3 Mini SQL Generator
emoji: 🧠
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 6.5.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: SQL generator powered by Phi-3 Mini fine-tuning

Phi-3 Mini SQL Generator

A text-to-SQL app built around a QLoRA fine-tuned Phi-3 Mini model for converting natural-language questions to SQL queries.

What the App Does

The app accepts a table schema plus a natural-language SQL question. It uses deterministic code for schema creation/editing and high-confidence guardrail templates, then calls the fine-tuned model only for supported text-to-SQL generation.

The base model is not live-loaded in the Space. It is kept as offline evaluation evidence so the CPU demo does not pretend to run two 3.8B models interactively.

Model-Generated vs Deterministic

Behavior Source
Non-template SQL_QUERY after schema selection Fine-tuned Phi-3 Mini
High-confidence ranking/count/aggregation queries Deterministic SQL template
CREATE_TABLE from explicit user columns Deterministic schema parser
EDIT_TABLE add/remove/rename columns Deterministic schema parser
Greetings, off-topic requests, unknown intents Static fallback

Boundary rule: deterministic SQL, schema parsers, translations, and fallbacks are not model capability. They are product guardrails around the model.

Training and Evaluation Story

Model Exact match
Base Phi-3 Mini 2.0%
Fine-tuned QLoRA merged 73.5%

Reported gain: +71.5 percentage points over the base model.

Known limits:

  • The training data is mostly SELECT; INSERT, UPDATE, and DELETE are not a reliable model capability.
  • Unsupported DML requests are not generated by the model and must not be disguised as schema edits. For example, delete species can remove a schema column, but delete all animals returns a static unsupported-scope fallback.
  • PT-BR input is not trained model capability. The app handles selected PT-BR cases through deterministic normalization/templates.
  • Conversational chat and JSON schema proposal are out of scope and use fallback behavior instead of model prompts.
  • Exact match does not prove semantic perfection. It is useful evidence, not a substitute for qualitative review.

AI Engineering Story

The product layer exists because a fine-tuned model is not the same thing as a reliable app.

  • Intent routing limits the model path to SQL_QUERY.
  • Deterministic schema tools handle explicit create/edit requests without loading the model.
  • High-confidence SQL templates handle simple ranking, aggregation, count, and comparison queries before the CPU model path.
  • SQL output is validated with sqlparse; model output is also checked against the active CREATE TABLE schema before it can be shown as accepted model SQL.
  • Lazy loading keeps startup cheap; the model is downloaded and loaded only when needed.
  • Load and generation timeouts protect the UI from indefinite waits.
  • Static fallbacks make unsupported behavior visible instead of laundering it as AI.
  • Every user-visible response/status/error with a known source must preserve provenance: fine-tuned model, deterministic SQL template, deterministic schema parser, or static fallback.

How to Use

  1. Click Load fine-tuned model.
    • Loading is lazy: the model is only downloaded and loaded when you request it.
    • On CPU, the first load can take a few minutes.
  2. Select, create, or edit the SQL table schema.
    • You can use the presets: employees, orders, students, products, sales.
    • You can also ask for explicit schema operations, such as create table products with id name price or add stock.
  3. Enter the question in the chat input.
  4. Click Send.
  5. Review the result in gr.Code(language="sql") and the source/status message.
    • The app shows a validation badge powered by sqlparse plus active-schema checks for model output.
    • Known-source errors should still identify their source path; unknown-source errors should not fake certainty.

Usage Examples

Use this path to show both sides of the project without muddying provenance.

  1. Model path: load the fine-tuned model, select employees, and ask show employees in Engineering ordered by salary.
  2. Template path: ask qual o produto mais caro? with the products schema — returns deterministic SQL.
  3. Schema parser: ask create table animals with id name species weight, then add habitat — deterministic schema manipulation.
  4. Fallback: ask what can you do? — static response since chat is outside model training.
  5. Evaluation: point to the base-vs-fine-tuned exact-match numbers.

Models

Current Features

  • Gradio UI with a step-by-step flow: load the fine-tuned model, define schema context, and inspect SQL artifacts.
  • Intent routing with 5 supported routes: CREATE_TABLE, EDIT_TABLE, SQL_QUERY, SMALLTALK, UNKNOWN.
  • Model calls only for non-template SQL_QUERY; deterministic templates may answer supported SQL shapes before model load.
  • Smalltalk and unknown messages use a static fallback.
  • Lazy loading to reduce startup cost.
  • Preserved Phi-3 patches for local/Spaces compatibility.
  • Schema presets without blocking manual input.
  • SQL output separated from errors/status so booleans, integers, and error messages do not appear inside the SQL block.
  • Centered loading overlay to make the loading state obvious.

Model Probe

The normal pytest suite does not load the 3.8B model. To manually verify the real model behavior:

python scripts/model_probe.py

The probe prints JSON with pass/fail checks for static fallback, deterministic CREATE TABLE, deterministic schema edit, SQL query generation, and smalltalk while a schema is active.

Tests

set PYTHONPATH=. && pytest tests/test_chatbot_core.py tests/test_chatbot_behavior.py -q

Current unit suite: 141 tests. These tests avoid loading the 3.8B model and focus on routing, deterministic tools, prompt construction, model-output rejection, active-schema validation, SQL validation, UI schema-context synchronization, and error handling.

Run Locally

python -m pip install -r requirements.txt
python app.py

Then open the address shown by Gradio, usually http://127.0.0.1:7860.